Python 3.0, encoding="utf-8" for encryption

  • Thread starter: NihalRi
  • Tags: Encryption, Python
Upgrading from Python 2 to Python 3 can lead to issues with string handling, particularly when encrypting text files using Unicode values. The original code, which uses the `ord()` and `chr()` functions to manipulate characters, fails with a `UnicodeEncodeError` because Python 3 distinguishes between text strings and bytes. The traceback shows that the Windows default cp1252 codec cannot encode one of the resulting characters (`'\x93'`); one reply attributes this to the adjusted string no longer being valid UTF-8. To resolve this, the recommendation is to encode the string into a byte string using UTF-8 before performing any encryption operations, so that every ordinal stays within the 0-255 range. When decrypting, the byte string is decrypted first and then decoded back into Unicode. Additionally, for serious encryption needs, the thread advises using established libraries such as bcrypt instead of implementing custom encryption methods, which can lead to security vulnerabilities.
NihalRi
I recently upgraded my version of Python from 2 to 3. I had a program that encrypted a text file by converting a character to its Unicode value, altering it and then changing it back to a character using the ord() and chr() methods. This does not seem to work with Python 3 and I was wondering how I could use encoding="utf-8" to make it work without altering too much of my code. Below is what my code looks like.

    for x in range(0, len(content)):
        output = chr(ord(content[x])+ord(key[x])%256)
        crypt = crypt + output

    new.write(crypt)

The error I get is:

    Traceback (most recent call last):
      File "C:\Users\\Desktop\programing\Python Files\Encryptor and Decryptor\encryption.py", line 26, in <module>
        new.write(crypt)
      File "C:\Users\\AppData\Local\Programs\Python\Python37-32\lib\encodings\cp1252.py", line 19, in encode
        return codecs.charmap_encode(input,self.errors,encoding_table)[0]
    UnicodeEncodeError: 'charmap' codec can't encode character '\x93' in position 0: character maps to <undefined>
 
It looks like you've created an invalid UTF-8 character in your adjusted string of text. UTF-8 is a multibyte character encoding with a specific bit pattern format.

So you need to change your scheme.

https://en.m.wikipedia.org/wiki/UTF-8
 
NihalRi said:
I had a program that encrypted a text file by converting a character to its Unicode value, altering it and then changing it back to a character using the ord() and chr() methods.

Since you are combining the content and the key modulo 256, I strongly suspect that your Python 2 program was converting a character to its byte ordinal value, i.e., that you were operating on Python 2 strings, not Python 2 unicode objects. If you really want to operate on unicode objects (which I don't recommend for encryption/decryption--see below), you cannot assume that your ordinals (the integers you operate on before converting back to strings) will be 255 or less.
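To make the point about ordinals concrete, here is a small sketch (my own illustration, not taken from the program in question). It shows that a Python 3 string can hold code points above 255, and that the code point '\x93' from the traceback can be encoded as UTF-8 but has no mapping in cp1252, the codec named in the error. The variable names are only for illustration.

```python
# In Python 3, ord() on a str returns a Unicode code point,
# which is not limited to the range 0..255.
euro = "€"
euro_ordinal = ord(euro)  # 8364, i.e. U+20AC

# chr(0x93) is the code point U+0093 (a control character),
# the same character reported in the traceback.
ch = chr(0x93)

# UTF-8 can encode it, using two bytes.
utf8_bytes = ch.encode("utf-8")  # b'\xc2\x93'

# cp1252 has no entry for U+0093, so encoding fails.
try:
    ch.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

print(euro_ordinal, utf8_bytes, cp1252_ok)
```

So the failure comes from writing the shifted text through a codec that cannot represent every code point the shift can produce.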

For encrypting Unicode, you shouldn't operate directly on the Unicode anyway. You should encode it into a byte string (a Python 3 "bytes" object) using your chosen encoding (utf-8 in this case), then encrypt the byte string (where now you can combine the byte and the key modulo 256 without a problem since all ordinals will be 255 or less). When you want to decrypt, you decrypt the byte string, then decode it into Unicode using your chosen encoding.
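As a rough sketch of that encode, encrypt, decrypt, decode sequence (assuming the same add-the-key-modulo-256 scheme as in the question): the function names `shift_bytes`, `encrypt_text` and `decrypt_text` are hypothetical, and repeating the key with `itertools.cycle` is my assumption for keys shorter than the content. This is only suitable for a hobby program or class assignment.

```python
from itertools import cycle


def shift_bytes(data: bytes, key: bytes, sign: int = 1) -> bytes:
    """Add (sign=1) or subtract (sign=-1) key bytes from data bytes, modulo 256."""
    if not key:
        raise ValueError("key must not be empty")
    # Iterating over a bytes object yields ints in 0..255, so no ord() is needed.
    # cycle() repeats the key when it is shorter than the data (an assumption).
    return bytes((d + sign * k) % 256 for d, k in zip(data, cycle(key)))


def encrypt_text(text: str, key: str) -> bytes:
    # Encode to UTF-8 first, then operate on the bytes.
    return shift_bytes(text.encode("utf-8"), key.encode("utf-8"), 1)


def decrypt_text(blob: bytes, key: str) -> str:
    # Undo the shift on the bytes, then decode back to text.
    return shift_bytes(blob, key.encode("utf-8"), -1).decode("utf-8")


message = "héllo wörld €"
secret = encrypt_text(message, "key")
restored = decrypt_text(secret, "key")
print(restored == message)
```

Since the encrypted result is a bytes object, it would be written with the file opened in binary mode, for example `open(path, "wb")`, and read back with `"rb"`, so no text codec such as cp1252 gets involved.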
 
NihalRi said:
I had a program that encrypted a text file by converting a character to its Unicode value, altering it and then changing it back to a character using the ord() and chr() methods.

Since we're talking about encryption, I should probably also say that if you're trying to do encryption yourself for anything other than a private hobby program or a class assignment, you shouldn't. You should use available encryption libraries written by people who are experts at avoiding all of the many, many pitfalls that lurk in the world of encryption. For Python programs that need encryption, I have found the bcrypt Python library to be a good choice.
 