Python 3.0, encoding="utf-8" for encryption

  • Context: Python
  • Thread starter: NihalRi
  • Tags: Encryption, Python

Discussion Overview

The discussion centers on the challenges of adapting a text encryption program from Python 2 to Python 3, specifically the handling of character encoding and the use of Unicode values. Participants explore the implications of using UTF-8 encoding in the context of encryption, as well as potential pitfalls in DIY encryption methods.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Conceptual clarification

Main Points Raised

  • One participant describes their previous method of encryption using ord() and chr() in Python 2, which is causing issues in Python 3 due to changes in string handling and encoding.
  • Another participant suggests that an invalid UTF-8 character may have been created in the adjusted string, indicating a need to change the encryption scheme.
  • A different participant notes that the original program likely operated on byte ordinal values in Python 2 and emphasizes the importance of encoding the string into a byte string in Python 3 before performing encryption.
  • One participant warns against DIY encryption for anything beyond personal projects and recommends using established encryption libraries, citing bcrypt as a reliable option.

Areas of Agreement / Disagreement

Participants express differing views on the best approach to handling encryption in Python 3, with no consensus on a single method or solution. There is general agreement on the risks of implementing encryption without proper expertise.

Contextual Notes

Participants highlight the importance of understanding the differences between string types in Python 2 and Python 3, particularly regarding Unicode and byte strings. The discussion does not resolve the specific implementation issues faced by the original poster.

NihalRi
I recently upgraded my version of Python from 2 to 3. I had a program that encrypted a text file by converting a character to its Unicode value, altering it and then changing it back to a character using the ord() and chr() functions. This does not seem to work with Python 3, and I was wondering how I could use encoding="utf-8" to make it work without altering too much of my code. Below is what my code looks like.

    for x in range(0, len(content)):
        output = chr(ord(content[x])+ord(key[x])%256)
        crypt = crypt + output

    new.write(crypt)

The error I get is

    Traceback (most recent call last):
      File "C:\Users\\Desktop\programing\Python Files\Encryptor and Decryptor\encryption.py", line 26, in <module>
        new.write(crypt)
      File "C:\Users\\AppData\Local\Programs\Python\Python37-32\lib\encodings\cp1252.py", line 19, in encode
        return codecs.charmap_encode(input,self.errors,encoding_table)[0]
    UnicodeEncodeError: 'charmap' codec can't encode character '\x93' in position 0: character maps to <undefined>
 
It looks like you’ve made an invalid UTF-8 character in your adjusted string of text. UTF-8 is a multibyte character encoding with a specific bit-pattern format.

So you need to change your scheme.

https://en.m.wikipedia.org/wiki/UTF-8
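
One way to see which codec is rejecting the character is to try encoding it by hand. The traceback above names cp1252.py, so the sketch below assumes (from the traceback alone) that the output file was opened with the Windows default encoding rather than UTF-8. The helper name `can_encode` is made up for this illustration.

```python
def can_encode(ch, codec):
    """Return True if the character can be encoded with the named codec."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


ch = chr(0x93)  # the character named in the traceback

# UTF-8 uses a multibyte pattern for code points above 127, so U+0093
# is written as two bytes.
print(ch.encode("utf-8"))

# cp1252 has no mapping for the code point U+0093, which matches the
# "character maps to <undefined>" message.
print(can_encode(ch, "utf-8"), can_encode(ch, "cp1252"))
```

If that assumption holds, opening the output file with open(path, "w", encoding="utf-8") would probably make the write succeed, although the scheme would still be operating on code points rather than bytes.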
 
NihalRi said:
I had a program that encrypted a text file by converting a character to its Unicode value, altering it and then changing it back to a character using the ord() and chr() methods.

Since you are combining the content and the key modulo 256, I strongly suspect that your Python 2 program was converting a character to its byte ordinal value, i.e., that you were operating on Python 2 strings, not Python 2 unicode objects. If you really want to operate on unicode objects (which I don't recommend for encryption/decryption--see below), you cannot assume that your ordinals (the integers you operate on before converting back to strings) will be 255 or less.
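
A small illustration of that point, as a sketch only: in Python 3, ord() returns a Unicode code point, which is not limited to 0..255. The helper `describe` and the sample characters are invented for this example.

```python
def describe(ch):
    """Return (code point, UTF-8 byte values) for a single character."""
    return ord(ch), list(ch.encode("utf-8"))


for ch in ["A", "é", "€"]:
    print(repr(ch), describe(ch))

# Combining two ordinals the way the posted loop does. Note that % binds
# tighter than +, so only the key ordinal is reduced modulo 256.
combined = ord("é") + ord("k") % 256
print(combined)        # 340, outside the 0..255 byte range
print(combined % 256)  # 84 once the whole sum is wrapped
```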

For encrypting Unicode, you shouldn't operate directly on the Unicode anyway. You should encode it into a byte string (a Python 3 "bytes" object) using your chosen encoding (utf-8 in this case), then encrypt the byte string (where now you can combine the byte and the key modulo 256 without a problem since all ordinals will be 255 or less). When you want to decrypt, you decrypt the byte string, then decode it into Unicode using your chosen encoding.
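
A minimal sketch of that encode, encrypt, decrypt, decode sequence follows. It is not the original program: the function names are hypothetical, and repeating the key with itertools.cycle is an assumption (the posted loop indexes key[x] directly, which implies a key at least as long as the content). It is a toy additive cipher for illustration, not something to rely on for real secrecy.

```python
from itertools import cycle


def encrypt_text(plaintext, key, encoding="utf-8"):
    """Encode text to bytes, then add key bytes modulo 256."""
    data = plaintext.encode(encoding)
    key_bytes = key.encode(encoding)
    if not key_bytes:
        raise ValueError("key must not be empty")
    # Every value is 0..255, so the sum wrapped modulo 256 is a valid byte.
    return bytes((b + k) % 256 for b, k in zip(data, cycle(key_bytes)))


def decrypt_text(ciphertext, key, encoding="utf-8"):
    """Subtract key bytes modulo 256, then decode the bytes back to text."""
    key_bytes = key.encode(encoding)
    if not key_bytes:
        raise ValueError("key must not be empty")
    plain = bytes((b - k) % 256 for b, k in zip(ciphertext, cycle(key_bytes)))
    return plain.decode(encoding)


message = "naïve café, 5 €"
secret = encrypt_text(message, "key")
print(secret)                       # a bytes object, not a str
print(decrypt_text(secret, "key"))  # round-trips to the original text
```

Because the ciphertext is bytes rather than text, it would be written with open(path, "wb") and read back with open(path, "rb"), which sidesteps the text codec entirely.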
 
NihalRi said:
I had a program that encrypted a text file by converting a character to its Unicode value, altering it and then changing it back to a character using the ord() and chr() methods.

Since we're talking about encryption, I should probably also say that if you're trying to do encryption yourself for anything other than a private hobby program or a class assignment, you shouldn't. You should use available encryption libraries written by people who are experts at avoiding all of the many, many pitfalls that lurk in the world of encryption. For Python programs that need encryption, I have found the bcrypt Python library to be a good choice.
 
