Python 3.0, encoding="utf-8" for encryption

NihalRi · Nov 18, 2018

I had recently upgraded my version of Python from 2 to 3. I had a program that encrypted a text file by converting a character to its Unicode value, altering it and then changing it back to a character using the ord() and chr() methods. This does not seem to work with Python 3, and I was wondering how I could use encoding="utf-8" to make it work without altering too much of my code. Below is what my code looks like.

for x in range(0, len(content)):
    output = chr(ord(content[x])+ord(key[x])%256)
    crypt = crypt + output

new.write(crypt)

The error I would get is

Traceback (most recent call last):
  File "C:\Users\\Desktop\programing\Python Files\Encryptor and Decryptor\encryption.py", line 26, in <module>
    new.write(crypt)
  File "C:\Users\\AppData\Local\Programs\Python\Python37-32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\x93' in position 0: character maps to <undefined>

jedishrfu · Nov 18, 2018

It looks like you’ve made an invalid UTF-8 character in your adjusted string of text. UTF-8 is a multibyte character encoding with a specific bit-pattern format.

So you need to change your scheme.

https://en.m.wikipedia.org/wiki/UTF-8

PeterDonis · Nov 18, 2018

NihalRi said:

I had a program that encrypted a text file by converting a character to its Unicode value, altering it and then changing it back to a character using the ord() and chr() methods.

Since you are combining the content and the key modulo 256, I strongly suspect that your Python 2 program was converting a character to its byte ordinal value, i.e., that you were operating on Python 2 strings, not Python 2 unicode objects. If you really want to operate on unicode objects (which I don't recommend for encryption/decryption--see below), you cannot assume that your ordinals (the integers you operate on before converting back to strings) will be 255 or less.
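To illustrate the point about ordinals, here is a small sketch. The euro sign is just an example character I picked; chr(0x93) is the character named in the traceback above. It suggests why the write fails when the file is opened with the Windows default cp1252 encoding, though I'm assuming that is how the file was opened since the open() call isn't shown.

```python
# In Python 3, str holds Unicode code points, so ord() is not limited to 0..255.
euro_ordinal = ord("\u20ac")  # euro sign, chosen only as an example

# The character from the traceback: a code point that cp1252 has no byte for.
ch = chr(0x93)

try:
    ch.encode("cp1252")
    cp1252_ok = True
except UnicodeEncodeError:
    # Same failure the traceback reports from the 'charmap' codec.
    cp1252_ok = False

# UTF-8 can encode any code point, but this one takes two bytes, not one.
utf8_bytes = ch.encode("utf-8")

print(euro_ordinal, cp1252_ok, utf8_bytes)
```

So passing encoding="utf-8" to open() would probably make the error go away, but the output would no longer be one byte per character, which is another reason to work on bytes instead.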

For encrypting Unicode, you shouldn't operate directly on the Unicode anyway. You should encode it into a byte string (a Python 3 "bytes" object) using your chosen encoding (utf-8 in this case), then encrypt the byte string (where now you can combine the byte and the key modulo 256 without a problem since all ordinals will be 255 or less). When you want to decrypt, you decrypt the byte string, then decode it into Unicode using your chosen encoding.
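A minimal sketch of that approach might look like the following. The function names encrypt and decrypt are mine, and I'm assuming the key should repeat when it is shorter than the content (the original code indexes key[x] directly, so it may have required a key at least as long as the text). This is only the same toy additive scheme, not real encryption.

```python
def encrypt(text: str, key: str) -> bytes:
    """Encode text to UTF-8 bytes, then add the key bytes modulo 256."""
    data = text.encode("utf-8")
    key_bytes = key.encode("utf-8")
    if not key_bytes:
        raise ValueError("key must not be empty")
    # Assumption: cycle through the key if it is shorter than the data.
    return bytes(
        (b + key_bytes[i % len(key_bytes)]) % 256 for i, b in enumerate(data)
    )


def decrypt(data: bytes, key: str) -> str:
    """Subtract the key bytes modulo 256, then decode the result as UTF-8."""
    key_bytes = key.encode("utf-8")
    if not key_bytes:
        raise ValueError("key must not be empty")
    plain = bytes(
        (b - key_bytes[i % len(key_bytes)]) % 256 for i, b in enumerate(data)
    )
    return plain.decode("utf-8")


message = "h\u00e9llo w\u00f6rld \u20ac"  # includes non-ASCII characters
secret = encrypt(message, "key")
restored = decrypt(secret, "key")
print(restored == message)
```

Since the encrypted result is a bytes object, it should be written to a file opened in binary mode, e.g. open(path, "wb"), and read back with "rb". No encoding argument is needed for the file in that case, because the encoding step happens before encryption.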

PeterDonis · Nov 18, 2018

NihalRi said:

I had a program that encrypted a text file by converting a character to its Unicode value, altering it and then changing it back to a character using the ord() and chr() methods.

Since we're talking about encryption, I should probably also say that if you're trying to do encryption yourself for anything other than a private hobby program or a class assignment, you shouldn't. You should use available encryption libraries written by people who are experts at avoiding all of the many, many pitfalls that lurk in the world of encryption. For Python programs that need encryption, I have found the bcrypt Python library to be a good choice.
