Python How does Python tell that an integer is actually an integer?

  • Thread starter: Arman777
  • Tags: Integer, Python
The discussion revolves around the differences in how Python and C handle data types, particularly strings and integers, and the implications for binary representation and memory storage. It highlights Python's dynamic typing, where the interpreter determines the data type at runtime, contrasting with C's requirement for explicit type declarations. Participants explore the use of the `bin()` function, clarifying that it converts integers to their binary string representation but does not reflect how Python stores these values internally. The conversation also touches on XOR encryption, emphasizing that operations should be performed on integers rather than binary string representations. The importance of using appropriate data types, such as bytes for binary data, is underscored, with suggestions for using hexadecimal for more compact and readable representations. Overall, the thread illustrates common misconceptions about data types and memory management in Python, while providing insights into effective coding practices.
  • #31
See this code please

https://codeshare.io/EBkJrO

Edit: I have fixed the Example in byteArray function
 
  • Wow
Likes pbuk
  • #32
PeterDonis said:
In Python 3, the "string" data type (i.e., str) isn't bytes, it's Unicode. The Python 3 "byte string" data type is bytes.
I meant this

Code:
>>> bin(97)
'0b1100001'
>>> type(bin(97))
<class 'str'>
>>>

I mean that I am doing the XOR operation on strings like these, and of course there are intermediate steps.
 
  • #33
PeterDonis said:
This doesn't make sense. One byte is one byte, whether you call it a "string" or something else.
For instance you can create a XOR encryption that can take the key as bits (a binary key) or you can create an ASCII key (see this site https://www.dcode.fr/xor-cipher)

PeterDonis said:
What does make a difference, though, is that if you store the key as a single string, you have one string object in Python, which will take storage equal to the number of bytes in the string plus the Python overhead for one string object. Whereas if you store the key as a bunch of one-byte strings, you will need storage equal to the number of bytes in the string, plus the Python overhead for the same number of Python string objects (in your case, ten) instead of just one string object. That can be a lot more storage.
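
A rough way to see the per-object overhead described above is to compare sys.getsizeof for one string against the sum over its one-character pieces. This is only a sketch: the key value is made up for illustration, the exact byte counts depend on the Python implementation and version, and CPython may share some small string objects, so the numbers are indicative rather than an exact memory bill.

```python
import sys

key = "abcdefghij"          # hypothetical 10-character key, one str object
pieces = list(key)          # the same key as ten 1-character str objects

# Size reported for the single string object.
size_single = sys.getsizeof(key)

# Sum of the sizes reported for each 1-character string object.
size_pieces = sum(sys.getsizeof(ch) for ch in pieces)

print(size_single, size_pieces)
```

Each small string carries its own object header, which is why the second number comes out much larger than the first.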
In my previous post, I was trying to say that you'll need a lot of storage if you store the key as binary. But an ASCII key will be more useful in terms of storage.

If your message is "password," and if you want to make the XOR encryption "unbreakable," you need to create a random binary key that is 64 bits long.

For instance, I have generated a random binary key. So for an unbreakable message you'll need to store this.

0000110011010000111111100100101110110010010011110110110000101011

but if you convert this into ASCII (or as ASCII key), you'll need to store just

♀ÐþK²Ol+

Here, the only problem is that the binary representation of the key can also represent unprintable ASCII characters. So when you try to convert a random binary key into an ASCII key, you'll see really random characters, and most of the time unprintable ones. The same applies if you want to decrypt them.

So even though creating a binary key is advantageous in terms of encryption (since you don't have to worry about whether it's printable or not), it's disadvantageous in terms of storage (of course that is really personal, but that's true for me).
 
  • #34
Arman777 said:
See this code please

https://codeshare.io/EBkJrO

Edit: I have fixed the Example in byteArray function
Wow, that is some pretty complicated code for doing something pretty simple. I think the best learning point you can take away from this is 'representing binary data as a string of 0's and 1's is a really bad idea'.

A string is a better way (and in Python we have an even better way, bytes, as @PeterDonis said), but as you have seen this can lead to unprintable characters. There are a number of ways of dealing with this: one of the most common is hexadecimal: any 8 bit value can be represented by two characters in the range [0..9,a..f].

So if you want a readable binary key and encoded message, you could display them in hex; however for internal working it makes a lot more sense to use bytes, or a string (using all values 0-255) if you must, or even a list or array of integers. Anything but a string of '1's and '0's.
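
As a small sketch of the hex suggestion, the 64-bit key from post #33 can be held internally as a bytes object and shown externally as hex. The variable names here are illustrative only; bytes.hex() and bytes.fromhex() are standard methods.

```python
# The eight byte values of the 64-bit key from post #33.
key = bytes([0x0C, 0xD0, 0xFE, 0x4B, 0xB2, 0x4F, 0x6C, 0x2B])

# External, readable form: two hex characters per byte.
hex_text = key.hex()
print(hex_text)  # 0cd0fe4bb24f6c2b

# The hex text converts back to the same bytes without loss.
restored = bytes.fromhex(hex_text)
print(restored == key)  # True
```

Sixteen hex characters replace sixty-four '0'/'1' characters, and none of them is unprintable.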
 
  • Like
Likes Arman777 and Vanadium 50
  • #35
Well, I am not a mind reader, and no one was telling me to do something like this, which I was not aware of until now.

Instead of turning 67 and 45 into bytes and then doing the XOR bitwise, I could have just done

bin(67 ^ 45)

which gives the correct answer. This approach will definitely shorten my code.

You guys are giving hints, but I cannot understand something that I don't know...

Hex is a good approach
 
  • #36
Arman777 said:
I meant this
Then you are using the wrong operation. You don't want to convert an integer to a string. You want to convert a string (the message) into an integer (or a sequence of integers).

Arman777 said:
In my previous post, I was trying to say that you'll need a lot of storage if you store the key as binary. But an ASCII key will be more useful in terms of storage.
You are very confused.

First, as I have already pointed out, "binary" is not a Python data type. What you are calling "binary" are Python strings that contain "0" and "1" characters that can be interpreted as the bits in a binary representation of an integer, but that does not mean you should actually use this for any kind of arithmetical or logical operation. That's not what the bin() function is for.

You say this bin() conversion is an intermediate step, but I don't think you've fully thought through what you are doing. If you already have an integer representing a character in the message and an integer representing a character in the key, you can just xor them directly in Python. There is no need to convert them into these "binary" strings.
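
A minimal sketch of XORing the integers directly, with no "binary" strings in between. The key below is a made-up example value, not one from the thread, and the variable names are illustrative.

```python
message = "password"
key = "l3Kq9!xZ"  # hypothetical key, same length as the message

# ord() gives the integer for each character; ^ XORs two integers directly.
cipher = [ord(m) ^ ord(k) for m, k in zip(message, key)]

# XORing with the same key again restores the original integers.
plain = "".join(chr(c ^ ord(k)) for c, k in zip(cipher, key))

print(cipher)
print(plain)  # password
```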

Arman777 said:
If your message is "password," and if you want to make the XOR encryption "unbreakable," you need to create a random binary key that is 64 bits long.
Yes, and this is a 64-bit integer. It's not a string of 64 bytes that are either "0" or "1".

Arman777 said:
For instance, I have generated a random binary key. So for an unbreakable message you'll need to store this.

0000110011010000111111100100101110110010010011110110110000101011

but if you convert this into ASCII (or as ASCII key), you'll need to store just

♀ÐþK²Ol+
First, most of those characters aren't ASCII characters, so the term "ASCII" is incorrect.

Second, since what you have is a 64-bit integer, you should just represent it as a Python integer and do Python operations on it directly as an integer.
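
One possible sketch of treating the key as a single Python integer. The bit string is the one quoted above; packing the message into an integer with int.from_bytes and the "big" byte order are assumptions made here for illustration, as is using secrets.randbits to generate a fresh key.

```python
import secrets

# The quoted key, parsed once from its base-2 text into a plain int.
bits = "0000110011010000111111100100101110110010010011110110110000101011"
key = int(bits, 2)
print(hex(key))  # 0xcd0fe4bb24f6c2b

# Pack the 8-byte message into one integer and XOR the two integers.
message = int.from_bytes(b"password", "big")
cipher = message ^ key

# XOR with the same key recovers the message; unpack back to 8 bytes.
recovered = (cipher ^ key).to_bytes(8, "big")
print(recovered)  # b'password'

# A fresh random 64-bit key can be drawn directly as an integer.
random_key = secrets.randbits(64)
```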

Arman777 said:
Here, the only problem is that the binary representation of the key can also represent the unprintable ASCII characters.
The "unprintable" characters aren't ASCII characters to begin with, as noted above. But more important, viewing them as "characters" makes no sense. As above, you have a 64-bit integer. You can represent integers directly in Python as integers.

Arman777 said:
Instead of turning 67 and 45 into bytes and then doing the XOR bitwise, I could have just done

bin(67 ^ 45)

which gives the correct answer. This approach will definitely shorten my code.
Exactly! Except that there is no need for the bin() function anywhere. You have an integer; if you want to convert it to a character, you call chr() on it, not bin().
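
For reference, a short sketch of the two built-ins involved: chr() maps an integer code point to a one-character string, and ord() goes the other way. The variable name is illustrative.

```python
code = 67 ^ 45       # XOR of two integers is an integer

print(code)          # 110
print(chr(code))     # n    (integer -> character)
print(ord("n"))      # 110  (character -> integer)
```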

Arman777 said:
You guys are giving hints but I cannot understand something that i don't know...
What are you talking about? You just did understand it (in what I quoted above and responded to with "Exactly!").

Arman777 said:
Hex is a good approach
What? Why are you throwing away the right answer right after you found it?
 
  • Like
Likes Arman777
  • #37
PeterDonis said:
What you are calling "binary" are Python strings that contain "0" and "1" characters that can be interpreted as the bits in a binary representation of an integer
Again, the OP seems to be confused about the difference between a digit character, like '0', and a number, like 0.

The character '0' has an ASCII code of 48 (or 0x30 in hex or 0b110000 in binary). The number 0 has all its bits cleared to 0.
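
The distinction can be checked in the interpreter. This is just a sketch using standard built-ins; the variable names are illustrative.

```python
digit_char = "0"     # a one-character string
number = 0           # an integer

print(ord(digit_char))       # 48
print(hex(ord(digit_char)))  # 0x30
print(bin(ord(digit_char)))  # 0b110000
print(int(digit_char))       # 0
print(digit_char == number)  # False
```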
 
  • Like
Likes Arman777
  • #38
PeterDonis said:
Then you are using the wrong operation. You don't want to convert an integer to a string. You want to convert a string (the message) into an integer (or a sequence of integers).
Yes, you are right. I realized that later on.
PeterDonis said:
You say this bin() conversion is an intermediate step, but I don't think you've fully thought through what you are doing. If you already have an integer representing a character in the message and an integer representing a character in the key, you can just xor them directly in Python. There is no need to convert them into these "binary" strings.
Yes, you are right. The problem was that I was not aware that this kind of operation was available in Python. That was why I was trying strange stuff.
PeterDonis said:
Exactly! Except that there is no need for the bin() function anywhere. You have an integer; if you want to convert it to a character, you call chr() on it, not bin().
Yes, I have also realized that.
PeterDonis said:
What are you talking about? You just did understand it (in what I quoted above and responded to with "Exactly!").
Well, it was a late enlightenment for me. In my head I kept asking myself what these guys wanted.
PeterDonis said:
What? Why are you throwing away the right answer right after you found it?
I was just thinking that it could be a good idea. But I can also change the encrypted message to a list of integers (or hex), which seems more reasonable. Such as

encrypted_message = encryptXOR('I have a dream', '999')
decrypted_message = decryptXOR(encrypted_message, '999')


print(encrypted_message)
print(decrypted_message)


will output

[112, 25, 81, 88, 79, 92, 25, 88, 25, 93, 75, 92, 88, 84]
I have a dream
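
The thread does not show how encryptXOR and decryptXOR are written, so the following is only a guess at one implementation that reproduces the output above. It assumes the key is repeated cyclically over the message and that each character is XORed by code point.

```python
from itertools import cycle


def encryptXOR(message, key):
    """Return a list of integers: each message character XORed with the repeating key."""
    return [ord(m) ^ ord(k) for m, k in zip(message, cycle(key))]


def decryptXOR(codes, key):
    """Reverse encryptXOR: XOR each integer with the repeating key and rebuild the text."""
    return "".join(chr(c ^ ord(k)) for c, k in zip(codes, cycle(key)))


encrypted_message = encryptXOR('I have a dream', '999')
decrypted_message = decryptXOR(encrypted_message, '999')

print(encrypted_message)
print(decrypted_message)
```

Because XOR is its own inverse, the two functions differ only in whether they start from characters or from integers.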
 
  • #39
PeterDonis said:
Arman777 said:
Hex is a good approach
What? Why are you throwing away the right answer right after you found it?
I think @arman here means that he realizes hex can be useful for external representation of arbitrary 8 bit values; hex is much more compact and easier to read than strings of 0's and 1's and does not suffer the problem of unprintable characters of direct extended ASCII rendering.

The light regarding internal representation does seem to have clicked on :biggrin:
 
  • Like
Likes Arman777
  • #40
pbuk said:
I think @arman here means that he realizes hex can be useful for external representation of arbitrary 8 bit values; hex is much more compact and easier to read than strings of 0's and 1's and does not suffer the problem of unprintable characters of direct extended ASCII rendering.

The light regarding internal representation does seem to have clicked on :biggrin:
Yes exactly. I was talking about external representation
 
  • #41
I think Python determines types at runtime, unlike statically typed languages such as C++, where the type of each variable is fixed at compile time.
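
A short sketch of what "determined at runtime" looks like in practice: the type belongs to the object, and a name can be rebound to an object of a different type. The variable names are illustrative.

```python
value = 97
first_type = type(value)     # the object bound to `value` is an int
print(first_type)            # <class 'int'>

value = bin(value)           # rebind the same name to a str object
second_type = type(value)
print(second_type)           # <class 'str'>

print(isinstance(97, int))   # True
```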
 
  • #42
Arman777 said:
Yes exactly. I was talking about external representation
That goes a long way towards explaining why I spent the last couple of pages thinking that at least one of us is an idiot. Sure, you can do the whole thing in display types; it's a CLI; machine efficiency isn't an issue.

What course/textbook? What's the title of the chapter? What's the exact wording of the problem?
(i.e., "summary")

What coding, language, application constructs do you think would be useful?
(i.e., "equations")

What have you accomplished, and how are you stuck?
 
  • #43
hmmm27 said:
one of us is an idiot
No one has to be an idiot. It's important to learn from our mistakes, and none of us knows everything.
hmmm27 said:
What have you accomplished, and how are you stuck ?
I have finished the coding actually. I even created a GUI

You can check here

https://github.com/seVenVo1d/random/tree/master/XOR%20Encryption

I could have shared the code here but the site is giving errors.
 
  • #44
Interesting that the word 'pointer' was used only once in this whole thread, and I didn't see registers mentioned either. If the OP studies an introduction to assembly language, this should be easy to understand.
 
  • Skeptical
Likes pbuk
  • #45
How do you think a knowledge of assembly language and CPU architecture will help the OP understand how Python remembers that a variable holds an integer value?
 
  • #46
Arman777 said:
I have finished the coding actually. I even created a GUI

https://github.com/seVenVo1d/random/tree/master/XOR%20Encryption

Your link is wrong.

I looked at the non-gui code.

This actually works with Unicode, not just ASCII. You can happily type accented letters such as å, Chinese characters such as 龙, and emoji such as 🔦 into it, and it'll still work. One of the benefits of Python 3.
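
A small sketch of why this works, assuming the code XORs code points obtained with ord() (the thread does not show the actual implementation): in Python 3, ord() returns the full Unicode code point, so values above 255 pass through the XOR and back unchanged. The key '999' is reused from post #38; the names are illustrative.

```python
text = "å龙🔦"
key = "999"

# Code points can be far larger than one byte.
codes = [ord(c) ^ ord(k) for c, k in zip(text, key)]
print(codes)

# XORing again with the same key restores the original characters.
back = "".join(chr(c ^ ord(k)) for c, k in zip(codes, key))
print(back == text)  # True
```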

As a further exercise, you might want to change the output into some kind of more compact format. A ciphertext of "A@E3" might be more useful and portable than "0x1F,0x02,0x3435".
 
  • #47
lyuc said:
Your link is wrong.
That's possible. I change the names and the code itself quite often.
lyuc said:
As a further exercise, you might want to change the output into some kind of more compact format.
I have thought about limiting the output to printable characters only or, as you have pointed out, I could remove the '\x' in front of them.
lyuc said:
I looked at the non-gui code.
Thanks for checking it out.
 
  • #48
The lazy way to do it would be to take the raw bytes and then use base64 encoding, or uuencode.
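
A sketch of the base64 route using the standard library. It assumes every ciphertext value fits in one byte (0-255), which holds for the ASCII example from post #38 but not for arbitrary Unicode input; the variable names are illustrative.

```python
import base64

# The ciphertext integers from post #38, packed into raw bytes.
raw = bytes([112, 25, 81, 88, 79, 92, 25, 88, 25, 93, 75, 92, 88, 84])

# base64 turns arbitrary bytes into printable ASCII text.
encoded = base64.b64encode(raw).decode("ascii")
print(encoded)

# Decoding returns exactly the original bytes.
decoded = base64.b64decode(encoded)
print(decoded == raw)  # True
```

Base64 uses four printable characters for every three bytes, so it is more compact than hex while staying safe to copy and paste.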
 
