How does Python tell that an integer is actually an integer?

  • Context: Python
  • Thread starter: Arman777
  • Tags: Integer, Python
SUMMARY

The discussion centers on how Python differentiates between integers and strings even though both are ultimately stored as bits. Python is dynamically typed: a type belongs to each object rather than to the variable name, so no explicit declaration is needed, unlike in C. The examples show that although the integer 97 and the string 'a' can yield the same bit pattern, Python keeps type information with each object and can therefore tell them apart. The conversation also touches on the internal representation of objects: everything in Python is an object, and each object carries metadata such as its type and size.

PREREQUISITES
  • Understanding of Python 3.x dynamic typing
  • Familiarity with binary representation of data
  • Knowledge of Python's object model and memory management
  • Basic concepts of data types in programming languages
NEXT STEPS
  • Explore Python's dynamic typing and how it differs from static typing in languages like C
  • Investigate the internal representation of objects in Python, focusing on PEP 393
  • Learn about the sys.getsizeof() function to understand memory usage of different data types
  • Study the implications of type information in Python's memory management and performance
USEFUL FOR

Python developers, computer science students, and anyone interested in understanding Python's data type handling and memory management.
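
To make the summary's point concrete, here is a minimal sketch, assuming CPython 3 (the names n and s are only illustrative). It shows that the integer 97 and the string 'a' can print the same bit pattern while remaining objects of different types:

```python
import sys

n = 97      # an int object
s = "a"     # a str object holding one character

# Each object records its own type; Python never has to guess from the bits.
print(type(n))        # <class 'int'>
print(type(s))        # <class 'str'>

# The character 'a' has code point 97, so both can show the same bit pattern...
print(bin(n))         # 0b1100001
print(bin(ord(s)))    # 0b1100001

# ...but the objects are distinct and do not compare equal.
print(n == s)         # False

# Both objects are larger than one byte because each carries a header
# (reference count, type pointer, size). Exact numbers vary by build.
print(sys.getsizeof(n), sys.getsizeof(s))
```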

  • #31
See this code please

https://codeshare.io/EBkJrO

Edit: I have fixed the Example in byteArray function
 
  • Wow
Likes   Reactions: pbuk
  • #32
PeterDonis said:
In Python 3, the "string" data type (i.e., str) isn't bytes, it's Unicode. The Python 3 "byte string" data type is bytes.
I meant this

Code:
>>> bin(97)
'0b1100001'
>>> type(bin(97))
<class 'str'>
>>>

I mean I am doing the XOR operation on these strings, and of course there are intermediate steps.
 
  • #33
PeterDonis said:
This doesn't make sense. One byte is one byte, whether you call it a "string" or something else.
For instance, you can create an XOR encryption that takes the key as bits (a binary key), or you can use an ASCII key (see this site: https://www.dcode.fr/xor-cipher).

PeterDonis said:
What does make a difference, though, is that if you store the key as a single string, you have one string object in Python, which will take storage equal to the number of bytes in the string plus the Python overhead for one string object. Whereas if you store the key as a bunch of one-byte strings, you will need storage equal to the number of bytes in the string, plus the Python overhead for the same number of Python string objects (in your case, ten) instead of just one string object. That can be a lot more storage.
In my previous post, I was trying to say that you'll need a lot of storage if you store the key as binary. But an ASCII key will be more useful in terms of storage.

If your message is "password," and if you want to make the XOR encryption "unbreakable," you need to create a random binary key that is 64 bits long.

For instance, I have generated a random binary key. So for an unbreakable message you'll need to store this.

0000110011010000111111100100101110110010010011110110110000101011

but if you convert this into ASCII (or as ASCII key), you'll need to store just

♀ÐþK²Ol+

Here, the only problem is that the binary representation of the key can also represent unprintable ASCII characters. So when you try to convert a random binary key into an ASCII key, you'll see really random characters, most of the time unprintable ones. That also applies if you want to decrypt them.

So even though a binary key is advantageous in terms of encryption (since you don't have to worry about whether it is printable or not), it's disadvantageous in terms of storage (of course that is really personal, but it's true for me).
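
The storage comparison discussed in this post can be checked directly. The following is only a sketch, assuming CPython 3; the names bits, key_int, key_bytes and pieces are illustrative, and the sizes reported by sys.getsizeof() differ between versions, so no exact numbers are claimed for them:

```python
import sys

bits = "0000110011010000111111100100101110110010010011110110110000101011"

key_int = int(bits, 2)                    # parse the 0/1 text as a base-2 number
key_bytes = key_int.to_bytes(8, "big")    # the same 64 bits packed into 8 bytes

print(len(bits))        # 64 characters, one per bit
print(len(key_bytes))   # 8
print(key_bytes)        # b'\x0c\xd0\xfeK\xb2Ol+'

# Object sizes in bytes (CPython-specific; exact values vary by version).
print(sys.getsizeof(bits), sys.getsizeof(key_bytes))

# One object per byte multiplies the per-object overhead, as the quote says.
pieces = [key_bytes[i:i + 1] for i in range(len(key_bytes))]
print(sum(sys.getsizeof(p) for p in pieces))
```

The 64-character text and the 8-byte object hold the same key; only the 8-byte form stores one bit of key per bit of payload.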
 
  • #34
Arman777 said:
See this code please

https://codeshare.io/EBkJrO

Edit: I have fixed the Example in byteArray function
Wow, that is some pretty complicated code for doing something pretty simple. I think the best learning point you can take away from this is 'representing binary data as a string of 0's and 1's is a really bad idea'.

A string is a better way (and in Python we have an even better way, bytes, as @PeterDonis said), but as you have seen this can lead to unprintable characters. There are a number of ways of dealing with this: one of the most common is hexadecimal: any 8 bit value can be represented by two characters in the range [0..9,a..f].

So if you want a readable binary key and encoded message, you could display them in hex; however for internal working it makes a lot more sense to use bytes, or a string (using all values 0-255) if you must, or even a list or array of integers. Anything but a string of '1's and '0's.
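
A short sketch of that split between internal and external form, assuming the eight byte values of the key posted earlier in the thread: bytes is used for the working data, and hex only for display.

```python
key = bytes([0x0C, 0xD0, 0xFE, 0x4B, 0xB2, 0x4F, 0x6C, 0x2B])

shown = key.hex()                 # two hex digits per byte, always printable
print(shown)                      # 0cd0fe4bb24f6c2b

restored = bytes.fromhex(shown)   # parse the hex text back into raw bytes
print(restored == key)            # True

# A list of integers is another workable internal form.
print(list(key))                  # [12, 208, 254, 75, 178, 79, 108, 43]
```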
 
  • Like
Likes   Reactions: Arman777 and Vanadium 50
  • #35
Well, I am not a mind reader, and no one was telling me to do something like this, which I was not aware of until now.

Instead of turning 67 and 45 into bytes and then doing XOR bitwise, I could have just done

bin(67 ^ 45)

which gives the correct answer. This approach will definitely shorten my code.

You guys are giving hints, but I cannot understand something that I don't know...

Hex is a good approach
 
  • #36
Arman777 said:
I meant this
Then you are using the wrong operation. You don't want to convert an integer to a string. You want to convert a string (the message) into an integer (or a sequence of integers).

Arman777 said:
In my previous post, I was trying to say that you'll need a lot of storage if you store the key as binary. But an ASCII key will be more useful in terms of storage.
You are very confused.

First, as I have already pointed out, "binary" is not a Python data type. What you are calling "binary" are Python strings that contain "0" and "1" characters that can be interpreted as the bits in a binary representation of an integer, but that does not mean you should actually use this for any kind of arithmetical or logical operation. That's not what the bin() function is for.

You say this bin() conversion is an intermediate step, but I don't think you've fully thought through what you are doing. If you already have an integer representing a character in the message and an integer representing a character in the key, you can just xor them directly in Python. There is no need to convert them into these "binary" strings.

Arman777 said:
If your message is "password," and if you want to make the XOR encryption "unbreakable," you need to create a random binary key that is 64 bits long.
Yes, and this is a 64-bit integer. It's not a string of 64 bytes that are either "0" or "1".

Arman777 said:
For instance, I have generated a random binary key. So for an unbreakable message you'll need to store this.

0000110011010000111111100100101110110010010011110110110000101011

but if you convert this into ASCII (or as ASCII key), you'll need to store just

♀ÐþK²Ol+
First, half of those characters (♀, Ð, þ, ²) aren't ASCII characters, so the term "ASCII" is incorrect.

Second, since what you have is a 64-bit integer, you should just represent it as a Python integer and do Python operations on it directly as an integer.

Arman777 said:
Here, the only problem is that the binary representation of the key can also represent the unprintable ASCII characters.
The "unprintable" characters aren't ASCII characters to begin with, as noted above. But more important, viewing them as "characters" makes no sense. As above, you have a 64-bit integer. You can represent integers directly in Python as integers.

Arman777 said:
Instead of turning 67 and 45 into bytes and then doing XOR bitwise, I could have just done

bin(67 ^ 45)

which gives the correct answer. This approach will definitely shorten my code
Exactly! Except that there is no need for the bin() function anywhere. You have an integer; if you want to convert it to a character, you call chr() on it, not bin(). (ord() goes the other way, from a character to its integer code.)

Arman777 said:
You guys are giving hints but I cannot understand something that i don't know...
What are you talking about? You just did understand it (in what I quoted above and responded to with "Exactly!").

Arman777 said:
Hex is a good approach
What? Why are you throwing away the right answer right after you found it?
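
The points in this post can be tried directly in Python 3. This is a sketch only: the key is the 64-bit value posted earlier in the thread written in hex, and the names x, m, cipher and plain are illustrative.

```python
# XOR works directly on integers; no bit strings are needed.
x = 67 ^ 45
print(x)          # 110
print(chr(x))     # n    (chr: integer -> character)
print(ord("n"))   # 110  (ord: character -> integer)

# The whole 8-character message treated as one 64-bit integer.
message = "password"
key = 0x0CD0FE4BB24F6C2B

m = int.from_bytes(message.encode("ascii"), "big")
cipher = m ^ key

# XOR with the same key again restores the original integer.
plain = (cipher ^ key).to_bytes(len(message), "big").decode("ascii")
print(plain)      # password
```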
 
  • Like
Likes   Reactions: Arman777
  • #37
PeterDonis said:
What you are calling "binary" are Python strings that contain "0" and "1" characters that can be interpreted as the bits in a binary representation of an integer
Again, the OP seems to be confused about the difference between a digit character, like '0', and a number, like 0.

The character '0' has an ASCII code of 48 (or 0x30 in hex or 0b110000 in binary). The number 0 has all its bits cleared to 0.
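
The distinction is easy to verify in a Python 3 session; a minimal sketch:

```python
# The character '0' is stored as its code point, 48.
print(ord("0"))          # 48
print(hex(ord("0")))     # 0x30
print(bin(ord("0")))     # 0b110000

# int() parses the text; it does not reinterpret the stored bits.
print(int("0"))          # 0

# The character and the number are different objects of different types.
print("0" == 0)          # False
print((0).bit_length())  # 0, since the number zero has no set bits
```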
 
  • Like
Likes   Reactions: Arman777
  • #38
PeterDonis said:
Then you are using the wrong operation. You don't want to convert an integer to a string. You want to convert a string (the message) into an integer (or a sequence of integers).
Yes you are right I have realized that later on.
PeterDonis said:
You say this bin() conversion is an intermediate step, but I don't think you've fully thought through what you are doing. If you already have an integer representing a character in the message and an integer representing a character in the key, you can just xor them directly in Python. There is no need to convert them into these "binary" strings.
Yes, you are right. The problem was that I was not aware that kind of operation was available in Python. That is why I was trying strange stuff.
PeterDonis said:
Exactly! Except that there is no need for the bin() function anywhere. You have an integer; if you want to convert it to a character, you call chr() on it, not bin().
Yes I have also realized that
PeterDonis said:
What are you talking about? You just did understand it (in what I quoted above and responded to with "Exactly!").
Well, it was a late enlightenment for me. In my head I kept asking myself what these guys wanted.
PeterDonis said:
What? Why are you throwing away the right answer right after you found it?
I was just thinking that it could be a good idea. But I can also change the encrypted message to a list of integers (or hex), which seems more reasonable. Such as

encrypted_message = encryptXOR('I have a dream', '999')
decrypted_message = decryptXOR(encrypted_message, '999')


print(encrypted_message)
print(decrypted_message)


will output

[112, 25, 81, 88, 79, 92, 25, 88, 25, 93, 75, 92, 88, 84]
I have a dream
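
The definitions of encryptXOR and decryptXOR are not shown in the thread. The following is a hypothetical reconstruction, assuming the key is simply repeated over the message and each character is XORed by code point; it is not necessarily the poster's code, but it does reproduce the output quoted above.

```python
from itertools import cycle
from typing import List


def encryptXOR(message: str, key: str) -> List[int]:
    """XOR each character's code point with the key's, repeating the key."""
    return [ord(m) ^ ord(k) for m, k in zip(message, cycle(key))]


def decryptXOR(cipher: List[int], key: str) -> str:
    """Undo encryptXOR: XOR with the same repeating key, then back to text."""
    return "".join(chr(c ^ ord(k)) for c, k in zip(cipher, cycle(key)))


encrypted_message = encryptXOR('I have a dream', '999')
decrypted_message = decryptXOR(encrypted_message, '999')

print(encrypted_message)
# [112, 25, 81, 88, 79, 92, 25, 88, 25, 93, 75, 92, 88, 84]
print(decrypted_message)
# I have a dream
```

Note that no bin() call or bit string appears anywhere: ord(), ^ and chr() are enough.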
 
  • #39
PeterDonis said:
Arman777 said:
Hex is a good approach
What? Why are you throwing away the right answer right after you found it?
I think @arman here means that he realizes hex can be useful for external representation of arbitrary 8 bit values; hex is much more compact and easier to read than strings of 0's and 1's and does not suffer the problem of unprintable characters of direct extended ASCII rendering.

The light regarding internal representation does seem to have clicked on :biggrin:
 
  • Like
Likes   Reactions: Arman777
  • #40
pbuk said:
I think @arman here means that he realizes hex can be useful for external representation of arbitrary 8 bit values; hex is much more compact and easier to read than strings of 0's and 1's and does not suffer the problem of unprintable characters of direct extended ASCII rendering.

The light regarding internal representation does seem to have clicked on :biggrin:
Yes exactly. I was talking about external representation
 
  • #41
I think Python determines types at runtime unlike other languages like C++.
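
A small sketch of what "at runtime" means here, assuming Python 3: the name x has no type of its own, and each operation is looked up on whatever object the name currently refers to.

```python
x = 97
print(type(x).__name__)   # int

x = "a"                   # same name, now bound to a str object
print(type(x).__name__)   # str

# Operations are resolved from the object's type when the line runs.
print(97 + 1)             # 98
try:
    "a" + 1               # str + int is rejected at run time
except TypeError as exc:
    print("TypeError:", exc)
```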
 
  • #42
Arman777 said:
Yes exactly. I was talking about external representation
That goes a long way towards explaining why I spent the last couple of pages thinking that at least one of us is an idiot. Sure, you can do the whole thing in display types; it's a CLI; machine efficiency isn't an issue.

What course/textbook? What's the title of the chapter? What's the exact wording of the problem?
(i.e., "summary"),

What coding, language, or application constructs do you think would be useful?
(i.e., "equations")

What have you accomplished, and how are you stuck?
 
  • #43
hmmm27 said:
one of us is an idiot
No one has to be an idiot. It's important to learn from our mistakes, and none of us knows everything.
hmmm27 said:
What have you accomplished, and how are you stuck ?
I have finished the coding actually. I even created a GUI

You can check here

https://github.com/seVenVo1d/random/tree/master/XOR%20Encryption

I could have shared the code here but the site is giving errors.
 
  • #44
Interesting: the word 'pointer' was used only once in this whole thread, and I didn't see registers mentioned either. Study an introduction to assembly language; then the OP should be able to understand this easily.
 
  • Skeptical
Likes   Reactions: pbuk
  • #45
How do you think a knowledge of assembly language and CPU architecture will help the OP understand how Python remembers that a variable holds an integer value?
 
  • #46
Arman777 said:
I have finished the coding actually. I even created a GUI

https://github.com/seVenVo1d/random/tree/master/XOR%20Encryption

Your link is wrong.

I looked at the non-gui code.

This actually works with Unicode, not just ASCII. You can happily type accented letters such as å, Chinese characters such as 龙, and emoji such as 🔦 into it, and it'll still work. One of the benefits of Python 3.

As a further exercise, you might want to change the output into some kind of more compact format. A ciphertext of "A@E3" might be more useful and portable than "0x1F,0x02,0x3435".
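
One possible route to a more compact format, sketched here under the assumption that the text is encoded to UTF-8 before the XOR step (the helper xor_bytes is hypothetical and is not taken from the linked repository). Working on bytes keeps every ciphertext value within 0-255, whereas XORing code points directly can produce larger numbers such as the 0x3435 above.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


plain = "å龙🔦"
key = b"999"

# Code points can exceed one byte...
print([ord(ch) > 255 for ch in plain])       # [False, True, True]

# ...but the UTF-8 encoding is a plain sequence of byte values.
cipher = xor_bytes(plain.encode("utf-8"), key)
print(max(cipher) <= 255)                    # True

# Applying the same XOR again and decoding restores the text.
print(xor_bytes(cipher, key).decode("utf-8") == plain)   # True
```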
 
  • #47
lyuc said:
Your link is wrong.
That's possible. I change the names and the code itself quite often.
lyuc said:
As a further exercise, you might want to change the output into some kind of more compact format.
I have thought about it. I could limit the output to printable characters only or, as you have pointed out, I could remove the '\x' in front of them.
lyuc said:
I looked at the non-gui code.
Thanks for checking it out.
 
  • #48
The lazy way to do it would be to take the raw bytes and then use base64 encoding, or uuencode.
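
A minimal sketch of the base64 route using Python's standard base64 module. The byte values are the ciphertext list posted earlier in the thread; the names cipher and text are illustrative.

```python
import base64

# Ciphertext from the earlier example, held as raw bytes.
cipher = bytes([112, 25, 81, 88, 79, 92, 25, 88, 25, 93, 75, 92, 88, 84])

# Base64 maps every 3 bytes to 4 printable ASCII characters.
text = base64.b64encode(cipher).decode("ascii")
print(text)

# Decoding recovers the exact bytes.
print(base64.b64decode(text) == cipher)   # True
```

Base64 output is about a third longer than the raw bytes, compared with double the length for hex.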
 
