How does Python tell that an integer is actually an integer?

  • Python
  • Thread starter Arman777
  • #1
2,118
179
Look at this example,

Code:
>>> a = 97
>>> type(a)
<class 'int'>
>>> bin(a)
'0b1100001'
>>> b = ord('a')
>>> b
97
>>> type(b)
<class 'int'>
>>> bin(b)
'0b1100001'

Does this mean that the string 'a' and the integer 97 are stored as the same binary in memory? If so, how can Python tell the difference? As far as I can remember, in C we have to declare the type of a variable before we use it. In Python we don't have to do that. In that case, how does it separate integers from strings?
 

Answers and Replies

  • #2
2,118
179
No offence, but is that supposed to be confusing ?
The first example is. For instance, suppose I give you the binary '0b1100010': does it represent an integer or a string? How does Python understand the difference? The second example was obvious, so I have deleted it.
 
  • #3
hmmm27
Gold Member
777
341
We're getting some edit lag ... I deleted the post you just asked about because of a bit in the first post which I didn't see the first time, and don't see now. Ghosts, or something. [edit: which you then explained you deleted, so whatever].

I'm really not sure how you're confused, though I'm sure I am.

Who gave you that code snippet ? or did you make it up yourself.
 
  • #4
13,030
6,916
Python uses dynamic typing. When you assign a value to a variable, it can determine the datatype at that time.

So while 'a' and 97 may come down to the same bit pattern in memory, Python sets the datatype of x to str (Python has no separate char type) when x = 'a' is used to assign the value.

It is possible to mess with the storage, but then you know the dangers and reap the whirlwind of disaster.
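
A small sketch of what dynamic typing looks like in practice, assuming a CPython 3 interpreter and reusing the variable name x from the post: the type belongs to the object a name currently refers to, so rebinding the name changes what type() reports.

```python
x = 97                      # x refers to an int object
first_type = type(x)
print(first_type)           # <class 'int'>

x = 'a'                     # the same name now refers to a str object
second_type = type(x)
print(second_type)          # <class 'str'>

# Python has no separate char type: a one-character string is still a str.
same_type = type('a') is type('abc')
print(same_type)            # True
```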
 
  • #5
Ibix
Science Advisor
Insights Author
2020 Award
8,307
7,711
Something like C stores very little except the binary value of a variable and relies on type information provided by the programmer in a declaration to know what to do with the data. Thus you can simply cast 'a' (edit: or "\0a" anyway, I think) to integer and get 97.

On the other hand, everything in Python is an object, so it carries around type information as well as the basic binary data. It recognises 97 as an integer when it parses your input and stores it as an int object. At least, that's the way you can think of it. There may be a lot of gory implementation details I'm not aware of.
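
One way to watch the parsing step is the standard ast module. This is only a sketch, assuming Python 3.8 or later (where literals are parsed to ast.Constant nodes); the dictionary name kinds is just for illustration.

```python
import ast

kinds = {}
for source in ("97", "'a'"):
    # Parse the source text as a single expression and inspect the literal node.
    node = ast.parse(source, mode="eval").body
    kinds[source] = type(node.value).__name__
    print(source, "->", type(node).__name__, "holding a", kinds[source])

# At run time, every object also reports its own type.
print(type(97).__name__, type('a').__name__)   # int str
```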
 
  • #6
2,118
179
Who gave you that code snippet ?
I wrote it myself. Look at this example.
Code:
Python 3.9.5 (tags/v3.9.5:0a7dcbd, May  3 2021, 17:27:52) [MSC v.1928 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 97
>>> b = 'a'
>>> bin(a)
'0b1100001'
>>> bin(ord(b))
'0b1100001'
>>>
My question is something like this. We are storing an integer, 97, in a variable called a. We are also storing a string, "a", in another variable b. As far as I know, everything is turned into binary in the process of calculation. So I am thinking that variable a is stored as '0b1100001'. But that is the same as variable b, which happens to be a string.
 
  • #7
Baluncore
Science Advisor
9,696
4,138
So I am thinking that variable a is stored as '0b1100001'. But that is the same as variable b, which happens to be a string.
I see no problem with different variables having the same value, no matter what format it is stored in at the time.
 
  • #8
400
45
Since I did not know what "bin" does, I looked it up. You seem to assume it gives you the representation that Python uses internally to represent an object. I think it doesn't: "The bin() method converts and returns the binary equivalent string of a given integer. If the parameter isn't an integer, it has to implement __index__() method to return an integer." (from: https://www.programiz.com/python-programming/methods/built-in/bin). In other words: In case of variable a, it returns the "binary equivalent string of 97" (whatever that means), and in case of variable b it returns "the binary equivalent string of b.__index__()". Check if b.__index__() returns 97 :).

Bottom line: I don't think that bin() tells you how the Python interpreter stores the variable internally. The first google hit I found on it may be wrong, though.
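
A quick check of what bin() actually hands back, as a sketch under CPython 3. The class Code is a hypothetical wrapper, written only to show the __index__ hook that the quoted description mentions.

```python
text = bin(97)
print(text, type(text).__name__)     # 0b1100001 str

# A str has no __index__ method, so bin() cannot be applied to it directly.
str_has_index = hasattr('a', '__index__')
print(str_has_index)                 # False

class Code:
    """Hypothetical wrapper, only here to show the __index__ hook."""
    def __init__(self, char):
        self.char = char

    def __index__(self):
        return ord(self.char)

wrapped = bin(Code('a'))
print(wrapped)                       # 0b1100001
```

So bin() produces a new string describing an integer value; it says nothing about how the int or str object is laid out in memory.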
 
  • #9
2,118
179
Since I did not know what "bin" does, I looked it up. You seem to assume it gives you the representation that Python uses internally to represent an object. I think it doesn't: "The bin() method converts and returns the binary equivalent string of a given integer. If the parameter isn't an integer, it has to implement __index__() method to return an integer." (from: https://www.programiz.com/python-programming/methods/built-in/bin). In other words: In case of variable a, it returns the "binary equivalent string of 97" (whatever that means), and in case of variable b it returns "the binary equivalent string of b.__index__()". Check if b.__index__() returns 97 :).

Bottom line: I don't think that bin() tells you how the Python interpreter stores the variable internally. The first google hit I found on it may be wrong, though.
This might be the reason.
 
  • #10
2,118
179
Python uses dynamic typing. When you assign a value to a variable it can determine the datatype at that time.
This and
Something like C stores very little except the binary value of a variable and relies on type information provided by the programmer in a declaration to know what to do with the data. Thus you can simply cast 'a' (edit: or "\0a" anyway, I think) to integer and get 97.

On the other hand, everything in python is an object, so carries around type information as well as the basic binary data. It recognises 97 as an integer when it parses your input and stores it as an int object. At least, that's the way you can think of it. There may be a lot of gory implementation details I'm not aware of.
this also makes sense. So I guess there is also other information that it stores or defines.
 
  • #11
2,118
179
It seems that when I call ord(), the result is an integer rather than a string, which is clear from this:

Code:
>>> type(ord('a'))
<class 'int'>
>>>
which, it seems, I had already typed in the OP. So it's no longer a string; it's just a number. So strings and integers are stored differently, and I guess it has nothing to do with bin(), as pointed out.

Thanks for the help anyway. I don't know why, but I was confused.
 
  • #12
hmmm27
Gold Member
777
341
I'm still confused about what you're actually confused about.

I mean, you know there isn't a little gnome inside the memory chip painting itty-bitty lower-case a's and b's and zeros and ones onto the silicon, right ?

[edit: Characterizing the many things - like presence/absence of a magnetic field, punched hole in paper/cardboard, state of a transistor, etc. - as "ones" and "zeros", is purely for linguistic convenience. (Though yes, if the computer is doing a binary math operation, it's also convenient that the "1's" and "0's" can actually be used directly as 1's and 0's.)]
 
  • #13
13,030
6,916
As you start investigating how things actually work, it's easy to get confused about this. When we read core dumps, we could separate areas of code from data, and the data itself into textual, integer or floating-point, based on the bit patterns we saw.

Program opcodes had a denser, all-bits-used format, but good readers could spot common opcodes and recognize memory addresses being used.

32-bit integers were typically stored on even word boundaries and usually had their high bits zeroed, i.e. integer data usually used the low range of possible integer values. 32-bit floats had a similar layout, but the high bits were used for the value, stored as power (exponent) and mantissa, and the low bits were usually zeros. Textual ASCII data had the high bit of every byte zeroed, as ASCII codes are in the 0-127 range, with 128 to 255 unused.

Of course nowadays a dump reader would have to contend with Unicode multi-byte textual data and 16-bit, 32-bit, 64-bit ... numeric data too. And then there's the heap and the stack, and big-endian versus little-endian, which wasn't a thing for older mainframe machines.

My dump-reading experience was with GE 635 and Honeywell 6000 mainframes, which would be considered big-endian 36-bit word architectures. Folks today use debuggers, not dumps, to investigate program and data memory.
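
The patterns described above can be reproduced with the standard struct module. This is only a sketch on a modern machine: it assumes 8-bit bytes and IEEE 754 floats, not the 36-bit word formats of the mainframes mentioned in the post.

```python
import struct

as_int = struct.pack('>i', 97)       # 32-bit big-endian signed integer
as_float = struct.pack('>f', 1.0)    # 32-bit big-endian IEEE 754 float
as_text = 'abcd'.encode('ascii')     # four ASCII characters

print(as_int.hex())     # 00000061  (high bits zero, value in the low byte)
print(as_float.hex())   # 3f800000  (high bits used, low bits zero)
print(as_text.hex())    # 61626364  (every byte below 0x80)

# The same integer in little-endian byte order.
little = struct.pack('<i', 97)
print(little.hex())     # 61000000
```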
 
  • #14
2,118
179
I'm still confused about what you're actually confused about.

I mean, you know there isn't a little gnome inside the memory chip painting itty-bitty lower-case a's and b's and zeros and ones onto the silicon, right ?

[edit: Characterizing the many things - like presence/absence of a magnetic field, punched hole in paper/cardboard, state of a transistor, etc. - as "ones" and "zeros", is purely for linguistic convenience. (Though yes, if the computer is doing a binary math operation, it's also convenient that the "1's" and "0's" can actually be used directly as 1's and 0's.)]
Yes indeed. I was misinterpreting some of the things/ideas, I guess.

This kind of confusion is coming from this.

https://www.physicsforums.com/threa...of-strings-and-integers.1005423/#post-6519298

I guess I asked the wrong question, but in any case, it does not matter. I was thinking something like this.

Someone gives you a piece of paper with '0b1100110' written on it.

When you are doing an XOR operation, you take every letter of a string and turn it into binary. In that case, '0b1100110' is just the letter 'f', but in general it's just the number 102. So what a binary value represents might depend on how you are using it. When you are trying to encrypt/decrypt a text with XOR, you cannot encrypt/decrypt 102 as '0b1100110' (or ['0b1', '0b0', '0b10'], from what I have learned), but you can do that for 'f'.

Similarly, in an XOR-type operation we need to represent '2' as '0b110010', not as '0b10'.

So the bottom line is that, depending on the given situation, you can map the binary representation to an integer or a string, which leads us to the OP. But I have realized that this is just a case-dependent situation, and it has nothing to do with how you store integers or strings.
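
The two readings of the same digits can be checked directly in Python. A minimal sketch using only built-ins; the names bits, number and letter are just for illustration.

```python
bits = '0b1100110'

number = int(bits, 2)       # read the digits as a base-2 integer
letter = chr(number)        # read the same value as a character code
print(number, letter)       # 102 f

# The character '2' and the integer 2 have different bit patterns.
char_two = bin(ord('2'))
int_two = bin(int('2'))
print(char_two)             # 0b110010
print(int_two)              # 0b10
```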
 
  • #15
35,225
7,044
Is this means that the string 'a' and the integer 97 stored as the same binary in the memory ? If so then how can python tell the difference ? As far as I can remember in C we have to define the type of the variable before we use it.
If Python works anything like C or C++, 'a' is not a string, but rather a character constant or literal. In C and C++ there's a big difference between the character constant 'a' and the string literal "a" -- the former evaluates to the character whose ASCII code is 97 (i.e., 'a'), and the latter evaluates to the address in memory where the character is stored.

Thus you can simply cast 'a' (edit: or "\0a" anyway, I think) to integer and get 97.
No, 'a' and "\0a" are very different things. As already mentioned, 'a' is a character constant. "\0a" would be a null string, since its first character is the null byte. I don't think that the character 'a' would even be stored.
 
  • #16
pbuk
Science Advisor
Gold Member
2,446
1,186
If Python works anything like C or C++
It doesn't, the internals of Python are very different from C or C++. @Ibix quoted the key statement above:
everything in python is an object
You can see this in action in the REPL by using sys.getsizeof(), which returns the size in bytes of an object's internal representation:
Python 3.9 REPL, 64 bit system:
import sys
sys.getsizeof(1) # 28 (bytes)
sys.getsizeof('a') # 50 (bytes)
sys.getsizeof('ab') # 51 (bytes)
How is it that a single character takes 50 bytes? As well as the ASCII (actually Latin-1) value, which takes 1 byte, Python also stores a pointer to the string class (8 bytes on a 64-bit system), the size of the string in its fixed-length internal representation and its length (the number of Unicode characters in the string), information about how the string is encoded (only 1 byte, I think), etc. The full detail is contained in PEP 393.
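
To separate the fixed overhead from the per-character cost, one can compare sizes rather than read them in isolation. This is a sketch that assumes CPython (sys.getsizeof() is implementation-specific and the exact numbers vary between versions and platforms), so no particular output is claimed here.

```python
import sys

empty = sys.getsizeof('')
one = sys.getsizeof('a')
two = sys.getsizeof('ab')

print("size of an empty str:", empty)
print("extra bytes per additional Latin-1 character:", two - one)

# Characters outside Latin-1 need a wider per-character representation.
wide_one = sys.getsizeof('\u20ac')
wide_two = sys.getsizeof('\u20ac\u20ac')
print("extra bytes per additional U+20AC character:", wide_two - wide_one)
```

Most of the size of a short string is the object header; the characters themselves add only a little on top.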
 
  • #17
pbuk
Science Advisor
Gold Member
2,446
1,186
"\0a" would be a null string, since its first character is the null byte. I don't think that the character 'a' would even be stored.
No, '\0' (ASCII NULL) does not act as a string terminator in Python: '\0a' has a length of 2 characters.
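
This is easy to confirm in the REPL. A short sketch; the variable names are just for illustration.

```python
s = '\0a'

length = len(s)
codes = [ord(c) for c in s]
raw = s.encode('ascii')

print(length)   # 2
print(codes)    # [0, 97]
print(raw)      # b'\x00a'
```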
 
  • #18
hmmm27
Gold Member
777
341
Yes indeed. I was misinterpreting some of the things/ideas, I guess.

This kind of confusion is coming from this.

https://www.physicsforums.com/threa...of-strings-and-integers.1005423/#post-6519298

I guess I asked the wrong question, but in any case, it does not matter. I was thinking something like this.

Someone gives you a piece of paper with '0b1100110' written on it.

When you are doing an XOR operation, you take every letter of a string and turn it into binary. In that case, '0b1100110' is just the letter 'f', but in general it's just the number 102. So what a binary value represents might depend on how you are using it. When you are trying to encrypt/decrypt a text with XOR, you cannot encrypt/decrypt 102 as '0b1100110' (or ['0b1', '0b0', '0b10'], from what I have learned), but you can do that for 'f'.

Similarly, in an XOR-type operation we need to represent '2' as '0b110010', not as '0b10'.

So the bottom line is that, depending on the given situation, you can map the binary representation to an integer or a string, which leads us to the OP. But I have realized that this is just a case-dependent situation, and it has nothing to do with how you store integers or strings.
Great : so, are you at the point yet where you can see that basically all you're doing is mathematically subtracting the original-message : eg: "The quick brown fox jumped over the lazy dog." from the (shared) cipher-page : eg: "Wally's World : a great place for a summer vacation" to get the encrypted-message : <garbage, mostly : don't try to print or display> ; then to unencrypt just mathematically add the cipher page to the encrypted-message to get the original message.

[edit: which makes it not an "xor encryption" feel free to ignore ; usual royalties apply, otherwise :wink: ]
 
  • #19
2,118
179
Great : so, are you at the point yet where you can see that basically all you're doing is mathematically subtracting the original-message : eg: "The quick brown fox jumped over the lazy dog." from the (shared) cipher-page : eg: "Wally's World : a great place for a summer vacation" to get the encrypted-message : <garbage, mostly : don't try to print or display> ; then to unencrypt just add the cipher page to the encrypted-message to get the original message.
I did not understand what you mean, but I am just trying to implement XOR encryption.
 
  • #20
hmmm27
Gold Member
777
341
I did not understand what you mean, but I am just trying to implement XOR encryption.
Yeah ; see what happens when you let your brain wander around off leash ? I just "invented" a "new" encryption method. (I should probably check in with a crypto forum to see what it's called)

[edit : It's a Vigenère cipher... first published in 1533.]

Meanwhile, here's an online XOR en/decryptor (the explanatory paragraph looks useful) and there's how Python does XOR's.
 
  • #22
PeterDonis
Mentor
Insights Author
2020 Award
35,266
13,460
I don't think that bin() tells you how the Python interpreter stores the variable internally.
That's correct, it doesn't. Note that it doesn't even work on objects that aren't integers (try calling bin() on a one-character string, for example).
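
Trying the suggestion, as a sketch under CPython 3 (the exact wording of the error message is not guaranteed, so it is printed rather than quoted here):

```python
raised = False
try:
    bin('a')                 # a str is not an integer
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# Converting the character to its code point first works fine.
converted = bin(ord('a'))
print(converted)             # 0b1100001
```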
 
  • #23
PeterDonis
Mentor
Insights Author
2020 Award
35,266
13,460
When you are doing an XOR operation, we take every letter as a string and turn them into binary.
What does "turn them into binary" mean? (What data type do you want this "turn them into binary" operation to produce?)
 
  • #24
2,118
179
What does "turn them into binary" mean? (What data type do you want this "turn them into binary" operation to produce?)

So for any given text message (it must be given as a string) and for a given key (it also must be a string), I am encrypting the message.

I will not use a binary key in my implementation since, for unbreakable XOR encryption, you need to generate a random binary key with a size of len(message)*8 bits. So for a ten-character password, you need to store 80 bits. Instead, I am taking the key as a string, so that each character takes one byte. It seems more reasonable if you want to store a key.

In this case, I am just turning the text message and the key into binary arrays, performing the XOR operation, and then turning the encrypted binary message back into an encrypted message, which is just a string.

The only problem here is that sometimes the encrypted message takes on really strange and, I guess, unprintable values.

I wanted to share the code, but the console is giving an error.
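
The scheme described in this post might be sketched with Python's bytes type, which sidesteps the unprintable-string problem. This is not the poster's code (which was never shared); the function name xor_bytes, the sample key, and the choice of UTF-8 and a repeating key are all assumptions made for illustration. A repeating short key is not the unbreakable variant mentioned above.

```python
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the key, repeating the key as needed."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

message = "The quick brown fox"
key = "secret"               # hypothetical key, for illustration only

encrypted = xor_bytes(message.encode("utf-8"), key.encode("utf-8"))
print(encrypted)             # bytes repr; unprintable values appear as escapes
print(encrypted.hex())       # a printable form that is safe to store or display

# XOR with the same key a second time restores the original bytes.
decrypted = xor_bytes(encrypted, key.encode("utf-8")).decode("utf-8")
print(decrypted)             # The quick brown fox
```

Keeping the ciphertext as bytes (or its hex form) avoids having to print arbitrary byte values as text.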
 
  • #25
PeterDonis
Mentor
Insights Author
2020 Award
35,266
13,460
@Arman777, none of what you posted answers the questions I asked. "Binary" is a meaningless term as far as Python is concerned; it's not a data type. What data type do you want these "binary" thingies to be?
 
