
Number of bits it takes to represent a number

  1. Mar 13, 2016 #1
    Is this accurate?

    $$x = \text{some number}$$

    $$\text{bits}(x) = \frac{\log(x)}{\log(2)}$$
     
  3. Mar 13, 2016 #2

    mfb


    Staff: Mentor

    If you have some way to signal "the number is done", and send every integer in binary: sure (up to rounding).
     
  4. Mar 13, 2016 #3

    Mark44

    Staff: Mentor

    No, not quite. The fraction you have on the right is the same as ##\log_2(x)##.

    As to why your formula isn't correct, consider x = 4, and that ##\log_2(4) = 2##. It takes 3 bits (##100_2##) to represent 4.
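    For anyone who wants to check this numerically, here is a small Python sketch (not part of the original post) comparing the OP's formula with the actual count of binary digits, which Python exposes as int.bit_length():

    ```python
    import math

    # Compare the OP's log(x)/log(2) with the actual count of binary digits.
    # int.bit_length() returns floor(log2(x)) + 1 for any positive integer x.
    for x in [1, 3, 4, 7, 8, 100]:
        formula = math.log(x) / math.log(2)   # the OP's expression
        actual = x.bit_length()               # real number of binary digits
        print(f"x = {x:4d}   log2(x) = {formula:6.3f}   binary digits = {actual}")
    ```

    For x = 4 the formula gives exactly 2, while three digits (100) are needed, which is the discrepancy described above.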
     
  5. Mar 13, 2016 #4

    mfb


    Staff: Mentor

    If you want to represent positive integers only, you can drop the leading 1. It will be there for every number anyway.
     
  6. Mar 13, 2016 #5

    Mark44

    Staff: Mentor

    I don't understand what you're saying. I was only considering positive integers. The binary representation of 4 as an unsigned number is ##100_2##. Are you interpreting the 1 digit to mean the number is negative?

    The OP's formula doesn't give the right result for this and many other numbers.
     
  7. Mar 13, 2016 #6

    mfb


    Staff: Mentor

    The binary representation is 100. Who said we are limited to the binary representation? We can save bits with the following scheme, assuming we know where digits end:

    1: ""
    2: "0"
    3: "1"
    4: "00"
    5: "01"
    6: "10"
    and so on.
    The binary representation for every positive integer starts with a 1. If you are interested in saving bits, as the title suggests, you do not have to store this 1. Floating point numbers do exactly this: they just do not store the 1 because it would be fully redundant.
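    A minimal Python sketch of this scheme (just an illustration, and it assumes the length of the stored string is known from somewhere else):

    ```python
    def encode(n: int) -> str:
        """Binary representation of a positive integer with the leading 1 dropped."""
        assert n >= 1
        return bin(n)[3:]            # bin(n) looks like '0b1...'; drop the '0b1'

    def decode(bits: str) -> int:
        """Re-attach the implicit leading 1 and convert back."""
        return int('1' + bits, 2)

    for n in range(1, 7):
        code = encode(n)
        print(n, repr(code), decode(code))
    # 1 ''    1
    # 2 '0'   2
    # 3 '1'   3
    # 4 '00'  4
    # 5 '01'  5
    # 6 '10'  6
    ```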
     
  8. Mar 13, 2016 #7

    Mark44

    Staff: Mentor

    The trouble is, negative integers are stored with a 1 digit in the most significant bit (MSB).
    The title of the thread is "Number of bits it takes to represent a number". Since "bits" is short for "binary digits," it's reasonable to assume that we're talking about a binary representation. It takes three bits to represent the decimal number 4. If you take advantage of a scheme that stores only two bits, it still takes three bits to represent the number.

    Some floating point formats do this. The ones that don't include the older x87 Extended Precision format, which stores the leading bit explicitly.
    The Borland C/C++ compiler back in the early 90's had an 80-bit long double type based on this format. The Microsoft compilers never did, to the best of my knowledge.
     
  9. Mar 13, 2016 #8
    If you want to represent integer numbers between 0 and x (inclusive) the correct equation is ##\lceil \log_2(x+1) \rceil##.
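    A quick Python check of this formula (sketch only), side by side with the bit length of x itself:

    ```python
    import math

    # Bits needed to distinguish every integer in {0, 1, ..., x}: ceil(log2(x + 1)).
    for x in [1, 3, 4, 7, 8, 255, 256]:
        bits_for_range = math.ceil(math.log2(x + 1))
        print(f"0..{x}: {bits_for_range} bits   (x.bit_length() = {x.bit_length()})")
    ```

    For very large x, floating-point rounding in log2 can give the wrong ceiling; in practice x.bit_length() avoids that issue.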
     
  10. Mar 13, 2016 #9

    mfb


    Staff: Mentor

    I don't think so, and even if we did, storing them does not have to happen in the same way we would write those numbers down on paper.
    Well, most do. A few do not.
     
  11. Mar 13, 2016 #10

    Mark44

    Staff: Mentor

    Of course, but this thread is about how they are represented, not how they can be stored. Whether one bit is implied or not, it takes three bits to represent 4, so the OP's formula as stated doesn't give the right result.
     
  12. Mar 14, 2016 #11
    Represent to a human or to a computer?
     
  13. Mar 14, 2016 #12

    Mark44

    Staff: Mentor

    Both.
     
  14. Mar 14, 2016 #13

    mfb


    Staff: Mentor

    I can represent 4 as "00", and no one can stop me.
    "Number of bits it takes" implies that we want to minimize the number we need. Using binary representation would need one bit more than necessary.
     
  15. Mar 14, 2016 #14

    Mark44

    Staff: Mentor

    I think that you are arguing just to be arguing. Given that this thread is in the Programming and Computer Science section, can you give us an example of one computer system or programming language or standard in which 4 is represented as "00"? I'll stipulate that the IEEE 754 standard for single-precision and double-precision floating point numbers does use an implied bit that is not stored, but can you show a similar standard for integer values?
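    To make the implied-bit point concrete, here is a small Python sketch (an illustration, not taken from the standard) that unpacks the IEEE 754 double representation of 4.0 and shows that the stored fraction field does not contain the leading 1:

    ```python
    import struct

    x = 4.0
    (raw,) = struct.unpack('>Q', struct.pack('>d', x))   # the 64 stored bits
    sign     = raw >> 63
    exponent = (raw >> 52) & 0x7FF                       # biased exponent
    fraction = raw & ((1 << 52) - 1)                     # stored mantissa bits

    # 4.0 = 1.0 * 2**2, so the unbiased exponent is 2 and the fraction is all
    # zeros: the leading 1 of the significand is implied, not stored.
    print(f"sign = {sign}, exponent = {exponent - 1023}, fraction = {fraction:052b}")
    ```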
     
  16. Mar 14, 2016 #15

    mfb


    Staff: Mentor

    I'm not aware of any computer system which stores 4 as "100" either. 00000100 and similar - sure, and you cannot drop a leading 1 here because there is none. But a storage format that uses exactly the number of bits it needs (plus one)?
     
  17. Mar 14, 2016 #16

    Mark44

    Staff: Mentor

    I wrote ##100_2## as an abbreviation for the full byte representation ##00000100_2##.
     
  18. Mar 14, 2016 #17

    mfb


    Staff: Mentor

    But then we use 8 bits, not 3.
     
  19. Mar 14, 2016 #18

    jbriggs444

    Science Advisor

    Data compression. A 4 can be represented and stored in lots of ways. Those ways can depend on context. If I have a long string of 4's, I can represent those 4's pretty cheaply.
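    A minimal run-length sketch of this idea (the function name is just illustrative, not from any particular library):

    ```python
    def run_length_encode(values):
        """Collapse consecutive repeats into [value, count] pairs."""
        runs = []
        for v in values:
            if runs and runs[-1][0] == v:
                runs[-1][1] += 1
            else:
                runs.append([v, 1])
        return runs

    data = [4] * 1000
    print(run_length_encode(data))   # [[4, 1000]] -- two numbers instead of a thousand
    ```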
     
  20. Mar 14, 2016 #19

    Ibix

    Science Advisor

    In short, OP, if you're still reading, the answer has some surprising complexities. In a straightforward binary representation of a positive integer x, the most significant bit sits at position ##\lfloor\log_2 x\rfloor## (counting from zero), so the number of binary digits is ##\lfloor\log_2 x\rfloor + 1 = \lceil\log_2(x+1)\rceil##, which is not quite what you wrote. However, that is not the number of bits needed to store it (except in certain special cases).
    • At least one more bit is needed if you want to handle negative numbers as well (and the answer is complicated for negative integers)
    • Some way to know when the number ends is also needed, which will typically add to the storage - either through wasted space in general, having to have a "number of bits" auxiliary variable, or having to reserve a particular sequence as a terminator.
    • Much more complex representations are possible, as jbriggs444 and mfb point out.
    The right answer depends on why you want to know; the sketch below compares two of the conventions mentioned.
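    As a rough illustration of the first point, here is a Python sketch (assumed helper functions, not a standard API) comparing the plain binary width of a positive integer with the smallest two's-complement width that also covers negative values:

    ```python
    def bits_unsigned(x: int) -> int:
        """Digits in the plain binary representation of a positive integer."""
        return x.bit_length()                    # floor(log2(x)) + 1 for x >= 1

    def bits_twos_complement(x: int) -> int:
        """Smallest two's-complement width that can hold x (sign bit included)."""
        if x >= 0:
            return x.bit_length() + 1            # one extra bit for the sign
        return (-x - 1).bit_length() + 1         # e.g. -4 fits in 3 bits: 100

    for x in [4, 7, -4, -5]:
        label = f"unsigned {bits_unsigned(x)}" if x > 0 else "unsigned n/a"
        print(f"{x:3d}: {label}, two's complement {bits_twos_complement(x)}")
    ```

    Neither count includes the extra cost of marking where the number ends, which is the second bullet above.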
     
  21. Mar 14, 2016 #20
    The point most people participating in the thread should remember is that Shannon's information theory does not consider the value of the largest number you want to represent, only the number of distinct symbols you want to distinguish (see the sketch below).

    Thus
    • a single known number x is just the one-element set {x}, thus ##\log_2(1) = 0## bits
    • {1, 2, ..., x} uses ##\log_2(x)## bits
    • {0, 1, ..., x} uses ##\log_2(x+1)## bits
    • and many other possibilities
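    A small Python illustration of that point (sketch only): any set of eight symbols can be indexed with three bits, no matter how large the individual values are.

    ```python
    import math

    sets = [
        {42},                              # one known value: log2(1) = 0 bits
        set(range(1, 9)),                  # {1, ..., 8}
        {10**100 + k for k in range(8)},   # eight enormous values -- still 3 bits
    ]
    for s in sets:
        print(f"{len(s)} symbols -> {math.log2(len(s)):.1f} bits to pick one")
    ```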
     