# Why Is There No SI Unit For Information?


#### Nim

Why is the quantity "information" missing from the SI system?

Also, if they did add it, do you think it would be added as a base unit or dimensionless derived unit?

I think the unit of information is the Bit.

Why is the quantity "information" missing from the SI system?
I don't think it is "missing"; all scientifically pertinent information is standardized as SI values.

I think the unit of information is the Bit.
Information is a dimensionless quantity: the negative log of the probability of occurrence. The question of units comes down to the question of what base to use when taking the log. Euler's number is a good choice. Base 2 (i.e. the bit) is another viable choice.
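
As a rough sketch of what "choosing the base" means in practice, the same event can be expressed in bits, nats, or hartleys just by changing the base of the log. The function name `self_information` is an illustrative assumption, not part of any standard library:

```python
import math

def self_information(p, base=2.0):
    """Information content of an event with probability p: log_base(1/p)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return math.log(1.0 / p) / math.log(base)

p = 0.1  # an event that happens one time in ten

bits = self_information(p, 2)           # base 2  -> bits (shannons)
nats = self_information(p, math.e)      # base e  -> nats
hartleys = self_information(p, 10)      # base 10 -> hartleys (bans, dits)

print(f"{bits:.4f} bit = {nats:.4f} nat = {hartleys:.4f} hartley")
```

The three numbers describe the same amount of information; they differ only by the constant factors ln 2 and ln 10, much as a length in meters and the same length in feet differ by a constant factor.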

I think the unit of information is the Bit.
That's only a binary digit. There's nothing fundamental about binary coding of information. Standard computer memory capacity can be described in numbers of bits and the bit is useful as a comparison unit but the possible information content is way beyond the memory capacity.
This wiki link will give you some background to Information Theory. Interestingly, the original work on Information theory did not use 'bits' because binary computers were in their infancy. Shannon started off using Morse Code as a model.

This wiki link will give you some background to Information Theory. Interestingly, the original work on Information theory did not use 'bits' because binary computers were in their infancy. Shannon started off using Morse Code as a model.

That article links to another on the "hartley" as a unit of information, relevant within the domain of information theory & application of that theory: https://en.m.wikipedia.org/wiki/Hartley_(unit)

To quote from the article:

The hartley (symbol Hart), also called a ban, or a dit (short for decimal digit), is a logarithmic unit which measures information or entropy, based on base 10 logarithms and powers of 10, rather than the powers of 2 and base 2 logarithms which define the bit, or shannon. One ban or hartley is the information content of an event if the probability of that event occurring is 1/10.[1] It is therefore equal to the information contained in one decimal digit (or dit), assuming a priori equiprobability of each possible value ...

Though not an SI unit, the hartley is part of the International System of Quantities, defined by International Standard IEC 80000-13 of the International Electrotechnical Commission. It is named after Ralph Hartley.
My math knowledge approaches the limit 0, but judging by the mention of certain terms, the above sounds related to what @jbriggs444 has suggested - yes? (EDIT - see @anorlunda's comment, below.) And note also the reference in the first paragraph to the bit, or Shannon, already mentioned.

More generally, getting back to the OP's question, surely a unit implies a context; and so a question for @Nim might be, did you have in mind any particular context (e.g. something other than information theory, digital communications, etc.)?

In addition to the bit, there is the qubit as used in quantum computers. A qubit is a lot more information than a bit, but precisely how much is uncertain.

@jbriggs444's definition (which relates to the number of microstates) works in thermodynamics. But information is also related to unitarity (quantum physics), Liouville's theorem (classical physics), Shannon's information theory (as @sophiecentaur mentioned), to the rule that the sum of all possibilities in a physical system is identically one, and the reversibility of physical laws at the micro level that Leonard Susskind calls the minus first law of physics.

I have yet to hear a definition of information that applies in all contexts.

Information is closely related to (but not identical to) entropy. Wikipedia's disambiguation page for entropy links to 16 technical definitions of the word. I expect the same difficulty defining information.

Information is synonymous with knowledge in natural language and in the dictionary, but not in physics. That greatly adds to the confusion because people think knowledge when they hear information.

Information is closely related to (but not identical to) entropy.

With regard to this, I found a couple of interesting posts; the author is Tom Schneider, a cancer researcher with NIH, and the posts seem to be part of a FAQ for the news group bionet.info-theory, a forum for discussing information theory in biology.

What's interesting is that rather than entropy, apparently Shannon preferred to speak of "conditional entropy", which relates to information reducing uncertainty - a rather different concept than entropy per se. I apologize for not being able to vet this content myself due to its math & conceptual content, but I wonder if it might be of interest nonetheless:

https://schneider.ncifcrf.gov/information.is.not.uncertainty.html

https://schneider.ncifcrf.gov/bionet.info-theory.faq.html#Information.Equal.Entropy

With regard to this, I found a couple of informational posts; the author is Tom Schneider, a cancer researcher with NIH, and the posts seem to be part of a FAQ for the news group bionet.info-theory, a forum for discussing information theory in biology. Anyway what's interesting is that he contends entropy is not the proper term to use in relation to Shannon's thoughts on information, and that a better term is "uncertainty."

I read recently on PF an account about Shannon's choice of words. He didn't think of entropy, but a physicist at Bell Labs advised him to use entropy, because of the way it was defined. I apologize for not having a link to the thread or to the actual source of that conversation.

Uncertainty is a word that carries lots of other baggage.

Uncertainty is a word that carries lots of other baggage.

It certainly does. Apropos of that, I am slowly making my way through Willful Ignorance, by Herbert Weisberg, which deals in part with the emergence of classical and modern probability & the shedding along the way of much older notions of the nature of uncertainty, including questions of causation, morality, etc.

Regarding Shannon choosing to use "entropy", here is a link to a short bio of engineer Myron Tribus, in which he is quoted as saying Shannon told him Von Neumann was the one who suggested this: http://www.eoht.info/m/page/Myron+Tribus . Tribus wrote up the encounter in a 1971 Sci.Am. article as follows:

What’s in a name? In the case of Shannon’s measure the naming was not accidental. In 1961 one of us (Tribus) asked Shannon what he had thought about when he had finally confirmed his famous measure. Shannon replied: “My greatest concern was what to call it. I thought of calling it ‘information’, but the word was overly used, so I decided to call it ‘uncertainty’. When I discussed it with John von Neumann, he had a better idea. Von Neumann told me, ‘You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name. In the second place, and more importantly, no one knows what entropy really is, so in a debate you will always have the advantage.’”

Information is a dimensionless quantity: the negative log of the probability of occurrence. The question of units comes down to the question of what base to use when taking the log. Euler's number is a good choice. Base 2 (i.e. the bit) is another viable choice.

I have seen the word "nit" for the natural log one.

More generally, getting back to the OP's question, surely a unit implies a context; and so a question for @Nim might be, did you have in mind any particular context (e.g. something other than information theory, digital communications, etc.)?

I was mainly thinking about data storage and transfer rates, these are common categories in unit converters. I thought it was strange that there was no official SI unit of measurement for it since it's something so often measured.

Also, I am adding dimensional analysis to a calculator I am working on, which can already add and convert units among many other things (e.g. "meter + 2 feet" = "1.609599 meter"). I've found that the 7 SI dimensions are inadequate for this purpose, so I have added 4 more: money, angle, solid angle, and information. I refer to these 11 values as the "fingerprint" programmatically, but it's used in the same way as dimensions.
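
A minimal sketch of how such an 11-slot fingerprint could work, assuming it is stored as a tuple of exponents. The names `DIMENSIONS`, `fingerprint`, and `Quantity`, and the order of the slots, are hypothetical and not taken from the calculator described above:

```python
from dataclasses import dataclass

# The 7 SI base dimensions plus the 4 extras named in the post.
# The slot order is an arbitrary assumption.
DIMENSIONS = (
    "length", "mass", "time", "current", "temperature",
    "amount", "luminous_intensity",
    "money", "angle", "solid_angle", "information",
)

def fingerprint(**exponents):
    """Build an exponent tuple, e.g. fingerprint(information=1, time=-1)."""
    unknown = set(exponents) - set(DIMENSIONS)
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    return tuple(exponents.get(name, 0) for name in DIMENSIONS)

@dataclass(frozen=True)
class Quantity:
    value: float   # magnitude in the reference unit (meter, second, bit, ...)
    dims: tuple    # the 11-slot fingerprint

    def scaled(self, factor):
        """Multiply by a plain number; the fingerprint is unchanged."""
        return Quantity(self.value * factor, self.dims)

    def __add__(self, other):
        # Addition only makes sense when the fingerprints match.
        if self.dims != other.dims:
            raise TypeError("cannot add quantities with different fingerprints")
        return Quantity(self.value + other.value, self.dims)

    def __truediv__(self, other):
        # Division subtracts exponents slot by slot.
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)

METER = Quantity(1.0, fingerprint(length=1))
FOOT = Quantity(0.3048, fingerprint(length=1))
SECOND = Quantity(1.0, fingerprint(time=1))
BIT = Quantity(1.0, fingerprint(information=1))

total_length = METER + FOOT.scaled(2)       # "meter + 2 feet", in meters
transfer_rate = BIT.scaled(8e6) / SECOND    # 8 Mbit/s as information / time
```

With this layout a transfer rate gets the fingerprint information^1 time^-1, and an attempt such as `METER + BIT` is rejected because the fingerprints differ.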

I was mainly thinking about data storage and transfer rates, these are common categories in unit converters. I thought it was strange that there was no official SI unit of measurement for it since it's something so often measured.

Also, I am adding dimensional analysis to a calculator I am working on, which can already add and convert units among many other things (e.g. "meter + 2 feet" = "1.609599 meter"). I've found that the 7 SI dimensions are inadequate for this purpose, so I have added 4 more: money, angle, solid angle, and information. I refer to these 11 values as the "fingerprint" programmatically, but it's used in the same way as dimensions.

In that case, I think the most common usage "bit" would be best.

That still leaves lots of uncertainty in use. Suppose I store a 140-character message on a 100 GB disc. How much information does it hold, 140 B or 100 GB? Reasonable people will disagree on the meaning.

Species diversity (https://en.wikipedia.org/wiki/Species_diversity) in a system is not simple to define because of the breadth of ecosystems. But researchers use a representation of the system - see the section in the link about Diversity Indices: Shannon, Simpson, Gini-Simpson.

The Shannon index measures information in a dataset usually read streamwise. Sampling for this is done from randomly placed transects in the area of study.
You start at one end of the transect, walk along, record what you found.

The Shannon index can be viewed as using bits, in the sense that the sampling process is binary: zero or one. I found tree species X; the next tree in the sample is either the same species X (bit set off) or not the same species X (bit set on). It simply cares about comparing what you have in hand to what you just had in hand a moment ago. Since H (the Shannon index) is a measure of information, it can be said to use bits.

IMO, having some kind of SI unit for information is not necessary, because in the rule above the "bits" (tree species) lose their identity. At what point did you actually lose information or gain new information?

Bob: What tree species did you find?
Mary: I do not know from the Shannon Index, but it was 0.933!

So in the process of codifying you actually lose other kinds of information. The process is not reversible, which is completely acceptable as a result of the definition of the indices.
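
To illustrate that loss of identity, here is a small sketch using the usual textbook form of the index, H = -sum(p_i * log(p_i)) over the species proportions p_i. The function name and the two transects are made up for illustration:

```python
import math
from collections import Counter

def shannon_index(observations, base=math.e):
    """Shannon index of a sample: -sum(p_i * log(p_i)) over species proportions."""
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total, base) for c in counts.values())

# Two hypothetical transects with completely different species
# but the same relative abundances (2 : 1 : 1).
transect_a = ["oak", "oak", "pine", "birch"]
transect_b = ["elm", "elm", "ash", "fir"]

h_a = shannon_index(transect_a, base=2)   # in bits
h_b = shannon_index(transect_b, base=2)   # in bits

print(h_a, h_b)   # same index, different forests
```

Both samples give the same index, so the number alone cannot tell Bob which species Mary found. Passing `base=2` reports the index in bits; the default natural log reports it in nats.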

It seems to me that we can have a unit for information despite not having an agreed-upon definition for "information". To take @anorlunda's example, we can have a 100 GB (800,000,000,000 bits) disk containing a 140 B (1120 bits) message which is one of two equiprobable messages that could have been stored (1 bit) without ever disagreeing about the units of measurement.

Edit: forgot that disk Gig's are decimal
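
The three figures in that example are measures of different things that share the same unit. A quick sketch of the arithmetic, assuming decimal gigabytes and 8-bit bytes (the variable names are only illustrative):

```python
import math

BITS_PER_BYTE = 8

# Storage capacity of the medium: 100 GB with decimal gigabytes.
disk_capacity_bits = 100 * 10**9 * BITS_PER_BYTE

# Size of the encoded message: 140 one-byte characters.
message_size_bits = 140 * BITS_PER_BYTE

# Shannon information of the message if it is one of two equiprobable messages.
possible_messages = 2
message_information_bits = math.log2(possible_messages)

print(disk_capacity_bits, message_size_bits, message_information_bits)
```

All three results are in bits, yet they answer three different questions: how much the disk could hold, how much space the message occupies, and how much the receiver learns.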

It seems to me that we can have a unit for information despite not having an agreed-upon definition for "information". To take @anorlunda's example, we can have a 100 GB (800,000,000,000 bits) disk containing a 140 B (1120 bits) message which is one of two equiprobable messages that could have been stored (1 bit) without ever disagreeing about the units of measurement.

Edit: forgot that disk Gig's are decimal
It's not really as simple as that because suitable Coding can increase greatly the amount of information that can be stored in a given number of bits. 'Lossless Coding' of sound is a good example of how the total number of bits can be reduced without the quality suffering. (I know MPEG coding destroys information but that's not what I am talking about)
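
As a small illustration of lossless coding (using Python's standard `zlib` module on a made-up repetitive byte string rather than real audio), the number of stored bits drops while the original is recovered exactly:

```python
import zlib

# A highly repetitive 3000-byte "signal"; real audio would compress far less.
original = b"la " * 1000

compressed = zlib.compress(original, 9)   # 9 = maximum compression level
restored = zlib.decompress(compressed)

print(len(original) * 8, "bits before coding")
print(len(compressed) * 8, "bits after coding")
print("identical after decoding:", restored == original)
```

How much is saved depends entirely on how predictable the data is, which is why a raw bit count of the storage says little about the information it holds.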

It's not really as simple as that because suitable Coding can increase greatly the amount of information that can be stored in a given number of bits. 'Lossless Coding' of sound is a good example of how the total number of bits can be reduced without the quality suffering. (I know MPEG coding destroys information but that's not what I am talking about)
Indeed, not simple. The distinction I am trying to make is between a unit of information (which I think we can agree upon) and a quantity of information in various situations (which has room for different interpretations).

As a [poor] analogy, we can agree on the meter as a unit of measurement even if we cannot agree on the height in meters to which the atmosphere extends.

a unit of information
The bit / byte / MB / TB are what we use to describe the basic hardware capacity. And a good big'n will usually beat a good little'un. But that's not good enough even to discuss how much music you can get on your iPhone. A moderately good big'n will not beat a very good little'n that was designed in the last couple of years.

That's only a binary digit. There's nothing fundamental about binary coding of information. Standard computer memory capacity can be described in numbers of bits and the bit is useful as a comparison unit but the possible information content is way beyond the memory capacity.
This wiki link will give you some background to Information Theory. Interestingly, the original work on Information theory did not use 'bits' because binary computers were in their infancy. Shannon started off using Morse Code as a model.
I agree that one can have a digit carrying more information than a bit, such as a decimal digit, but the smallest amount of information is surely just 1 or 0. I also agree that a single bit can carry more information by means of its actual amplitude, but if it is only just discernible then surely it carries the minimum possible information.
Morse code seems to be a 3 level code - dot, dash and spacer.

I agree that one can have a digit carrying more information than a bit, such as a decimal digit, but the smallest amount of information is surely just 1 or 0. I also agree that a single bit can carry more information by means of its actual amplitude, but if it is only just discernible then surely it carries the minimum possible information.
Morse code seems to be a 3 level code - dot, dash and spacer.
Channel capacity can be less than one bit per symbol. http://math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf

One of the keystones of Shannon's theory is the distinction between the information content itself, and its representation. Even though you might be sending a signal as a sequence of 0s and 1s, that does not mean each digit's information content is also 1 bit. As an extreme example, consider a channel that only ever sends 0s. Since the probability of a 0 is 1.0, its information content according to -log(p) is actually 0, and that intuitively makes sense as well. On the receiving end you learn nothing of value when all you get is 0s. Conversely, imagine suddenly a stray 1 appears; that clearly is an event of note! And thus it carries information according to the same formula.

Interestingly, the maximum information is transmitted when all values have the same probability. In other words, when the signal is white noise. That's actually one of the side effects of compression: you are transforming the signal to be as close to white noise as possible.
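
A short sketch of the average information per symbol, using the standard entropy sum H = -sum(p * log2(p)); the function name `entropy_bits` is just for illustration:

```python
import math

def entropy_bits(probabilities):
    """Average information per symbol in bits: -sum(p * log2(p)), skipping p = 0."""
    return sum(-p * math.log2(p) for p in probabilities if p > 0)

always_zero = entropy_bits([1.0, 0.0])   # a source that only ever sends 0s
biased = entropy_bits([0.9, 0.1])        # mostly 0s with the occasional 1
fair = entropy_bits([0.5, 0.5])          # 0 and 1 equally likely

print(always_zero, biased, fair)
```

The source that only sends 0s carries no information, the biased source carries less than one bit per binary digit, and only the equiprobable source reaches the full one bit per digit.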

I also agree that a single bit can carry more information by means of its actual amplitude,
That would not be a 'Binary digIT', though, would it? Binary means just two possible states. You are referring to the height of an analogue pulse or one with more than two discrete states. My point is that n multiple bits can, between them, carry much more than n bits of information. They can have 2^n states. Any multiple arrangement of digital (binary, ternary, quaternary etc) can be combined to carry much more information (Factorially big).
So a unit for Information is a bit like how long is a piece of string. It has to be tailored to the application, I think - as the Baud was used for old forms of digital comms.