
I Why Is There No SI Unit For Information?

  1. Jul 13, 2017 #1

    Nim

    User Avatar

    Why is the quantity "information" missing from the SI system?

    Also, if they did add it, do you think it would be added as a base unit or dimensionless derived unit?
     
  3. Jul 13, 2017 #2

    tech99

    User Avatar
    Gold Member

    I think the unit of information is the Bit.
     
  4. Jul 13, 2017 #3
    I don't think it is "missing"; all scientifically pertinent information is already standardized in terms of SI values.
     
  5. Jul 13, 2017 #4

    jbriggs444

    User Avatar
    Science Advisor

    Information is a dimensionless quantity: the negative log of the probability of occurrence. The question of units comes down to the question of what base to use when taking the log. Euler's number is a good choice. Base 2 (i.e. the bit) is another viable choice.
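    As a rough sketch of that idea (illustrative only, not part of any standard), the same self-information can be expressed in bits, nats, or hartleys simply by changing the base of the logarithm:

    Code:
    import math

    def self_information(p, base=2):
        # Self-information -log_base(p) of an event with probability p.
        return -math.log(p) / math.log(base)

    p = 1 / 8
    print(self_information(p, base=2))       # 3.0 bits
    print(self_information(p, base=math.e))  # ~2.079 nats
    print(self_information(p, base=10))      # ~0.903 hartleys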
     
  6. Jul 13, 2017 #5

    sophiecentaur

    User Avatar
    Science Advisor
    Gold Member

    That's only a binary digit. There's nothing fundamental about binary coding of information. Standard computer memory capacity can be described in numbers of bits, and the bit is useful as a comparison unit, but the possible information content goes well beyond the raw memory capacity.
    This wiki link will give you some background to Information Theory. Interestingly, the original work on Information theory did not use 'bits' because binary computers were in their infancy. Shannon started off using Morse Code as a model.
     
  7. Jul 13, 2017 #6
    That article links to another on the "hartley" as a unit of information, relevant within the domain of information theory & application of that theory: https://en.m.wikipedia.org/wiki/Hartley_(unit)

    To quote from the article:

    The hartley (symbol Hart), also called a ban, or a dit (short for decimal digit), is a logarithmic unit which measures information or entropy, based on base 10 logarithms and powers of 10, rather than the powers of 2 and base 2 logarithms which define the bit, or shannon. One ban or hartley is the information content of an event if the probability of that event occurring is 1/10.[1] It is therefore equal to the information contained in one decimal digit (or dit), assuming a priori equiprobability of each possible value . . .

    Though not an SI unit, the hartley is part of the International System of Quantities, defined by International Standard IEC 80000-13 of the International Electrotechnical Commission. It is named after Ralph Hartley.
    My math knowledge approaches the limit 0, but judging by the mention of certain terms, the above sounds related to what @jbriggs444 has suggested - yes? (EDIT - see @anorlunda's comment, below.) And note also the reference in the first paragraph to the bit, or Shannon, already mentioned.
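    As a quick illustrative calculation of my own (not from the article), the quoted definition means one hartley converts to the other units via the change-of-base rule:

    Code:
    import math

    # One hartley is the information of an event with probability 1/10.
    bits_per_hartley = math.log2(10)   # ~3.3219 bits (shannons)
    nats_per_hartley = math.log(10)    # ~2.3026 nats
    print(bits_per_hartley, nats_per_hartley)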

    More generally, getting back to the OP's question, surely a unit implies a context; and so a question for @Nim might be, did you have in mind any particular context (e.g. something other than information theory, digital communications, etc.)?
     
    Last edited: Jul 13, 2017
  8. Jul 13, 2017 #7

    anorlunda

    User Avatar
    Science Advisor
    Gold Member

    In addition to the bit, there is the qubit as used in quantum computers. A qubit is a lot more information than a bit, but precisely how much is uncertain :-p

    @jbriggs444 's definition (which relates to the number of micro states) works in thermodynamics. But information is also related to unitarity (quantum physics), Liouville's theorem (classical physics), Shannon's information theory (as @sophiecentaur mentioned), to the rule that the sum of all possibilities in a physical system is identically one, and to the reversibility of physical laws at the micro level that Leonard Susskind calls the minus first law of physics.

    I have yet to hear a definition of information that applies in all contexts.

    Information is closely related to (but not identical to) entropy. Wikipedia's disambiguation page for entropy links to 16 technical definitions of the word. I expect the same difficulty defining information.

    Information is synonymous with knowledge in natural language and in the dictionary, but not in physics. That greatly adds to the confusion because people think knowledge when they hear information.
     
  9. Jul 13, 2017 #8
    With regards to this, I found a couple of interesting posts; the author is Tom Schneider, a cancer researcher with NIH, and the posts seem to be part of a FAQ for the newsgroup bionet.info-theory, a forum for discussing information theory in biology.

    What's interesting is that rather than entropy, apparently Shannon preferred to speak of "conditional entropy", which relates to information reducing uncertainty - a rather different concept than entropy per se. I apologize for not being able to vet this content myself due to its math & conceptual content, but wonder if it might be of interest nonetheless:

    https://schneider.ncifcrf.gov/information.is.not.uncertainty.html

    https://schneider.ncifcrf.gov/bionet.info-theory.faq.html#Information.Equal.Entropy
     
    Last edited: Jul 13, 2017
  10. Jul 13, 2017 #9

    anorlunda

    User Avatar
    Science Advisor
    Gold Member

    I recently read an account on PF about Shannon's choice of words. He didn't think of "entropy" himself; a physicist at Bell Labs advised him to use it, because of the way it was defined. I apologize for not having a link to the thread or to the actual source of that conversation.

    Uncertainty is a word that carries lots of other baggage.
     
  11. Jul 13, 2017 #10
    It certainly does. Apropos of that, I am slowly making my way through Willful Ignorance, by Herbert Weisberg, which deals in part with the emergence of classical and modern probability & the shedding along the way of much older notions of the nature of uncertainty, including questions of causation, morality, etc.

    Regarding Shannon choosing to use "entropy", here is a link to a short bio of engineer Myron Tribus, in which he is quoted as saying Shannon told him Von Neumann was the one who suggested this: http://www.eoht.info/m/page/Myron+Tribus . Tribus wrote up the encounter in a 1971 Sci.Am. article as follows:

    What's in a name? In the case of Shannon's measure the naming was not accidental. In 1961 one of us (Tribus) asked Shannon what he had thought about when he had finally confirmed his famous measure. Shannon replied: "My greatest concern was what to call it. I thought of calling it 'information', but the word was overly used, so I decided to call it 'uncertainty'. When I discussed it with John von Neumann, he had a better idea. Von Neumann told me, 'You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name. In the second place, and more importantly, no one knows what entropy really is, so in a debate you will always have the advantage.'"
     
    Last edited: Jul 13, 2017
  12. Jul 17, 2017 #11
    I have seen the word "nit" used for the natural-log one.
     
  13. Aug 7, 2017 #12

    Nim

    User Avatar

    I was mainly thinking about data storage and transfer rates; these are common categories in unit converters. I thought it was strange that there was no official SI unit of measurement for it, since it's something so often measured.

    Also, I am adding dimensional analysis to a calculator I am working on, which can already add and convert units among many other things (e.g. "meter + 2 feet" = "1.609599 meter"). I've found that the 7 SI dimensions are inadequate for this purpose, so I have added 4 more: money, angle, solid angle, and information. I refer to these 11 values as the "fingerprint" programmatically, but it's used in the same way as dimensions.
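    For illustration, a minimal sketch of the idea (these names and the 11-slot layout are simplified, not the calculator's actual code):

    Code:
    # Hypothetical dimension vector: 7 SI base dimensions plus 4 extras.
    # Each unit maps to a tuple of exponents; multiplying units adds the
    # exponents, and quantities may only be added when fingerprints match.
    DIMS = ["length", "mass", "time", "current", "temperature", "amount",
            "luminosity", "money", "angle", "solid_angle", "information"]

    def fingerprint(**exponents):
        return tuple(exponents.get(d, 0) for d in DIMS)

    meter = fingerprint(length=1)
    bit = fingerprint(information=1)
    bit_rate = fingerprint(information=1, time=-1)  # e.g. bits per second

    assert meter != bit  # a length cannot be added to an information quantity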
     
    Last edited: Aug 7, 2017
  14. Aug 7, 2017 #13

    anorlunda

    User Avatar
    Science Advisor
    Gold Member

    In that case, I think the most common usage "bit" would be best.

    That still leaves lots of uncertainty in use. Suppose I store a 140 character message on a 100GB disc. How much information does it hold, 140B or 100GB? Reasonable people will disagree on the meaning.
     
  15. Aug 7, 2017 #14

    jim mcnamara

    User Avatar

    Staff: Mentor

    Species diversity (https://en.wikipedia.org/wiki/Species_diversity) in a system is a concept that is not simple to define because of the breadth of ecosystems. But researchers use a representation of the system - see the section in the link about diversity indices: Shannon, Simpson, Simpson-Gini.

    The Shannon index measures information in a dataset, usually read streamwise. Sampling for this is done from randomly placed transects in the area of study:
    you start at one end of the transect, walk along, and record what you find.

    The Shannon index can be viewed as using bits, in the sense that the sampling process is binary: zero or one. I found tree species X; the next tree in the sample is either the same species X (bit off) or not the same species X (bit on). It simply compares what you have in hand to what you had in hand a moment ago. Since H (the Shannon index) is a measure of information, it can be said to use bits.

    IMO, some kind of SI unit for information is not necessary, because in the scheme above the "bits" (tree species) lose their identity. At what point did you actually lose information or gain new information?

    Bob: What tree species did you find?
    Mary: I do not know from the Shannon Index, but it was 0.933!

    So in the process of codifying you actually lose other kinds of information. The process is not reversible, which is completely acceptable given the definition of the indices.
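    For illustration only (made-up counts, natural log, not from any real survey), an H value like Mary's 0.933 comes out of a calculation along these lines:

    Code:
    import math

    # Made-up transect counts per tree species.
    counts = {"oak": 10, "pine": 6, "birch": 4}
    total = sum(counts.values())

    # Shannon index H = -sum(p_i * ln(p_i)); using log2 instead gives bits.
    H = -sum((n / total) * math.log(n / total) for n in counts.values())
    print(round(H, 3))  # ~1.03 for these particular counts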
     
  16. Aug 7, 2017 #15

    jbriggs444

    User Avatar
    Science Advisor

    It seems to me that we can have a unit for information despite not having an agreed-upon definition for "information". To take @anorlunda's example, we can have a 100 GB (800,000,000,000 bits) disk containing a 140 B (1120 bits) message which is one of two equiprobable messages that could have been stored (1 bit) without ever disagreeing about the units of measurement.

    Edit: forgot that disk Gig's are decimal
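    For reference, the arithmetic behind those figures (decimal prefixes for the disk, 8 bits per byte):

    Code:
    disk_bits = 100 * 10**9 * 8   # 100 GB  -> 800,000,000,000 bits
    message_bits = 140 * 8        # 140 B   -> 1,120 bits
    print(disk_bits, message_bits)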
     
  17. Aug 7, 2017 #16

    sophiecentaur

    User Avatar
    Science Advisor
    Gold Member

    It's not really as simple as that, because suitable coding can greatly increase the amount of information that can be stored in a given number of bits. 'Lossless coding' of sound is a good example of how the total number of bits can be reduced without the quality suffering. (I know MPEG coding destroys information, but that's not what I am talking about.)
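    A quick sketch of that point using generic text compression rather than sound (zlib here merely stands in for a proper lossless audio codec):

    Code:
    import zlib

    # A highly redundant message compresses to far fewer bits, yet is
    # recovered exactly - no information is lost, only storage.
    message = b"abcabcabc" * 1000
    packed = zlib.compress(message)
    print(len(message) * 8, "bits raw,", len(packed) * 8, "bits compressed")
    assert zlib.decompress(packed) == message  # lossless: identical bytes back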
     
  18. Aug 7, 2017 #17

    jbriggs444

    User Avatar
    Science Advisor

    Indeed, not simple. The distinction I am trying to make is between a unit of information (which I think we can agree upon) and a quantity of information in various situations (which has room for different interpretations).

    As a [poor] analogy, we can agree on the meter as a unit of measurement even if we cannot agree on the height in meters to which the atmosphere extends.
     
  19. Aug 7, 2017 #18

    sophiecentaur

    User Avatar
    Science Advisor
    Gold Member

    The bit / byte / MB / TB are what we use to describe the basic hardware capacity. And a good big'un will usually beat a good little'un. But that's not good enough even to discuss how much music you can get on your iPhone. A moderately good big'un will not beat a very good little'un that was designed in the last couple of years.
     
  20. Aug 7, 2017 #19

    tech99

    User Avatar
    Gold Member

    I agree that one can have a digit carrying more information than a bit, such as a decimal digit, but the smallest amount of information is surely just 1 or 0. I also agree that a single bit can carry more information by means of its actual amplitude, but if it is only just discernible then surely it carries the minimum possible information.
    Morse code seems to be a 3-level code - dot, dash and spacer.
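    Assuming equiprobable symbols (a simplification), the information per symbol for alphabets of different sizes works out as:

    Code:
    import math

    # Bits per symbol for equiprobable alphabets of various sizes.
    for name, size in [("binary digit", 2),
                       ("Morse-like 3-level symbol", 3),
                       ("decimal digit", 10)]:
        print(name, round(math.log2(size), 3), "bits")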
     
  21. Aug 7, 2017 #20

    jbriggs444

    User Avatar
    Science Advisor

    Channel capacity can be less than one bit per symbol. http://math.harvard.edu/~ctm/home/text/others/shannon/entropy/entropy.pdf
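    One standard textbook illustration of this (my sketch, not a worked example from the linked paper) is the binary symmetric channel with crossover probability p, whose capacity C = 1 - H(p) is below one bit per symbol whenever 0 < p < 1/2:

    Code:
    import math

    def binary_entropy(p):
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    # Capacity of a binary symmetric channel with crossover probability p.
    for p in (0.01, 0.1, 0.25):
        print(p, round(1 - binary_entropy(p), 4))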
     