
Information theory

  1. Sep 9, 2007 #1

    Chronos

    Science Advisor
    Gold Member
    2015 Award

    Is information theory the crucible of astrophysics in the 21st century? The number and quality of cosmologically related IT papers in the past year is impressive. I am admittedly swayed by this approach. For example:

    arXiv:0708.2837
    The Physics of Information
    Authors: F. Alexander Bais, J. Doyne Farmer
     
  3. Sep 9, 2007 #2

    Fra


    In my personal opinion, information-theoretic and game-theoretic views on physics are what I am convinced has huge potential, not specifically in astrophysics but for fundamental physics in general.

    Though there seem to be several different approaches to "information physics"; some don't appeal to me, while others do.

    /Fredrik
     
  4. Sep 10, 2007 #3

    Chronos

    Science Advisor
    Gold Member
    2015 Award

    Someday we will toast IT as the road to reality - IMO. The methodology is powerful.
     
  5. Sep 10, 2007 #4

    Fra


    I am often accused of being "philosophical", but I personally put a large emphasis on strategy and coherence of reasoning. I want to see the scientific method implemented in our formal frameworks to a larger extent. In this quest, the information approaches seem to me to be the natural choice in that spirit. I think it will revolutionize not just specific theories but, more importantly, the overall strategy of theoretical physics.

    /Fredrik
     
  6. Sep 10, 2007 #5
    I also see information theory as holding a key to the fundamental structure at all levels. There is some complexity to any structure, I assume, and so it must take some information to describe that structure, even at the smallest level. My question is: what is the most basic definition of information? And how does it relate to the basic parameters of space and time? Does the curvature of spacetime itself have structure, and therefore information?
     
  7. Sep 10, 2007 #6
    This perspective appeals to me. As I understand it, information is a relation about probabilities. Probabilities come about ONLY when you talk about the number of different ways that a result can occur, from an experiment, or a solution of an equation. And since we are talking about mathematical relations that describe physical events, we can narrow our concerns to multiple solutions of the same mathematical problem (assuming a mathematical description can be found). I wonder if it's always the case that, when there are multiple solutions, there is always an underlying symmetry group involved. Then information/entropy would always be a measure of the size of some underlying symmetry group. What do you think?
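    Edit: to make the "counting alternatives" part concrete (this is just Shannon's textbook definition, nothing to do with my symmetry-group speculation), here is a minimal Python sketch showing that N equally likely alternatives carry exactly log2(N) bits:

[code]
import math

def shannon_entropy(probs):
    """Shannon entropy in bits: H = -sum_i p_i * log2(p_i)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# N equally likely alternatives give exactly log2(N) bits:
for n_alternatives in (2, 6, 52):           # coin, die, deck of cards
    uniform = [1.0 / n_alternatives] * n_alternatives
    print(n_alternatives, shannon_entropy(uniform), math.log2(n_alternatives))
[/code]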
     
    Last edited: Sep 10, 2007
  8. Sep 10, 2007 #7
    A group G acting on some set X "induces a notion of 'geometry' on X"... Wow... That says a lot. If entropy/information is equated to the size of groups, and geometry is deduced from groups, then can this be used to equate entropy/information to gravitation/geometry? Have you seen any papers on this?

    The math you use is over my head, sorry. I'm still trying to justify why I should study the subject. It looks as though this all is becoming relevant. Thanks.
     
  9. Sep 10, 2007 #8

    Chris Hillman

    Science Advisor

    I actually said the exact opposite of what you thought I said!

    See for example Ken Brown, "What is a building?", Notices of the Amer. Math. Soc. 49 (2002), no. 10, 1244--1245. http://www.ams.org/notices/200210/what-is.pdf (Be warned that you'll need a fairly solid background in math to understand the question in the title.)

    The whole point of what I said about complexions was that "information" need not be measured by a number. I said that the entropies in that theory are the dimensions of the complexions (in the case when G is a Lie group acting by diffeomorphisms on a smooth manifold X).

    The whole point of what I said about learning about IT was to be very careful to avoid glib identifications, interpretations, and so on. These are likely to be terribly misleading or flat out wrong.

    I most certainly did not say that "entropy" can be "equated" with "gravitation".

    Information theory has always been relevant. The whole point of my posts was that anyone interested in new colonies of the IT empire, so to speak, should spend the time and energy to learn about the great cities of IT, since you can't understand IT without knowing something about what the greatest information theories (particularly Shannon's theory) actually say.

    On reflection I deleted my posts since I'd rather remain silent than to be so badly misunderstood. However, I take the point that it was not your fault that you misunderstood me: I am not familiar with this subforum, and I mistakenly assumed a higher ambient level of mathematical sophistication. Sorry I confused you!

    I stand by my main point, that "information theory meets gravitation" is probably the trickiest area in all of contemporary physics, and by my warning that speculations which are not grounded in an exceptionally solid and wide-ranging background in mathematics, physics, and philosophy are very unlikely to be of any value to serious students of physics.
     
    Last edited: Sep 10, 2007
  10. Sep 11, 2007 #9

    Chronos

    Science Advisor
    Gold Member
    2015 Award

    I agree that confirming both what is and what is not observed is the bar for any ATOE [almost theory of everything!] like quantum gravity. IT is not unlike string theory: unphysical [or at least unobservable] artifacts abound. It is, however, fascinating to examine models that approximate observational limits. It is inevitable that one such model will be at least functionally correct. The devil, of course, is in the details. Also, not unlike the Anthropic Principle, IT is a useful tool for detecting naked emperors.
     
  11. Sep 11, 2007 #10

    jal


    But ... I already read them...and got full of salt
     
  12. Sep 11, 2007 #11

    Fra


    Chris, thanks for participating in this thread! I think it was a pity (despite our ignorance) to delete your responses, which also contained some nice references; fortunately I was at least able to read your short-lived posts once last night :)

    Like Chris wrote in the deleted posts, there are many approaches to the topic. I am not an expert in any particular one; I just try to follow my own strategy to answer my questions, and I like to learn what's relevant along the way. It sure is easy to drift away into your own thoughts, but I find it similarly risky to be persuaded by existing formalisms. I try to find the balance: stay on track without wasting the intrinsic motivation and creativity.

    Chris has commented on this already, but here are some more comments on the way I see it, without implications for how others see it.

    Most commonly, information is defined in terms of entropy. Entropy is usually regarded as a quantitative measure of missing information. The problem is then, of course: what is entropy? And there are several different definitions of entropy, which makes the information concept somewhat arbitrary.

    There are axioms one can use to derive particular entropies, but then again, what justifies those axioms? I personally try another route. If you treat the entropy right, I think it will, to a certain extent, not matter as much which version you use, since the absolute entropy is not that interesting; it's more the dynamics of the entropy that is interesting, I think.
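    To show what I mean by "which version" (a minimal sketch using the standard textbook forms of the Shannon, Rényi and Tsallis entropies; the distribution is just made up), the same distribution gets a different number from each definition:

[code]
import math

p = [0.5, 0.25, 0.125, 0.125]   # a made-up distribution

def shannon(p):
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def renyi(p, alpha):
    # Renyi entropy; reduces to Shannon in the limit alpha -> 1
    return math.log(sum(pi ** alpha for pi in p)) / (1.0 - alpha)

def tsallis(p, q):
    # Tsallis entropy; also reduces to Shannon in the limit q -> 1
    return (1.0 - sum(pi ** q for pi in p)) / (q - 1.0)

print(shannon(p), renyi(p, 2.0), tsallis(p, 2.0))
[/code]

    Which is part of why I find the absolute value less interesting than how it changes.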

    For my own thinking I am still pondering and working on it, but I try to work in a probabilistic framework without explicitly defining the entropy. The interesting part to me is things like the transition probabilities, which are certainly related to relative entropies, and there are also similarities to the "action". It bears resemblance to the Feynman path integral, but I am looking for something more explicit.

    I am doing this entirely on a hobby basis, and don't have much time, so things work slow.

    Ariel Caticha has written some interesting papers where his idea is to define the distance measure of space in terms of the probability of mixing up two space points, a kind of information geometry. He also tries to elaborate the dynamics via entropy dynamics, and he has some ideas for deriving GR from principles of inductive inference. From what I know he hasn't succeeded yet. I like all his papers, though there are specific things he does that I personally do not find satisfactory. This has exactly to do with the entropy stuff.

    Check out his http://arxiv.org/PS_cache/gr-qc/pdf/0301/0301061v1.pdf, but all his papers are interesting. The details are thin, though, as it's clear the approach is young and most of the work is left to be done.
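    For context, the standard "information geometry" distance that this kind of approach builds on is the Fisher-Rao metric (this is textbook material, not a summary of Caticha's specific construction):

    [tex]
    ds^2 = g_{ij}\,d\theta^i\,d\theta^j,\qquad
    g_{ij}(\theta) = \int p(x|\theta)\,\frac{\partial \ln p(x|\theta)}{\partial\theta^i}\,\frac{\partial \ln p(x|\theta)}{\partial\theta^j}\,dx
    [/tex]

    so two parameter points are "close" exactly when the corresponding distributions are hard to tell apart.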

    What I miss in these papers is a fundamental account of mass and energy. My intuitive idea is that the obvious similarity between inertial and gravitational phenomena on one hand, and the inertia-like phenomena in the information world on the other, is probably too close to be a coincidence. I am still trying to find the proper formalism for it, but the first loose association is between inertial energy or mass and information capacity. I've also got some ideas about how learning mechanics gives rise to self-organised structures.

    I personally have started my own elaborations from the basic concept of distinguishability, which is closely related to Ariel's approaches. This means you start with a basic boolean observable. I have had some headaches over this, but I have a hard time finding a simpler and more plausible starting point. Then, given that there is an observer with some memory capacity, relations can be built during the course of fluctuations. Here one can imagine several options, and I'm still thinking.

    But one way is to consider that the natural extension of this boolean observable is an extension from {0,1} to {0,1,2} to {0,1,2,3}, etc., where 1, 2, 3 are simply labels and could just as well be a, b, c. Ultimately we get a one-dimensional "string" corresponding to a continuum between the boolean 0 and the boolean 1. Next this can go on, if the environment so calls for, to inflate the string into more dimensions... And meanwhile these structures must be correlated with the state of the observer. The complexity is constrained by the information capacity of the observer.

    In this case I loosely use information capacity as a relative notion.

    I find the difficulty to be that, in order to get this consistent, the representations and the dynamics are related, and it seems hard to find objective, hard references. I think of it as basically correlations between self-organising structures.

    The problem with information is also that it ultimately builds on probability theory, and the problem with that (at least my personal problem) is that any realistic model must infer the probability space from experiments, or experience. So the probability space is also constrained by the observer's information capacity. Anything else seems like an unacceptable idealisation to me. This renders these notions also relative: not only are probabilities relative in the Bayesian sense, different views may also evaluate the probability space differently.

    I am hoping to produce a paper on this eventually, but time is the problem. The good thing about the slow pace, though, is that I have plenty of time to reflect properly over things and not just bury myself in mathematics.

    /Fredrik
     
  13. Sep 11, 2007 #12

    Fra


    I forgot to write that what first seems like a problem, or circular reasoning, gets a nice resolution: I think of it as requiring evolution to stay consistent. The notion of probability is strictly speaking uncertain; we can only find what we think is the probability. We can measure the frequency, but what is the "true probability"? Like Chris noted, this brings you down to the axioms of probability and their interpretations. I see a need to tweak them. Meaning we can only get the probability of the probability, and not even that! Which means you end up with probability of probability of probability... which sort of makes no sense. I'm not sure if this is related to what Chris refers to as the algorithmic entropy (I forgot his wording, as the post was deleted); anyway, that seems to be the only CORRECT one... but then it takes infinite time, memory and data to compute it! So it's useless. The resolution is the evolutionary view, the drifting view, and the drifting rate is constrained by constraints on information capacity...
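    A minimal sketch of what I mean by "probability of a probability", using the ordinary Bayesian coin example (the numbers are hypothetical and this is standard textbook material, not my own framework): after finitely many flips, the bias is only ever known through a distribution over possible biases.

[code]
# After k heads in n flips (flat prior), the unknown bias p is only known
# through a Beta(k+1, n-k+1) distribution -- a "probability of a probability".
def beta_weight(p, k, n):
    return p ** k * (1.0 - p) ** (n - k)   # unnormalised posterior density

k, n = 7, 10
grid = [i / 1000.0 for i in range(1001)]
weights = [beta_weight(p, k, n) for p in grid]
total = sum(weights)
mean = sum(p * w for p, w in zip(grid, weights)) / total
print("posterior mean of the bias:", round(mean, 3))   # about (k+1)/(n+2) = 0.667
[/code]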

    /Fredrik
     
  14. Sep 11, 2007 #13
    Sorry you deleted your posts. You are not responsible for someone else's misunderstanding. Even if you were, these forums are not so formal (as peer reviewed forums are) that you should have to worry about it. I hope your disappointment with the ambient skill level will not deter you. We all appreciate your efforts. Thanks.

    My assumptions seem too esoteric to give up lightly, so let me reiterate with emphasis to see if understanding can be gained...

    First, let's restrict our conversations to mathematical models. I understand that entropy/information can be measured by observation. But I suspect that we will eventually find a mathematical model for everything, and we will want to also describe entropy in terms of that model.

    Now I suppose that some mathematical models may have multiple solutions. One question is: if there are multiple solutions, does this always imply an underlying symmetry group?

    Another question is whether the alternatives can always be normalized into a probability distribution over the various possibilities. Or is it more the case that just because there are alternatives doesn't mean we can know how probable one solution is over another? Or is there a natural measure of how likely one solution is, in terms of how much of the underlying set is occupied by each solution?

    If information is so broadly defined that it need not even be describable with a number, then would entropy be a more fitting term to use for probability distributions?

    If alternatives can be normalized into a probability distribution, would that mean that the size of the underlying symmetry group relates to entropy? Or do we need more than just the size of the group to form a distribution? And could the needed information also be obtained from group properties in order to form a distribution? Or is it more the case that knowing the symmetry and group properties of a solution space still may not be enough to form a distribution?

    You said, "Indeed, the mathematical context assumed in the theory of complexions is that some group G acts on some set X. (For concreteness, in this post I will consider left actions, but sometimes it is convenient to take right actions instead.) According to Klein and Tits, this induces a notion of 'geometry' on X such that G (or rather a homomorphic image of G, if the action is not faithful) serves as the geometrical symmetry group of the 'space' X."

    This sounded intriguing, of course, but perhaps I read more into it than warranted; my apologies. To me this seemed to contain the seeds of a generalized surface entropy formula for any underlying set X. My understanding of simplicial complexes is that they can approximate any manifold. Or am I misunderstanding your use of "complexion"? If I'm reading you right, it would seem to mean that we have a fundamental definition of entropy (maybe not information) for any mathematical model with symmetry properties. Does this sound right? Thanks.
     
    Last edited: Sep 11, 2007
  15. Sep 12, 2007 #14

    Fra


    Mike, forgive me if I misinterpret you but some reflections of mine FWIW...

    IMO, these questions are good and interesting and they trace down to the fundamentals of probability theory in relation to reality.

    Since we are talking about reality and physics, rather than pure mathematics, the question is how to interpret and attach the notion of probability to reality in the first place. In QM the idealisation made is that we can know the probability exactly, but how do you actually make an observation of a probability? (And if you can't, what's up with it?) The typical answer, that you make an infinite measurement series and the relative frequency converges to the "true probability", is very, very vague IMO. Sometimes it makes sufficient sense for all practical purposes, like when "big" is effectively close enough to infinity and when the environment and experimental settings can be assumed not to have changed, which is in general the other major problem if you make an experiment that takes infinite time.

    So IMHO at least, the notion of information is usually related to the notion of entropy, which is related to the notion of probability. Which means the issue of information, or missing information, is ultimately rooted in probability theory itself.

    Usually the entropy of a given probability distribution can be directly conceptually associated with the probability of that specific distribution in the larger probability space consisting of the space of all distributions.
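    The textbook version of that association, stated loosely (this is the standard "method of types" estimate, not anything of my own), is

    [tex]
    P(\hat{q} \approx q \mid p) \;\sim\; e^{-n\,D(q\|p)},\qquad
    D(q\|p) = \sum_i q_i \ln\frac{q_i}{p_i}
    [/tex]

    where q-hat is the empirical distribution after n samples drawn from p. When p is uniform over W outcomes this becomes roughly e^{n H(q)} / W^n, so the Shannon entropy of q literally counts how many sample sequences realise that distribution.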

    But the problems are the choice of a prior, and the induction of the space of distributions in the first place. If they are given, the problem is easier. But in reality, these things aren't just "given" like that. This is also quite a philosophical problem.

    /Fredrik
     
  16. Sep 12, 2007 #15

    Fra


    What I've personally tried to do is to use pure combinatorics on information quanta (boolean states) to infer the probability of a given distribution (one built in a particular way so as not to lose the relation to first principles), without muddying the concepts by first defining an arbitrary entropy and then relating this entropy to a probability. Since the whole point of the entropy is to generate probabilities in the first place, I thought it would be cleaner to do that directly.
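    A minimal sketch of the kind of counting I mean (just the ordinary multinomial multiplicity, nothing beyond that): the probability of a frequency distribution over n quanta is proportional to the number of ways it can be realised, and the log of that count per quantum approaches the Shannon entropy of the frequencies.

[code]
from math import factorial, log

def multiplicity(counts):
    """Number of ways to arrange the quanta: W = n! / (n_1! n_2! ... n_k!)."""
    w = factorial(sum(counts))
    for c in counts:
        w //= factorial(c)
    return w

def shannon(counts):
    n = sum(counts)
    return -sum((c / n) * log(c / n) for c in counts if c > 0)

# log(W)/n approaches the Shannon entropy of the frequencies as n grows,
# so counting configurations already "contains" the entropy:
for scale in (1, 10, 100):
    counts = [2 * scale, scale, scale]
    n = sum(counts)
    print(n, log(multiplicity(counts)) / n, shannon(counts))
[/code]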

    The conceptual problem I'm struggling with is what the dynamical equations will look like, and exactly how to pull the time parameter out of this. My idea is that time is just a parametrisation of change, along the direction of the most probable change, with the units normalised by an arbitrary "reference change". And I expect that this will imply a built-in bound on the rate of change, and thus on information propagation.

    /Fredrik
     
  17. Sep 13, 2007 #16

    Chronos

    Science Advisor
    Gold Member
    2015 Award

    See the math forum for remarks by Chris. He gives a powerful and enlightening presentation. So, grab a drink, sit back, and enjoy. It is a refreshing and educational review of IT and its place in the cosmos.
     
  18. Sep 13, 2007 #17

    Fra


    Thanks for the pointer, Chronos. With all the subforums on here I had no idea; due to lack of time I never even looked into most subforums, except a few.

    /Fredrik
     
  19. Sep 14, 2007 #18

    Chris Hillman

    Science Advisor

    Musing on misunderstood warnings

    You are welcome, but do you understand why the following Washington Post article reminded me of this thread?

    http://www.washingtonpost.com/wp-dyn/content/article/2007/09/03/AR2007090300933_pf.html

    I really, really hope that you all redeem yourselves by reading carefully henceforth, and by studying some of the sources I cited, at the very least Cover and Thomas and the on-line expository papers I cited. To repeat:

    http://www.math.uni-hamburg.de/home/gunesch/Entropy/shannon.ps
    http://www.math.uni-hamburg.de/home/gunesch/Entropy/entropy.html
    http://www.math.uni-hamburg.de/home/gunesch/Entropy/dynsys.html
     
    Last edited: Sep 14, 2007
  20. Sep 14, 2007 #19

    Chris Hillman

    Science Advisor

    Danger Will Robinson!

    Some of this seems to reflect the myth (common among "armchair scientists") that delving into the literature will stifle creativity. The truth is quite the opposite. By studying good textbooks and expository papers about a field you would like to contribute to, you avoid repeating one common beginner's mistake after another, which makes your progress toward attaining some level of mastery much more efficient. Furthermore, reading really good ideas from those who are already experts in the field makes it much more likely that your own creativity will lead to something genuinely novel and possibly interesting to others.

    Once again, one of the major points I tried to make is "it ain't necessarily so", while at the same time urging you all to stop posting here and read Shannon 1948 and some other good sources of information about Shannon's information theory (the first, by far the most highly developed, and in many ways the most impressive information theory, but nonetheless not the most suitable for every phenomenon of possible interest in which "information" appears to play a role).

    That is only one of the issues I mentioned concerning "uncertainties of probability".

    You're doing it again. That is not what I said!

    Too esoteric to give up lightly?

    That is not what I said!

    I trust you mean "model of something other than communication" (or "information"). Or perhaps "theory" of something?

    One of my points was that Shannon 1948 is "the very model of a mathematical theory". Therefore, it behooves anyone seeking to build a mathematical theory of anything to learn what Shannon did.

    Are you sure you are not confusing mathematical model with field equation?

    The Schwarzschild perfect fluid matched to a Schwarzschild vacuum exterior is a mathematical model of an isolated nonrotating object, formulated in a certain physical theory, gtr. It is also a (global, exact) solution to the Einstein field equation which lies at the heart of that theory.

    The Markov chains discussed by Shannon in his 1948 paper form a sequence of mathematical models of natural language production. As he is careful to stress, this kind of model cannot possibly capture aspects of natural language other than statistics. His ultimate point is that in the [i]mathematical theory[/i] he constructs, motivated by this sequence of Markov chains (which provide more and more accurate models of the purely statistical aspects of natural language production), statistical structure turns out to be the only kind which is needed. Which is why I was careful to stress that in Shannon's theory, a nonzero mutual information between two Markov chains does not imply any direction of causality, only a statistical correlation in behavior.
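    For anyone who has not yet opened the 1948 paper, here is a toy version of the kind of Markov chain model of natural language production Shannon describes there (the corpus and the restriction to first order are placeholders of mine, not Shannon's):

[code]
import random
from collections import defaultdict

def build_chain(words):
    """First-order Markov model of word-to-word transitions."""
    chain = defaultdict(list)
    for prev, nxt in zip(words, words[1:]):
        chain[prev].append(nxt)
    return chain

def generate(chain, start, length=12):
    out = [start]
    for _ in range(length):
        options = chain.get(out[-1])
        if not options:
            break
        out.append(random.choice(options))
    return " ".join(out)

corpus = "the cat sat on the mat and the dog sat on the rug".split()
print(generate(build_chain(corpus), "the"))
[/code]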

    Can you clarify what you mean by "solution" and "model"?

    You should be able to answer that yourself, I think. (This comes up in any good textbook on quantum mechanics, for example.)

    I really, truly, deeply urge you to study Shannon 1948.

    http://www.math.uni-hamburg.de/home/gunesch/Entropy/infcode.html

    I didn't say that!

    I discussed a number of quite different theories of information. The whole point was that there are many quite different ways of defining notions of information. Some use very little structure (e.g. Boltzmann's theory), some require the presence of a probability measure (Shannon's theory) or a group action (theory of Planck's "complexions"). So I think you may be mixing up at least two theories and two or three levels of mathematical structure.

    In a situation in which the "entropies" defined in two or more information theories make sense, because the requisite mathematical structures (probability measure, group action) are present, it is reasonable to ask how these quantities are related. As I said, in general they are not numerically the same, but they may approximate each other or even approach each other in some limit.
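    The most elementary instance (a standard textbook observation): when the requisite structure is a uniform probability measure on a finite set of W configurations, Shannon's entropy reduces to

    [tex]
    H = -\sum_{i=1}^{W}\frac{1}{W}\,\ln\frac{1}{W} = \ln W
    [/tex]

    which is, up to Boltzmann's constant, Boltzmann's S = k ln W; once the measure is non-uniform the two quantities differ.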

    If you do the exercises I suggested, you should be able to answer your own question.

    Are you perhaps confusing the so-called holography principle with something you think I said?

    I said that whenever we have a group action, we have complexions, and these obey essentially the same formal properties as Shannon's entropies, in particular the quotient law. Thus, any structural invariant of these will also respect the formal properties of Shannon's entropies, and thus will admit an interpretation in terms of "information". I briefly mentioned two cases in which such "Galois entropies" are obvious (actions on finite sets, and finite-dimensional Lie groups of diffeomorphisms), but I did not imply that such quantities can be found for any group action whatsoever.

    (Regarding axiomatics: note that Shannon's statement of the formal properties he takes as axiomatic is given in the context of probability. This is why what I just said doesn't contradict his famous unicity theorem. The formal properties of which I speak can, however, be expressed in a more general context than probability theory, namely what I called (in "What is Information?") join-sets, a kind of weakening of the lattices of lattice theory.)
     
    Last edited: Sep 14, 2007
  21. Sep 14, 2007 #20
    I don't think anyone is going to write a book detailing all the mistakes others have made in physics. I wish they would. I wonder what the table of contents would look like.

    Foundational issues don't seem to be what most experts are interested in. I feel (probably like Fra) that it is too easy to get lost because too many trees obstruct the forest.



    We all appreciate your efforts, and we consider your posts to have some authority. I think the problem is that we're trying desperately to simplify all the information you've given us. Thank you for keeping us on our toes.

    I wonder if the various kinds of information you've noted can be classified into two areas: one based on probabilities understood by observation (how many faces of a die, or how many possible letters in a word), and one based on probabilities determined from mathematical models, like the number of eigenstates or something like that. Or were you saying that some information is not based on probability at all? If not based on probability, is it at least based on alternatives?

    But not the other way around: whenever we have Shannon-type entropies, do we necessarily have a group action? That is really the question I'm curious about.

    PS. I have read some of Shannon's work... years ago. At the time I found it fascinating, and that's why I'm interested now.

    Thanks.
     
    Last edited: Sep 14, 2007