Dismiss Notice
Join Physics Forums Today!
The friendliest, high quality science and math community on the planet! Everyone who loves science is here!

Decision making based on entropy reduction

  1. Aug 10, 2012 #1
    Hello guys,

    I just registered on this forum, so I hope you can help me with some differential entropy stuff.
    So I have made the attached pdf file (it's a bit messy, sorry) that demonstrates a toy version of the problem I'm trying to solve.
    Thing is I'm quite new to information theory, so I will be thankful if you can tell me if my approach is correct. Or if there is a better way of tackling the problem.

    Thank you

    Attached Files:

  2. jcsd
  3. Aug 10, 2012 #2

    Stephen Tashi

    User Avatar
    Science Advisor

    Does the change in entropy in a non-empty cell depend on what value is reported for that cell in a given measurement? Or does it merely depend on the fact that a measurement was made?

    A "simpler" toy version of this problem might be this: There is some function f(i,n) defined on each non-empty cell i where n = the number of measurments made in that cell. There is another function G(...) which is a function only of the current values of each of the f(i,n). The problem is how to "best" move the window to , say, minimize G.

    It isn't clear to me if you define the "best" move to be the one that most minimizes G or whether a given move is evaluated according to its place in some longer term strategy- like minimize G as much as possible with 5 moves.
  4. Aug 11, 2012 #3
    Thank you Stephen for your reply.
    you are right, the change in entropy depends only on the fact that a measurement was made.
    and by best, I mean something that minimises G (as in your reply) and I'm not looking into future steps.
    I will appreciate if you can comment on the validity of the proposed solution.
  5. Aug 11, 2012 #4


    User Avatar
    Science Advisor

    Hey farhadb.

    You have to be careful about using differential entropy as a measure for "true" entropy.

    The natural entropy is something that is associated with information and information is quantized: the entropy formula given by Shannon for the true entropy H(X) of a discrete random variable is the "real" entropy.

    The differential entropy refers to something completely different (although there is a connection between the two). First of all "natural" entropy must be >= 0 but differential entropy can give you "negative" results for entropy.

    The differential entropy is used in the context of noise modeling, but again its not the same as natural entropy and it doesn't even have the properties that entropy should have (like gauranteed to be >= 0 for instance).

    If you are minimizing the entropy for a data set or some piece of information that has a finite alphabet: use the normal version or consider how the thing should be quantized.
  6. Aug 11, 2012 #5

    Stephen Tashi

    User Avatar
    Science Advisor


    I don't understand if you are assuming that we know before moving the window whether a particular cell outside the window is empty or not. The way you phrased your solution, we know this in advance of moving the window.
  7. Aug 12, 2012 #6
    @chiro: Thanks for the reply. In my problem (I think) I have to deal with both discrete (the cell being empty or not) and continuous (a real number between zero and one) cases. so natural entropy alone can not help. and I think differential entropy has many of nice characteristics of natural entropy, e.g. being non-negative, am I right?

    @stephen: We have no prior knowledge about a cell before a measurement is done. So at time t, we know a cell outside window is empty or not only if at some time t' (t' < t) that cell has been met by the measuring window. Otherwise there is no way to say if the cell is empty or not. Thanks
  8. Aug 12, 2012 #7


    User Avatar
    Science Advisor

    Differential entropy is not the same and isn't gauranteed to either be finite or be positive. It does not have the same meaning as a normal measure of entropy (but that's not to say it doesn't have its uses).

    Entropy doesn't make sense in the natural sense if its negative or infinite. You can't have negative information content and you can't have infinite information content: information has to exist and be finite.

    You have to be careful about extrapolating the properties of differential entropy to natural entropy. Take a look at this to see instances of this:


    Also as an exercise, try and calculate the entropy of a uniform distribution (and size interval as long as it's finite). The result should show you something (the entropy should be infinite if the calculation was done correctly).
  9. Aug 12, 2012 #8

    Stephen Tashi

    User Avatar
    Science Advisor

    I think you should clarify these various issues and re-state your problem. Also you need to explain why your solution works. (You can read about using LaTex in forum posts at https://www.physicsforums.com/showthread.php?p=3977517&posted=1#post3977517)

    As far as I can tell, you calculate a quantity (an entropy) produced by a proposed move of the window without knowing which cells are null. You use this to predict the best move. Then you move the window and an update produces a different answer for this quantity because you definitely know which cells are empty. So it seems to me you are not dealing with a deterministic minimization problem. This has a stochastic aspect to it. The predicted best move might turn out not to be the best move.
  10. Aug 12, 2012 #9

    I think I had misused the term "differential entropy". sorry my bad. What I really meant was the difference between two differential entropies. I think this has some of the nice characteristics of natural entropy and can be considered a measure of information (am I right?)
    To clarify a bit more; there is a random (continuous) variable and we have some a priori knowledge about it. So we have a probability distribution based on that knowledge. Now I do a measuring and based on this I refine the probability distribution. Now I calculate the difference between the differential entropy of each of those two probability distribution, and I assume this number tells me the amount of information gained.
    The pdf I had attached to the first post might help illustrating the idea.
    Thanks again Chiro for walking me through this.
  11. Aug 12, 2012 #10
    You are absolutely right. Basically this is a search problem. At each time instant and based on what I have seen from the world, the big grid in this case, I consider all the possible moves and choose the one that I predict will give me most information. Hoping for a more efficient search. After actually performing the move and making new measurements, the amount of information gained will most probably be different from what was predicted. But I guess this is the best one can do about this search problem. and these steps will be repeated until system is exhausted or some other criteria are met.
    If you know a more efficient way of solving this problem, I'd love to hear that.
    And do you think I am calculating the information gain correctly in my solution? and please let me know if there is any other part that's not clear, I know I am not doing a great job explaining the problem.
    Thanks for your time, I appreciate it.
Share this great discussion with others via Reddit, Google+, Twitter, or Facebook