
The Missing information function

  1. Jan 18, 2015 #1

    CAF123

    Gold Member

    1. The problem statement, all variables and given/known data

    Consider the missing information function ##S(\left\{p_i\right\}) = -k \sum_{i=1}^r p_i \ln p_i## where ##\left\{p_i\right\}## represents the probabilities associated with the ##r## mutually exclusive outcomes of some procedure.

    i) Sketch the form of ##S## as a function of ##p_1## for ##r=2##.
    ii) Show that ##S## has its maximum value when ##p_i = 1/r## for all ##i##.
    iii) Show that the value of ##S## is increased when any two of the probabilities are changed in such a way as to reduce their difference.

    2. Relevant equations

    3. The attempt at a solution

    In i), we take ##S = S(p_1, p_2)## and regard ##p_2## as a constant. Then ##p_1 \in [0,1]## is the domain, and the plot of ##S## versus ##p_1## looks like a skewed hump with a max at ##p_1 = 1/e## and ##S=0## when ##p_1 = 0,1##. Did I interpret the question right here? I just don't see why we restrict ##r## to 2.

    ii) is fine I think. I get that ##p_j = 1/e## then ##\sum_{i=1}^r p_j = 1 \Rightarrow r (1/e) = 1 = r p_j \Rightarrow p_j = 1/r##. What is the significance of 1/e though? The result wouldn't change as long as it was a constant.

    In iii), not really sure where to start. I think the condition is that if ##|p_i + \delta - (p_j + \gamma)| < |p_i - p_j|## then ##dS > 0##, where ##\delta, \gamma## are real numbers. But not sure how to implement this.
     
  3. Jan 18, 2015 #2

    mfb


    Staff: Mentor

    (i) You cannot regard ##p_2## as constant. Exactly one of the ##r## outcomes always has to happen, so the sum over all ##p_i## has to be 1.
    (ii) I don't understand your argument, but the important part of the calculation is certainly missing.
    (iii) I would use this to prove (ii). Due to the fixed sum, you cannot introduce two different variables for the changes. Yes, the idea is to show that such a change increases the "missing information" with the formula you have.
     
  4. Jan 18, 2015 #3

    CAF123

    Gold Member

    Yes, I understand, but the question asks to plot ##S## as a function of ##p_1##. If ##r=2##, then ##S = S(p_1, p_2)## which I am treating as a function of two variables. If I am to plot this as a function of ##p_1## only then via the constraint ##p_1 + p_2 = 1##, I can obtain ##S(p_1)##.
    I took the derivative as shown and found ##p_j = e^{-1}##. Then, using the constraint ##\sum_{i=1}^r p_i = 1##: ##p_i## is a constant, so the sum gives ##rp_i##, and since this is equal to one I obtain ##p_i = 1/r##?
    Ok, so is the equation I wrote down correct just to check? I'm not sure how to implement this into the calculation.
    Thanks.
     
    Last edited: Jan 18, 2015
  5. Jan 18, 2015 #4

    TSny

    Homework Helper
    Gold Member

    I'll try to help a little while mfb is away having fun. mfb was pointing out that you can't vary p1 without simultaneously varying p2 due to the constraint that the p's add to 1. The constraint reduces the number of independent p's from r to r-1. So, for the case of r = 2, you can think of S as a function of one variable, say p1. But then, p2 is now a function of p1.

    When finding the maximum of S, did you include the effect of the constraint? The Lagrange multiplier method might be helpful.

    For (iii), without loss of generality you may take ##p_i## to be the larger of ##\{p_i, p_j\}##. Let ##\epsilon## be the change in ##p_j##. Is ##\epsilon > 0## or is ##\epsilon < 0##? Can you express the change in ##p_i## also in terms of the same ##\epsilon##?
     
  6. Jan 18, 2015 #5

    CAF123

    Gold Member

    Hi TSny,
    Ok, so this produces a graph that is a symmetric hump with a max at ##p_1 = 1/2##, hitting the ##p_1## axis at ##p_1 = 0,1##. I re-express all instances of ##p_2## as ##1-p_1##.
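    As a quick numerical check of that picture (just a sketch, taking ##k = 1##; the grid size is arbitrary):

```python
import numpy as np

# S(p1) for r = 2 with the constraint p2 = 1 - p1, taking k = 1
p1 = np.linspace(1e-6, 1 - 1e-6, 10001)
S = -(p1 * np.log(p1) + (1 - p1) * np.log(1 - p1))

print(p1[np.argmax(S)])   # ~0.5, the symmetric maximum
print(S[0], S[-1])        # both ~0 at the endpoints p1 -> 0, 1
```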

    Oh yes, this gives ##p_j = \exp(-\lambda/k - 1) = \text{const}##, where ##\lambda## is the multiplier and then I input into constraint ##\sum_{i=1}^r p_i = 1##. This is $$\sum_{i=1}^r \exp(-1-\lambda/k) = \text{const} \cdot r = 1 \Rightarrow \text{const} = 1/r = p_i$$
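    And as a numerical sanity check of the ##p_i = 1/r## result (again just a sketch, taking ##k = 1## and an arbitrary ##r = 4##; the choice of optimizer is incidental):

```python
import numpy as np
from scipy.optimize import minimize

r, k = 4, 1.0

def neg_S(p):
    # minimize -S so the optimizer maximizes S = -k * sum(p ln p)
    return k * np.sum(p * np.log(p))

res = minimize(neg_S,
               x0=np.random.dirichlet(np.ones(r)),   # random starting distribution
               bounds=[(1e-9, 1.0)] * r,             # keep each p_i in (0, 1]
               constraints=[{'type': 'eq', 'fun': lambda p: p.sum() - 1.0}],
               method='SLSQP')

print(res.x)  # every component comes out close to 1/r = 0.25
```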

    I think ##\epsilon## can be greater or smaller than zero. If ##\epsilon < 0## then ##p_j' = p_j - \epsilon##. Then ##p_i' \in (p_i - \epsilon, p_j - \epsilon)## so as to reduce the difference between the two. If ##\epsilon > 0## then ##p_i' > p_i + \epsilon < p_j - \epsilon##
     
  7. Jan 18, 2015 #6

    TSny

    Homework Helper
    Gold Member

    OK
    OK
    If ##p_j## is the smaller of ##\{p_i, p_j\}##, and if ##\epsilon## is the amount that you change ##p_j##, then ##\epsilon## can have only one particular sign if you are going to reduce the separation between ##p_j## and ##p_i##.
    If you reduce ##p_j## and ##p_i## by the same amount, will the constraint condition still hold?
     
  8. Jan 18, 2015 #7

    CAF123

    Gold Member

    I was thinking along the lines of: if ##p_j## decreases by ##\epsilon## then ##p_i## must decrease by an amount greater than ##\epsilon## so that their difference is reduced. If ##p_j## increased by ##\epsilon## then ##p_i## can increase or decrease, with an increase not bigger than ##\epsilon##.
    No, that would mean their difference is not reduced. That is why I left the interval open; however, I was taking ##p_j## to be larger than ##p_i## in that analysis (accidentally), contrary to the notation you gave.
     
  9. Jan 18, 2015 #8

    TSny

    Homework Helper
    Gold Member

    Since the p's must always add to 1, what must happen to ##p_i## if you increase ##p_j##?
     
  10. Jan 19, 2015 #9

    CAF123

    Gold Member

    Oh I see what you were getting at, yes sorry one must increase while the other decreases so as to not violate conservation of probability. So, if ##p_i > p_j## and we are to reduce the difference then ##p_j## must increase so that ##p_j' = p_j + \epsilon, \epsilon > 0## while ##p_i' = p_i - \epsilon##. Then ##\sum_k p_k = 1## still.

    So, ##S' - S =-k[ (p_j + \epsilon) \ln (p_j + \epsilon) + (p_i - \epsilon)\ln (p_i - \epsilon)]##. Since ##p_i > p_j, p_j + \epsilon > 0## and ##p_i - \epsilon \geq 0##. (i.e. if we have initially ##p_i = \epsilon, p_j = 0##, assuming we can attribute a zero probability to one of the ##p_k##'s.) So then ##S'-S>0##. Is this fine?

    Maybe one thing I am not getting is why would the difference between the two probabilities change? If ##p_j' = p_j + \epsilon, p_i' = p_i - \epsilon## then one increases by some amount while the other decreases so the difference is the same.

    Thanks!
     
  11. Jan 19, 2015 #10

    mfb


    Staff: Mentor

    Subtract them, and see how the difference differs from the previous difference?
    Alternatively, consider this numerical example: Increase 2 by 1 and decrease 7 by 1 and see if the difference changes.

    No, you cannot.
     
  12. Jan 19, 2015 #11

    CAF123

    Gold Member

    Yes, sorry, I was being silly; I had been at this for too long last night and was too tired.

    Yes, I thought so, otherwise there would not be ##r## possible outcomes. But I believe my argument before still works, except now with ##p_i - \epsilon > 0##. Yes? Thanks!
     
  13. Jan 19, 2015 #12

    mfb


    Staff: Mentor

    Where is your proof that ##S'-S>0## apart from the easy case where some parts are zero?
     
  14. Jan 20, 2015 #13

    CAF123

    Gold Member

    ##S' = -k [p_1 \ln p_1 + \dots + (p_j + \epsilon)\ln (p_j + \epsilon) + \dots + (p_i - \epsilon) \ln (p_i - \epsilon) + \dots ]## and ##S = -k[p_1 \ln p_1 + \dots ]##, so $$S' - S = k[p_j \ln ( \frac{p_j}{p_j + \epsilon}) + p_i \ln (\frac{p_i}{p_i - \epsilon}) - \epsilon \ln ( \frac{p_j + \epsilon}{p_i - \epsilon})]$$ The argument of the first log is < 1 so the whole term is negative. The argument of the second log is > 1 so the whole term is positive. The argument of the last log can be > 1 or < 1. So I am not sure how to make any conclusions?
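    (Numerically this combination does seem to come out positive whenever ##p_j < p_i## and ##\epsilon## moves them closer; a quick sketch with ##k=1## and arbitrary sample values:)

```python
import numpy as np

k = 1.0
rng = np.random.default_rng(0)

def dS(pi, pj, eps):
    # the expression for S' - S derived above, with k set above
    return k * (pj * np.log(pj / (pj + eps))
                + pi * np.log(pi / (pi - eps))
                - eps * np.log((pj + eps) / (pi - eps)))

for _ in range(5):
    pj, pi = np.sort(rng.uniform(0.05, 0.45, size=2))   # pj < pi
    eps = 0.5 * (pi - pj) * rng.uniform(0.1, 1.0)        # a shift that reduces their difference
    print(dS(pi, pj, eps) > 0)   # True in every trial
```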
     
    Last edited: Jan 20, 2015
  15. Jan 20, 2015 #14

    mfb


    Staff: Mentor

    You still have ##p_j < p_i##.
     
  16. Jan 21, 2015 #15

    CAF123

    Gold Member

    Ok, so the argument of the last log can be rewritten as $$\frac{p_j}{p_i - \epsilon} + \frac{\epsilon}{p_i - \epsilon},$$ so if we are changing ##p_i## and ##p_j## so as to only reduce their difference, then the first term is ##< 1## and the last term is ##< 1##. However, I still have a term (the first one) that has argument ##< 1##, so it is difficult to draw conclusions without knowing their relative difference. Thanks.
     
  17. Jan 21, 2015 #16

    mfb


    Staff: Mentor

    Then you'll have to find some other way to rewrite the expressions.
    It might help (did not test it) to define ##p_j = p_i + x##.

    A completely different approach: consider ##S(p_i)## as a function of one independent variable (different from what you did for the other parts), and show that its second derivative exists and is negative everywhere.
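    (For what it's worth, a sketch of how that setup could look, with all the other probabilities held fixed so that ##p_j = c - p_i## for some constant ##c##:
$$S(p_i) = -k\left[p_i \ln p_i + (c - p_i)\ln(c - p_i)\right] + \text{const}, \qquad \frac{d^2 S}{dp_i^2} = -k\left(\frac{1}{p_i} + \frac{1}{c - p_i}\right) < 0,$$
    so ##S## is strictly concave in ##p_i## wherever both probabilities are positive.)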
     
  18. Jan 21, 2015 #17

    CAF123

    Gold Member

    Ok, thanks, I'll think more about your suggestion. In the meantime, I wonder if TSny could perhaps tell me what he had in mind when he was directing me in the previous method? :)
     
  19. Jan 21, 2015 #18

    TSny

    Homework Helper
    Gold Member

    I was thinking of letting ##\epsilon## be a small, first-order quantity and considering the change ##\delta S## when ##\delta p_j = \epsilon##. Note ##\delta p_i## is then ##-\epsilon##. (As before, I'm thinking of ##p_j## as the smaller of the two p's.)

    Once you get the result for an infinitesimal ##\epsilon##, you can easily generalize to a finite change.

    This approach is similar to mfb's suggestion of evaluating ##\frac{\partial S}{\partial p_j}## while treating ##p_i## as a function of ##p_j##.
     
  20. Jan 26, 2015 #19

    CAF123

    Gold Member

    Ok, so if I understand you correctly, I am using my expression for ##S'-S## in #13 and rewriting all instances of ##p_i## as ##1-p_j - \sum_{k \neq i,j} p_k## (to re-express ##p_i## in terms of ##p_j##). Then take the derivative wrt ##p_j##?
     
  21. Jan 26, 2015 #20

    TSny

    Homework Helper
    Gold Member

    You want to show that ##\delta S > 0## when ##p_j## and ##p_i## are moved closer to each other. The ##\delta## operation is essentially the same as taking a differential since we are considering ##\epsilon## as small.

    So, consider ##\delta \left ( -k \sum_{n=1}^r p_n \ln p_n \right )##. Only two of the p's vary: ##p_i## and ##p_j##. What do you get for ##\delta \left ( p_i \ln p_i \right )## and ##\delta \left ( p_j \ln p_j \right ) ## when ##\delta p_j = \epsilon## and ##\delta p_i = - \epsilon##?
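    (For reference, a sketch of where those differentials lead, with ##p_j < p_i## and ##\epsilon > 0##:
$$\delta(p\ln p) = (\ln p + 1)\,\delta p \quad\Rightarrow\quad \delta S = -k\left[(\ln p_j + 1)\,\epsilon - (\ln p_i + 1)\,\epsilon\right] = k\,\epsilon \ln\frac{p_i}{p_j} > 0,$$
    so moving the two probabilities closer by an infinitesimal amount always increases ##S##.)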
     