
Isosceles triangle in information theory

  1. Jun 8, 2015 #1
    In Euclidean geometry (presumably also in non-Euclidean geometry), the segment from the apex of an isosceles triangle to any point strictly between the endpoints of the base -- in particular the part of the bisector of the vertex angle that lies inside the triangle -- is shorter than the legs. Let ABC be an isosceles triangle with AB being the base. Then, for [itex]0<\lambda<1[/itex],

    [tex]d(C,\lambda{}A+(1-\lambda)B)<d(C,A)=d(C,B)[/tex]

    [itex]d[/itex] is the Euclidean distance measure (taking [itex]a_{i}[/itex] to be the coordinates of A in [itex]\mathbb{R}^{n}[/itex])

    [tex]d(A,B)=\sqrt{\sum_{i=1}^{n}(a_{i}-b_{i})^{2}}[/tex]

    I want to show that this is also true if our notion of distance is the Kullback-Leibler divergence from information theory. So, let A, B, C be points in n-dimensional space with strictly positive coordinates (so the logarithms below are defined) and with

    [tex]D_{KL}(C,A)=D_{KL}(C,B)[/tex]

    where

    [tex]D_{KL}(X,Y)=\sum_{i=1}^{n}x_{i}\ln\frac{x_{i}}{y_{i}}[/tex]

    Let F be a point between A and B in the sense that

    [tex]F=\lambda{}A+(1-\lambda)B,0<\lambda<1[/tex]

    Then I want to prove that

    [tex]D_{KL}(C,F)<D_{KL}(C,A)=D_{KL}(C,B)[/tex]

    Two points that may be helpful are (1) the Gibbs inequality ([itex]\sum_{i}p_{i}\ln{}p_{i}\geq\sum_{i}p_{i}\ln{}q_{i}[/itex] for probability vectors [itex]p,q[/itex]); and (2) the convexity of the logarithm ([itex]\ln(\lambda{}x+(1-\lambda)y)<\lambda\ln{}x+(1-\lambda)\ln{}y[/itex]). But I haven't been able to get anywhere. I'd love some help.
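
    For what it's worth, here is a rough numerical sanity check of the conjecture (just a sketch with a made-up example, not a proof). To force [itex]D_{KL}(C,A)=D_{KL}(C,B)[/itex] exactly, I take C uniform and B a permutation of A; with C uniform, [itex]D_{KL}(C,\cdot)[/itex] depends only on the multiset of coordinates of its second argument.

    [code]
    # Rough numerical sanity check of the conjecture (not a proof).
    import numpy as np

    rng = np.random.default_rng(0)

    def kl(p, q):
        # D_KL(p, q) = sum_i p_i ln(p_i / q_i), as defined above
        return float(np.sum(p * np.log(p / q)))

    n = 5
    A = rng.random(n)
    A /= A.sum()              # random probability vector
    B = rng.permutation(A)    # same entries, reordered, so D_KL(C,A) = D_KL(C,B)
    C = np.full(n, 1.0 / n)   # uniform reference point

    for lam in (0.1, 0.3, 0.5, 0.7, 0.9):
        F = lam * A + (1 - lam) * B
        print(lam, kl(C, F), kl(C, A), kl(C, B))
    [/code]

    If the conjecture is right, the first divergence printed in each row should never exceed the other two.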
     
  3. Jun 8, 2015 #2

    wabbit

    Gold Member

    Actually I think the opposite of your point (2) is true: by the concavity of the logarithm $$ \ln( \lambda y_i +(1-\lambda)z_i) \geq \lambda \ln y_i +(1-\lambda)\ln z_i. $$ Using $$ D_{KL}(X,Y)=\sum_{i=1}^{n}x_{i}\ln\frac{x_{i}}{y_{i}}=\sum_{i=1}^{n}\left(x_{i}\ln x_{i}-x_{i}\ln y_{i}\right) $$ and the analogous expressions for ##D_{KL}(X,Z)## and ##D_{KL}(X,\lambda Y+(1-\lambda)Z)##, you get $$ D_{KL}(X,\lambda Y+(1-\lambda)Z)\leq\lambda D_{KL}(X,Y)+(1-\lambda)D_{KL}(X,Z), $$ with strict inequality whenever ##Y## and ##Z## differ in a coordinate where ##X## is nonzero.
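
    In case it helps, spelled out: multiply the concavity inequality by ##x_i \geq 0## and sum over ##i##, $$ \sum_{i=1}^{n}x_{i}\ln(\lambda y_{i}+(1-\lambda)z_{i})\geq\lambda\sum_{i=1}^{n}x_{i}\ln y_{i}+(1-\lambda)\sum_{i=1}^{n}x_{i}\ln z_{i}, $$ then subtract both sides from ##\sum_{i}x_{i}\ln x_{i}## (which flips the inequality) to obtain the convexity bound above.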

    Edit: corrected, thanks @stlukits -- indeed the log is concave, not convex; dunno what I was thinking.
     
    Last edited: Jun 8, 2015
  4. Jun 8, 2015 #3
    Yes, good point. The natural logarithm is actually concave -- my bad -- so

    [tex]\ln(\lambda{}x+(1-\lambda)y)\geq\lambda\ln{}x+(1-\lambda)\ln{}y[/tex]

    which, if wabbit's conclusion holds, would give me the result I need. Following wabbit, however, I only get

    [tex]D_{KL}(Z,\lambda{}X+(1-\lambda)Y)=\sum_{i=1}^{n}z_{i}(\ln{}z_{i}-\ln(\lambda{}x_{i}+(1-\lambda)y_{i}))\leq\sum_{i=1}^{n}z_{i}\ln\frac{z_{i}}{x_{i}^{\lambda}y_{i}^{1-\lambda}}[/tex]

    but I don't see how to compare that right-hand side with

    [tex]\lambda{}D_{KL}(Z,X)+(1-\lambda)D_{KL}(Z,Y)[/tex]

    So we are close, but not quite there. Thank you, wabbit, for framing the question nicely: is the Kullback-Leibler divergence convex in its second argument, i.e. holding fixed the point from which the divergence is measured, do we have

    [tex]D_{KL}(Z,\lambda{}X+(1-\lambda)Y)\stackrel{\mbox{???}}{\leq}\lambda{}D_{KL}(Z,X)+(1-\lambda)D_{KL}(Z,Y)[/tex]
     
  5. Jun 8, 2015 #4

    wabbit

    Gold Member

    Thanks for the correction about concavity -- other than that, I don't see what the problem is: the inequality follows directly from the concavity, as mentioned above.
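
    To spell it out, the bound you derived is already the convex combination you want, because $$ \lambda D_{KL}(Z,X)+(1-\lambda)D_{KL}(Z,Y)=\sum_{i=1}^{n}z_{i}\left(\lambda\ln\frac{z_{i}}{x_{i}}+(1-\lambda)\ln\frac{z_{i}}{y_{i}}\right)=\sum_{i=1}^{n}z_{i}\ln\frac{z_{i}}{x_{i}^{\lambda}y_{i}^{1-\lambda}}. $$ So ##D_{KL}(Z,\lambda X+(1-\lambda)Y)\leq\lambda D_{KL}(Z,X)+(1-\lambda)D_{KL}(Z,Y)##, and together with ##D_{KL}(C,A)=D_{KL}(C,B)## this gives ##D_{KL}(C,F)\leq D_{KL}(C,A)##, with strict inequality as long as A and B differ in some coordinate where C is nonzero.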
     