Isosceles triangle in information theory

noowutah · Jun 8, 2015

In Euclidean geometry (presumably also in non-Euclidean geometry), the segment of the vertex-angle bisector that lies inside an isosceles triangle is shorter than the legs; the same holds for any segment from the apex to a point strictly inside the base. Let ABC be an isosceles triangle with AB being the base. Then, for 0<\lambda<1,

d(C,\lambda{}A+(1-\lambda)B)<d(C,A)=d(C,B)

d is the Euclidean distance (taking a_{i} to be the coordinates of A in \mathbb{R}^{n})

d(A,B)=\sqrt{\sum_{i=1}^{n}(a_{i}-b_{i})^{2}}
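As a quick sanity check of the Euclidean statement, here is a minimal numeric sketch. The triangle coordinates and the helper names `euclid` and `mix` are made up for illustration, and the distance is the usual square root of the summed squared coordinate differences.

```python
import math

def euclid(p, q):
    """Euclidean distance: square root of the summed squared coordinate differences."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

def mix(lam, p, q):
    """Point lam*p + (1-lam)*q on the segment between p and q."""
    return [lam * pi + (1 - lam) * qi for pi, qi in zip(p, q)]

# Hypothetical isosceles triangle: base AB on the x-axis, apex C above its midpoint.
A, B, C = [-1.0, 0.0], [1.0, 0.0], [0.0, 2.0]

leg = euclid(C, A)                      # equals euclid(C, B) by symmetry
lams = [k / 10 for k in range(1, 10)]   # lambda = 0.1, ..., 0.9
dists = [euclid(C, mix(lam, A, B)) for lam in lams]

# Every interior point of the base should be closer to C than the endpoints are.
print(all(d < leg for d in dists))
```

This only checks one triangle on a grid of \lambda values, so it illustrates the claim rather than proving it.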

I want to show that this is also true if our notion of distance is the Kullback-Leibler divergence from information theory. So, let A, B, C be points in n-dimensional space (with positive coordinates, e.g. probability distributions) with

D_{KL}(C,A)=D_{KL}(C,B)

where

D_{KL}(X,Y)=\sum_{i=1}^{n}x_{i}\ln\frac{x_{i}}{y_{i}}

Let F be a point between A and B in the sense that

F=\lambda{}A+(1-\lambda)B,0<\lambda<1

Then I want to prove that

D_{KL}(C,F)<D_{KL}(C,A)=D_{KL}(C,B)
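Before looking for a proof, the conjecture can be tested numerically. The sketch below is only an illustration under assumptions: the three distributions are hypothetical, chosen so that B is A with its first two coordinates swapped and C is symmetric under that swap, which forces D_{KL}(C,A)=D_{KL}(C,B). The helper names `kl` and `mix` are likewise invented here.

```python
import math

def kl(x, y):
    """D_KL(X,Y) = sum_i x_i * ln(x_i / y_i), assuming strictly positive coordinates."""
    return sum(xi * math.log(xi / yi) for xi, yi in zip(x, y))

def mix(lam, p, q):
    """Point lam*p + (1-lam)*q between p and q."""
    return [lam * pi + (1 - lam) * qi for pi, qi in zip(p, q)]

# Hypothetical "isosceles" configuration (all three are probability distributions).
A = [0.2, 0.3, 0.5]
B = [0.3, 0.2, 0.5]   # A with the first two coordinates swapped
C = [0.4, 0.4, 0.2]   # symmetric under that swap, so kl(C, A) == kl(C, B)

leg = kl(C, A)
lams = [k / 10 for k in range(1, 10)]   # lambda = 0.1, ..., 0.9
divs = [kl(C, mix(lam, A, B)) for lam in lams]

print(abs(kl(C, A) - kl(C, B)) < 1e-12)   # the two "legs" agree
print(all(d < leg for d in divs))         # every interior F is strictly closer to C
```

A single example cannot establish the inequality, but a counterexample found this way would refute it.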

Two points that may be helpful are (1) the Gibbs inequality (\sum_{i}p_{i}\ln{}p_{i}\geq\sum_{i}p_{i}\ln{}q_{i}); and (2) the convexity of the logarithm (\ln(\lambda{}x+(1-\lambda)y)<\lambda\ln{}x+(1-\lambda)\ln{}y), but I haven't been able to get anywhere. I'd love some help.

wabbit · Jun 8, 2015

Actually ~~I think the opposite is true~~, i.e. by the concavity of the logarithm

$$ \ln( \lambda y_i +(1-\lambda)z_i) > \lambda \ln y_i +(1-\lambda)\ln z_i $$

and using

$$ D_{KL}(X,Y)=\sum_{i=1}^{n}x_{i}\ln\frac{x_{i}}{y_{i}}=\sum_{i=1}^{n}\left(x_{i}\ln x_{i}-x_{i}\ln y_{i}\right) $$

and similarly for ##D_{KL}(X,Z) ## and ## D_{KL}(X,\lambda Y+(1-\lambda)Z) ## you get

$$ D_{KL}(X,\lambda Y+(1-\lambda)Z)<\lambda D_{KL}(X,Y)+(1-\lambda)D_{KL}(X,Z) $$

Edit: corrected, thanks @stlukits, indeed the log is concave, not convex - don't know what I was thinking.

noowutah · Jun 8, 2015

Yes, good point. The natural logarithm is actually concave -- my bad -- so

\ln(\lambda{}x+(1-\lambda)y)\geq\lambda\ln{}x+(1-\lambda)\ln{}y

which, if wabbit were right, would give us the result I need. Following wabbit, however, I only get

D_{KL}(Z,\lambda{}X+(1-\lambda)Y)=\sum_{i=1}^{n}z_{i}(\ln{}z_{i}-\ln(\lambda{}x_{i}+(1-\lambda)y_{i}))\leq\sum_{i=1}^{n}z_{i}\ln\frac{z_{i}}{x_{i}^{\lambda}y_{i}^{1-\lambda}}

but that's not less than or equal to

\sum_{i=1}^{n}z_{i}\ln\frac{z_{i}}{\lambda{}x_{i}+(1-\lambda)y_{i}}=\lambda{}D_{KL}(Z,X)+(1-\lambda)D_{KL}(Z,Y)

So we are close, but not quite there. Thank you, wabbit, for framing the question nicely -- is the Kullback-Leibler divergence convex if you hold the point from which you measure the divergence fixed, i.e.

D_{KL}(Z,\lambda{}X+(1-\lambda)Y)\stackrel{\mbox{?}}{\leq}\lambda{}D_{KL}(Z,X)+(1-\lambda)D_{KL}(Z,Y)

wabbit · Jun 8, 2015

Thanks for the correction about concavity - other than that I don't see what the problem is, the inequality follows directly from the concavity as mentioned above.
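For readers following along, the step can be written out term by term. This is a sketch in the notation of the question (Z held fixed, X and Y mixed), assuming ##z_i \geq 0## and ##x_i, y_i > 0##:

$$ \begin{aligned} D_{KL}(Z,\lambda X+(1-\lambda)Y) &= \sum_{i=1}^{n} z_i\ln z_i-\sum_{i=1}^{n} z_i\ln(\lambda x_i+(1-\lambda)y_i) \\ &\leq \sum_{i=1}^{n} z_i\ln z_i-\sum_{i=1}^{n} z_i\left(\lambda\ln x_i+(1-\lambda)\ln y_i\right) \\ &= \lambda\sum_{i=1}^{n} z_i\ln\frac{z_i}{x_i}+(1-\lambda)\sum_{i=1}^{n} z_i\ln\frac{z_i}{y_i} \\ &= \lambda D_{KL}(Z,X)+(1-\lambda)D_{KL}(Z,Y) \end{aligned} $$

The second line is the same quantity as ##\sum_{i} z_i\ln\frac{z_i}{x_i^{\lambda}y_i^{1-\lambda}}##, so the geometric-mean bound already equals the convex combination of the two divergences. The inequality is strict whenever ##x_i \neq y_i## for some ##i## with ##z_i > 0##, and in the isosceles case ##D_{KL}(Z,X)=D_{KL}(Z,Y)## the right-hand side reduces to ##D_{KL}(Z,X)##.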

noowutah · Jun 8, 2015

Thank you, wabbit. There are more formal proofs of the convexity of the Kullback-Leibler divergence here:

http://homes.cs.washington.edu/~anuprao/pubs/CSE533Autumn2010/lecture3.pdf

and here:

http://www.renyi.hu/~csiszar/Publications/Information_Theory_and_Statistics:_A_Tutorial.pdf

Great help! Issue solved.
