Isosceles triangle in information theory

noowutah
Messages
56
Reaction score
3
In Euclidean geometry (presumably also in non-Euclidean geometry), the part of the dissecting line that dissects the vertex angle and is inside the isosceles triangle is shorter than the legs of the isosceles triangle. Let ABC be an isosceles triangle with AB being the base. Then, for 0<\lambda<1,

d(C,\lambda{}A+(1-\lambda)B)<d(C,A)=d(C,B)

d is the Euclidean distance measure (taking a_{i} to be the coordinates of A in \mathbb{R}^{n})

d(A,B)=\sum_{i=1}^{n}\sqrt{(a_{i}-b_{i})^{2}}

I want to show that this is also true if our notion of distance is the Kullback-Leibler divergence from information theory. So, let A, B, C be points in n-dimensional space with

D_{KL}(C,A)=D_{KL}(C,B)

where

D_{KL}(X,Y)=\sum_{i=1}^{n}x_{i}\ln\frac{x_{i}}{y_{i}}

Let F be a point between A and B in the sense that

F=\lambda{}A+(1-\lambda)B,0<\lambda<1

Then I want to prove that

D_{KL}(C,F)<D_{KL}(C,A)=D_{KL}(C,B)

Two points that may be helpful are (1) the Gibbs inequality (p\ln{}p<p\ln{}q); and (2) the convexity of the logarithm (\ln(\lambda{}x+(1-\lambda)y)<\lambda\ln{}x+(1-\lambda)\ln{}y), but I haven't been able to get anywhere. I'd love some help.
 
Mathematics news on Phys.org
Actually I think the opposite is true, i.e. by the concavity of the logarithm $$ \ln( \lambda y_i +(1-\lambda)z_i) > \lambda \ln y_i +(1-\lambda)\ln z_i $$ and using $$ D_{KL}(X,Y)=\sum_{i=1}^{n}x_{i}\ln\frac{x_{i}}{y_{i}}=\sum_{i=1}^{n}x_{i}\ln x_{i}-x_{i}\ln y_{i} $$ and similarly for ##D_{KL}(X,Z) ## and ## D_{KL}(X,\lambda Y+(1-\lambda)Z) ## you get $$ D_{KL}(X,\lambda Y+(1-\lambda)Z)<\lambda D_{KL}(X,Y)+(1-\lambda)D_{KL}(X,Z) $$

Edit : corrected, thanks @stlukits, indeed the log is concave, not convex - don't know what I was thinking.
 
Last edited:
  • Like
Likes noowutah
Yes, good point. The natural logarithm is actually concave -- my bad -- so

\ln(\lambda{}x+(1-\lambda)y)\geq\lambda\ln{}x+(1-\lambda)\ln{}y

which, if wabbit were right, would give us the result I need. Following wabbit, however, I only get

D_{KL}(Z,\lambda{}X+(1-\lambda)Y)=\sum_{i=1}^{n}z_{i}(\ln{}z_{i}-\ln(\lambda{}x_{i}+(1-\lambda)y_{i}))\leq\sum_{i=1}^{n}z_{i}\ln\frac{z_{i}}{x_{i}^{\lambda}y_{i}^{1-\lambda}}

but that's not smaller or equal than

\sum_{i=1}^{n}z_{i}\ln\frac{z_{i}}{\lambda{}x_{i}+(1-\lambda)y_{i}}=\lambda{}D_{KL}(Z,X)+(1-\lambda)D_{KL}(Z,Y)

So we are close, but not quite there. Thank you, wabbit, for framing the question nicely -- is the Kullback-Leibler divergence convex if you hold the point from which you measure the divergence fixed, i.e.

D_{KL}(Z,\lambda{}X+(1-\lambda)Y)\stackrel{\mbox{?}}{\leq}\lambda{}D_{KL}(Z,X)+(1-\lambda)D_{KL}(Z,Y)
 
Thanks for the correction about concavity - other than that I don't see what's the problem, the inequality follows directly from the concavity as mentionned above.
 
  • Like
Likes noowutah
Insights auto threads is broken atm, so I'm manually creating these for new Insight articles. In Dirac’s Principles of Quantum Mechanics published in 1930 he introduced a “convenient notation” he referred to as a “delta function” which he treated as a continuum analog to the discrete Kronecker delta. The Kronecker delta is simply the indexed components of the identity operator in matrix algebra Source: https://www.physicsforums.com/insights/what-exactly-is-diracs-delta-function/ by...
Fermat's Last Theorem has long been one of the most famous mathematical problems, and is now one of the most famous theorems. It simply states that the equation $$ a^n+b^n=c^n $$ has no solutions with positive integers if ##n>2.## It was named after Pierre de Fermat (1607-1665). The problem itself stems from the book Arithmetica by Diophantus of Alexandria. It gained popularity because Fermat noted in his copy "Cubum autem in duos cubos, aut quadratoquadratum in duos quadratoquadratos, et...
I'm interested to know whether the equation $$1 = 2 - \frac{1}{2 - \frac{1}{2 - \cdots}}$$ is true or not. It can be shown easily that if the continued fraction converges, it cannot converge to anything else than 1. It seems that if the continued fraction converges, the convergence is very slow. The apparent slowness of the convergence makes it difficult to estimate the presence of true convergence numerically. At the moment I don't know whether this converges or not.

Similar threads

Replies
1
Views
997
Replies
3
Views
1K
Replies
2
Views
2K
Replies
7
Views
3K
Replies
1
Views
1K
Replies
3
Views
1K
Replies
2
Views
5K
Back
Top