Comparing Kullback-Leibler divergence values

nigels
I’m currently evaluating the "realism" of two survival models in R by comparing the Kullback-Leibler divergence between each model’s simulated survival-time dataset (`dat.s1` and `dat.s2`) and a “true”, observed survival-time dataset (`dat.obs`). At first, the directed KLD values indicate that `dat.s2` is the better match to the observed data:

> library(LaplacesDemon)
> KLD(dat.s1, dat.obs)$sum.KLD.py.px
[1] 1.17196
> KLD(dat.s2, dat.obs)$sum.KLD.py.px
[1] 0.8827712

However, when I visualize the densities of all three datasets, it seems quite clear that `dat.s1` (green) aligns better with the observed data:

> plot(density(dat.obs), lwd=3, ylim=c(0,0.9))
> lines(density(dat.s1), col='green')
> lines(density(dat.s2), col='purple')
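(A quick aside, not part of the original code: a legend makes the colour coding explicit. The call below assumes the observed density is drawn with the default black line.)

> legend('topright', legend=c('dat.obs', 'dat.s1', 'dat.s2'),
+        col=c('black', 'green', 'purple'), lwd=c(3, 1, 1))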

What is causing this discrepancy? Am I applying the KLD incorrectly because of some conceptual misunderstanding?
 

Attachments

  • KL_nonsense_sof.png (density plot of dat.obs, dat.s1, and dat.s2)
Keep in mind that the KL divergence is asymmetric (non-commutative in its arguments), and the two "orders" correspond to different objective functions (and different research questions). The direction you're using (that is, KL(Q||P), where Q is being fit to P) tries to match regions of high density, and it does appear that the highest probability mass in your "worse-fitting" model coincides with the highest probability mass in your target better than that of the "better-fitting" model does. There's a fairly good discussion related to this here (a small numerical illustration of the asymmetry follows the links):

https://stats.stackexchange.com/questions/188903/intuition-on-the-kullback-leibler-kl-divergence
and
http://timvieira.github.io/blog/post/2014/10/06/kl-divergence-as-an-objective-function/
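
As a self-contained toy illustration of that asymmetry (my own made-up numbers, not anything from your data), the two orders returned by LaplacesDemon generally differ:

library(LaplacesDemon)
p <- c(0.6, 0.3, 0.1)    # stand-in for the "true" distribution
q <- c(0.2, 0.5, 0.3)    # stand-in for a model's distribution
KLD(p, q)$sum.KLD.px.py  # KL(p || q)
KLD(p, q)$sum.KLD.py.px  # KL(q || p), generally a different number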

The other direction may actually be closer to what you're interested in:
KLD(dat.obs, dat.s1)$sum.KLD.py.px
KLD(dat.obs, dat.s2)$sum.KLD.py.px
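
If it helps, both orders come out of a single KLD() call, so you can tabulate them side by side. This is just a sketch, assuming I'm reading the sum.KLD.px.py / sum.KLD.py.px naming in LaplacesDemon correctly:

library(LaplacesDemon)
sapply(list(s1 = dat.s1, s2 = dat.s2), function(sim) {
  k <- KLD(sim, dat.obs)             # px = simulated data, py = observed data
  c(KL.obs.sim = k$sum.KLD.py.px,    # the order used in your original post
    KL.sim.obs = k$sum.KLD.px.py)    # the reversed order suggested above
})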
 