Information Geometry: Is there any there there?

  • Context: Graduate
  • Thread starter: Frabjous
  • Tags: Geometry, Information

Discussion Overview

The discussion revolves around the concept of information geometry, exploring its theoretical foundations, potential applications, and connections to other fields such as machine learning, biophysics, and quantum mechanics. Participants express varying levels of familiarity and competence regarding the topic, leading to a range of viewpoints on its significance and utility.

Discussion Character

  • Exploratory
  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • Some participants find information geometry to be an interesting blend of differential geometry and statistics, although they express uncertainty about their competence to evaluate its significance.
  • Others argue that while the theoretical aspects of information geometry are appealing, practical applications remain unclear, with some suggesting it has not produced novel results in computational neuroscience or machine learning.
  • A participant mentions John Baez's work, suggesting that information geometry could be crucial for addressing open questions in theoretical biophysics, including the nature of life.
  • Another participant references Leonard Susskind's lectures on quantum complexity and gravity, proposing that there are interesting connections between these topics and information geometry.
  • Some participants highlight the historical development of information geometry, noting its origins in the work of Japanese scientists, particularly Shun'ichi Amari.
  • A few participants share links to related papers, discussing phase transitions in high-dimensional geometry and their implications for data analysis and signal processing.

Areas of Agreement / Disagreement

Participants exhibit a mix of curiosity and skepticism regarding the practical applications of information geometry. While some express interest in its theoretical implications, others question its utility in real-world scenarios. No consensus is reached on its overall significance or effectiveness.

Contextual Notes

Participants note limitations in their understanding and the potential dependence on specific definitions and contexts. The discussion reflects a range of assumptions about the applicability of information geometry across different fields.

Frabjous
Recently came across this concept. It looks like a combination of dg and statistics. It sounds interesting, but I do not feel competent enough to make an informed decision.

For example see
https://arxiv.org/abs/1808.08271
 
caz said:
Recently came across this concept. It looks like a combination of dg and statistics. It sounds interesting, but I do not feel competent enough to make an informed decision.

For example see
https://arxiv.org/abs/1808.08271
It looks legit to me. The author claims to be with Sony (Sony Computer Science Laboratoties [sic]). It does appear to be a blend of differential geometry and statistics, as you said. Later in the paper the author mentions machine learning, signal processing, and game theory as applications of information geometry.
 
Mark44 said:
It looks legit to me.
I have read the Wikipedia entry on it and its link to statistical manifolds. It sounded interesting from a theoretical point of view. I am curious whether there are actual applications, or at least shortcuts that pure stochastics could not provide.
 
fresh_42 said:
I have read the Wikipedia entry on it and its link to statistical manifolds. It sounded interesting from a theoretical point of view. I am curious whether there are actual applications, or at least shortcuts that pure stochastics could not provide.

The intent of my question was more along these lines.
 
John Baez has written extensively on information geometry. This branch of mathematics seems to be an essential tool for solving many central open issues in theoretical biophysics, including answering the central question: what is life?
 
Auto-Didact said:
John Baez has written extensively on information geometry. This branch of mathematics seems to be an essential tool for solving many central open issues in theoretical biophysics, including answering the central question: what is life?

Thanks, I’ll check him out.
 
This seems to be related to a series of lectures given by Leonard Susskind at the Institute for Advanced Study.

The following is the first of 3 lectures.

The subject of these lectures is quantum complexity and gravity. Susskind prepared these lectures after reading several research papers on complexity theory. In them, he describes an approach to unifying quantum mechanics with gravity; complexity and gravity are deeply related.

The following are several ideas from the first lecture that I found interesting:

Quantifying complexity

Start with K qubits; the space of states is 2^K-dimensional.

What is the space of states of the K-qubit system? The manifold of normalized states, modding out by the U(1) phase, is CP(N) with N = 2^K.

CP(N) is compact; regularize it by ε-balls, i.e. balls of radius ε. Each ε-ball represents a discrete state. The number of ε-balls is the number of states #S, where #S grows rapidly with K and depends only weakly on 1/ε.


Distance between states


The distance between states A and B, d_AB = arccos |⟨A|B⟩|, is the standard inner-product metric.

The maximum distance between any two states is d = π/2. This metric does not capture the qualitative difference between states.
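This inner-product metric is easy to sketch numerically (a minimal numpy illustration of the distance formula, not code from the lectures):

```python
import numpy as np

def state_distance(a, b):
    """Inner-product distance d_AB = arccos |<A|B>| between normalized states."""
    overlap = abs(np.vdot(a, b))              # np.vdot conjugates the first argument
    # Clip guards against floating-point values slightly above 1 before arccos.
    return np.arccos(np.clip(overlap, 0.0, 1.0))

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# Orthogonal states sit at the maximum distance pi/2;
# states that differ only by a phase are at distance 0.
```

Orthogonal states come out at π/2 and phase-shifted copies of the same state at 0, matching the statement above that π/2 is the maximum distance.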


Relative complexity


A metric that counts the number of small steps from one state to another is the relative complexity.

We use a quantum circuit for this description. The gates are unitary operators. There is a universal collection of gates that can take any one state to any other. The minimum number n of such gates needed to go from A to B is the relative complexity C(A,B).

The relative complexity is a metrical notion of distance on CP(N). C(A,B) is realized by a geodesic, the shortest path between |A⟩ and |B⟩.

The conclusion is that maximally entangled states sit at a distance of at most π/2: the unitary space is small in the inner-product metric, but the relative complexity can be large.

The properties of relative complexity are analogous to the classical notion of distance:

1. C ≥ 0 (the number of gates is ≥ 0)

2. C(u,v) = 0 iff u = v

3. C(u,v) = C(v,u): C is a symmetric function of u and v

4. C(u,v) ≤ C(u,w) + C(w,v): the triangle inequality

C is right invariant but not left invariant: complexity gives a geometry on SU(2^K) that is right invariant.
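The definition of relative complexity as a minimal gate count can be made executable in a toy setting (my own sketch: one qubit, the fixed gate set {H, T}, and exact breadth-first search, whereas the lectures deal with K qubits and a universal gate set):

```python
import numpy as np
from collections import deque

H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)  # Hadamard gate
T = np.diag([1, np.exp(1j * np.pi / 4)])                     # T (pi/8) gate
GATES = [H, T]

def fs_distance(a, b):
    """Inner-product distance arccos |<A|B>| between normalized states."""
    return np.arccos(np.clip(abs(np.vdot(a, b)), 0.0, 1.0))

def relative_complexity(start, target, eps=1e-6, max_depth=8):
    """Breadth-first search for the minimum number of gates from GATES
    needed to carry |start> to within eps of |target>."""
    queue = deque([(np.asarray(start, dtype=complex), 0)])
    while queue:
        state, depth = queue.popleft()
        if fs_distance(state, target) < eps:
            return depth        # minimal gate count = relative complexity
        if depth < max_depth:
            for g in GATES:
                queue.append((g @ state, depth + 1))
    return None                 # not reached within max_depth gates
```

For example, |0⟩ has relative complexity 0 to itself and 1 to |+⟩ = H|0⟩. The brute-force search blows up quickly, which is exactly why complexity is studied geometrically rather than computed directly.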
 
Auto-Didact said:
John Baez has written extensively on information geometry. This branch of mathematics seems to be an essential tool for solving many central open issues in theoretical biophysics, including answering the central question: what is life?

I would point out that most people (at least in computational neuroscience and machine learning) view information geometry as completely useless in practice. It's a very beautiful formalisation of information theory and Fisher information etc. in terms of differential geometry, but one that has produced no practical results that weren't already known. I'd be very happy to be proven wrong on that though.

As a side note, information geometry was initially developed by a number of Japanese scientists, especially this guy https://en.wikipedia.org/wiki/Shun'ichi_Amari. For a long time it was only published in Japanese-language books and articles. And by the way, Amari was a computational neuroscientist, so it kind of developed out of that field in a way.
 
This paper may not be all that related, but you might find it interesting.

Observed Universality of Phase Transitions in High-Dimensional Geometry, with Implications for Modern Data Analysis and Signal Processing

We review connections between phase transitions in high-dimensional combinatorial geometry and phase transitions occurring in modern high-dimensional data analysis and signal processing. In data analysis, such transitions arise as abrupt breakdown of linear model selection, robust data fitting or compressed sensing reconstructions, when the complexity of the model or the number of outliers increases beyond a threshold. In combinatorial geometry these transitions appear as abrupt changes in the properties of face counts of convex polytopes when the dimensions are varied. The thresholds in these very different problems appear in the same critical locations after appropriate calibration of variables.
https://people.maths.ox.ac.uk/tanner/papers/DoTa_Universality.pdf
 
  • #10
Jarvis323 said:
This paper may not be all that related, but you might find it interesting. https://people.maths.ox.ac.uk/tanner/papers/DoTa_Universality.pdf

This reminds me of another result that I found interesting, buried in the supplemental material of this paper (https://www.nature.com/articles/s41586-020-2130-2) (equation 23 in the supplement). What they show is that the eigenvectors of the sample covariance matrix undergo a phase transition as the number of samples and the number of dimensions are varied.
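That eigenvector transition can be seen in a few lines of numpy (my own sketch, using an assumed rank-one "spiked" covariance model; the paper's setting is more general):

```python
import numpy as np

def top_eig_alignment(p, n, spike=2.0, seed=0):
    """Alignment |<v_hat, v>| between the leading sample-covariance
    eigenvector and the true spike direction v, for data drawn from a
    spiked model with covariance I + spike * v v^T (p dims, n samples)."""
    rng = np.random.default_rng(seed)
    v = np.zeros(p)
    v[0] = 1.0                                 # true spike direction
    X = rng.standard_normal((n, p))            # isotropic noise
    X += np.sqrt(spike) * rng.standard_normal((n, 1)) * v  # rank-one spike
    S = X.T @ X / n                            # sample covariance matrix
    _, U = np.linalg.eigh(S)                   # eigenvalues ascending
    return abs(U[:, -1] @ v)                   # leading eigenvector vs. v
```

With many more samples than dimensions the top eigenvector locks onto the spike (alignment near 1); with far fewer samples than dimensions the alignment collapses toward zero, which is the phase-transition behavior the post describes.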
 
  • #11
caz said:
Recently came across this concept. It looks like a combination of dg and statistics. It sounds interesting, but I do not feel competent enough to make an informed decision.

For example see
https://arxiv.org/abs/1808.08271

The book 'Information Geometry' by Ay et al. expands on some applications (see the image attached).

There is also a fascinating quantum version of information geometry. As a teaser, consider this. The state space of a quantum system with finitely many states is \mathbb{C}P^n = \mathbb{S}^{2n+1}/\mathbb{S}^1: the vectors of unit length (total probability = 1) in the Hilbert space, modulo \mathbb{S}^1 (states that differ only by a phase factor are physically indistinguishable). On \mathbb{C}P^n there is a natural metric that comes from the round metric on \mathbb{S}^{2n+1}, the so-called Fubini-Study metric. Geometrically, \mathbb{C}P^n is relatively simple to describe: it is the product of the standard n-dimensional probability simplex \Delta_n with an n-torus, with the edges identified in some complicated manner. But the fact remains that on an open dense subset, \mathbb{C}P^n is just the product \mathrm{int}(\Delta_n)\times \mathbb{T}^n, and the Fubini-Study metric with respect to this factorization is the product of the Fisher metric (!) on \Delta_n times a metric on \mathbb{T}^n that varies with where we are on \Delta_n.

Source: 'Geometry of quantum states: An Introduction to Quantum Entanglement'. Bengtsson et al. (2017). Also Gromov's meditations on Entropy: https://www.ihes.fr/~gromov/expository/579/
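The appearance of the Fisher metric can be checked numerically. A small sketch of my own, using the standard fact that p ↦ 2√p maps the simplex isometrically onto a piece of the round sphere of radius 2, with the Fisher metric pulling back from the Euclidean one:

```python
import numpy as np

def fisher_inner(p, u, w):
    """Fisher information metric on the probability simplex:
    g_p(u, w) = sum_i u_i w_i / p_i, for tangent u, w (entries sum to 0)."""
    return float(np.sum(u * w / p))

def sqrt_map_pushforward(p, u):
    """Differential of phi(p) = 2*sqrt(p), which sends the simplex
    onto part of the radius-2 sphere in Euclidean space."""
    return u / np.sqrt(p)

p = np.array([0.5, 0.3, 0.2])          # a point on the simplex
u = np.array([0.10, -0.05, -0.05])     # tangent vectors (entries sum to 0)
w = np.array([0.00, 0.02, -0.02])
```

The point 2√p lands on the sphere of radius 2 (since the entries of p sum to 1), and the Euclidean inner product of the pushed-forward tangent vectors reproduces the Fisher inner product exactly.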
 

Attachments

  • ch6.png (76.3 KB)
  • #12
caz said:
Recently came across this concept. It looks like a combination of dg and statistics. It sounds interesting, but I do not feel competent enough to make an informed decision.

For example see
https://arxiv.org/abs/1808.08271

In machine learning they try to minimize functions on parameter space. They do this by the method of gradient descent. Information geometry comes in and says

"You guys are doing gradient descent but you're wrongly assuming that the parameter space is flat. It is not, it carries a natural nonplanar shape given by (the pullback of) the Fisher metric and when you take this into account your gradient descent method works better."

Unfortunately, computing the Fisher metric is computationally too expensive for the large parameter spaces involved in neural networks, so they largely ignore it lol.

Source:
https://towardsdatascience.com/natural-gradient-ce454b3dcdfa
https://www.mitpressjournals.org/doi/10.1162/089976698300017746?mobileUi=0&
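As an illustration of the idea (a toy one-parameter sketch of my own, fitting a Bernoulli model; nothing like the neural-network setting the post describes), the natural gradient preconditions the ordinary gradient by the inverse Fisher information:

```python
import numpy as np

def grad_loglik(theta, data):
    """Gradient of the average Bernoulli log-likelihood at theta."""
    return np.mean(data / theta - (1 - data) / (1 - theta))

def fisher_info(theta):
    """Fisher information of the Bernoulli(theta) family."""
    return 1.0 / (theta * (1.0 - theta))

def fit(data, theta0=0.1, lr=0.1, steps=200, natural=False):
    """Gradient ascent on the log-likelihood; with natural=True the
    gradient is preconditioned by the inverse Fisher information."""
    theta = theta0
    for _ in range(steps):
        g = grad_loglik(theta, data)
        if natural:
            g /= fisher_info(theta)   # natural-gradient step
        theta = float(np.clip(theta + lr * g, 1e-6, 1 - 1e-6))
    return theta

# Both variants converge to the sample mean (the MLE); the natural
# gradient step works out to lr * (mean - theta), i.e. uniform progress
# in distribution space rather than in raw parameter space.
```

In this one-dimensional example both versions reach the same answer; the point of the natural gradient is that its step sizes are measured by the Fisher metric, so it behaves the same under reparametrization, which plain gradient ascent does not.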
 
  • #13
caz said:
Recently came across this concept. It looks like a combination of dg and statistics. It sounds interesting, but I do not feel competent enough to make an informed decision.

For example see
https://arxiv.org/abs/1808.08271

The book 'Information Geometry and Its Applications' by Amari (the main pioneer of the subject) has a 130-page section on applications. There is a freely downloadable table of contents here:
https://www.springer.com/gp/book/9784431559771
 
  • #14
madness said:
I would point out that most people (at least in computational neuroscience and machine learning) view information geometry as completely useless in practice. It's a very beautiful formalisation of information theory and Fisher information etc. in terms of differential geometry, but one that has produced no practical results that weren't already known. I'd be very happy to be proven wrong on that though.

As a side note, information geometry was initially developed by a number of Japanese scientists, especially this guy https://en.wikipedia.org/wiki/Shun'ichi_Amari. For a long time it was only published in Japanese-language books and articles. And by the way, Amari was a computational neuroscientist, so it kind of developed out of that field in a way.
I am no expert on the literature here, but that is what I've also heard from differential geometers. In other words, it mostly involves "dressing up" things with differential geometry without really providing any new results or insights.
 
