Riemannian Fisher-Rao metric and orthogonal parameter space

In summary: The upper half plane can be identified with the set of positive-definite matrices. Via this identification, the family of normal distributions can be viewed as a Riemannian submanifold of the manifold of positive-definite matrices, and this generalizes to other distributions. A statistical model is a map from an open subset of a manifold into the set of probability measures on a finite set. The Fisher metric, which is defined using the Fisher information matrix, is a Riemannian metric on the manifold of nowhere-vanishing signed measures. This metric can be diagonalized at any single point by choosing an appropriate coordinate system. However, not all metrics can be diagonalized throughout a neighborhood, and the Fisher metric has a special property
  • #1
Vini
TL;DR Summary
A silly question on off-diagonal elements of the Fisher-Rao metric
Let ## \mathcal{S} ## be a family of probability distributions ## \mathcal{P}_{\theta} ## of a random variable ## \beta ## that is smoothly parametrized by a finite number of real parameters, i.e.,
## \mathcal{S}=\left\{\mathcal{P}_{\theta}=w(\beta;\theta);\ \theta=(\theta^{i}) \in \mathbb{R}^{n}\right\} ##. The statistical model ## \mathcal{S} ## carries the structure of a smooth Riemannian manifold ## \mathcal{M} ##, on which ## \theta=(\theta^{i}) ## play the role of coordinates of a point ## \mathcal{P}_{\theta}\in \mathcal{S} ##, and whose metric is defined by Fisher's information matrix ## \mbox{H}=(g_{ij}(\theta)) ##. The coefficients of this matrix, which yields a positive-definite metric, are calculated as the expectation of a product of partial derivatives of the logarithm of the probability density function (PDF), or equivalently as minus the expectation of its second partial derivatives:

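## g_{ij}(\theta)=\int^{+\infty}_{-\infty} \frac{\partial \ln w(\beta;\theta)}{\partial \theta^{i}}\frac{\partial \ln w(\beta;\theta)}{\partial \theta^{j}}\,w(\beta;\theta)\,d\beta=-\int^{+\infty}_{-\infty} \frac{\partial^{2}\ln w(\beta;\theta)}{\partial \theta^{i} \partial \theta^{j}}\,w(\beta;\theta)\,d\beta ##.

How do we neglect the off-diagonal terms ## g_{12}= g_{21} ##?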

In other words, is there a mathematical argument, wherein it is possible to consider ## g_{12}=g_{21}=0 ## ?
 
  • #2
Hello Vini,

First let me reformulate the general setup that you explained in the way that I understand it.

I will focus on the case where [itex]\Omega[/itex] is finite because for the general case, although the idea is the same, the technical details are far more subtle. A statistical model for some random variable on [itex]\Omega[/itex] is a map [itex]p[/itex] from some open subset [itex]U[/itex] of [itex]\mathbb{R}^n[/itex] (or more generally, some manifold!) into the set [itex]\mathcal{P}=\mathcal{P}(\Omega)[/itex] of all probability measures on [itex]\Omega[/itex]. The set [itex]\mathcal{P}[/itex] itself sits inside the set [itex]\mathcal{S}[/itex] of all signed measures on [itex]\Omega[/itex], which (unlike [itex]\mathcal{P}[/itex]) is a vector space. Thus, as a manifold, its tangent space at any point is naturally identified with [itex]\mathcal{S}[/itex] itself. Now, let us restrict our attention to the open submanifold [itex]\mathcal{S}^{\circ}[/itex] of all nowhere-vanishing signed measures. Here, for every integer [itex]k[/itex] there is a canonical covariant [itex]k[/itex]-tensor field given by integration (of the product of the Radon-Nikodym derivatives):
$$
T_{\mu}\mathcal{S}^{\circ}\times \cdots \times T_{\mu}\mathcal{S}^{\circ}\rightarrow \mathbb{R}: (\sigma_1,\ldots, \sigma_k)\mapsto \int_{\Omega}\frac{d\sigma_1}{d\mu}\cdots \frac{d\sigma_k}{d\mu}\, d\mu.
$$
In particular, for [itex]k=2[/itex], this is a Riemannian metric called the Fisher metric, and when you pull it back through [itex]p[/itex] you get, up to an integration by parts, the Fisher metric on [itex]U[/itex] that you wrote down. The only source I know of to properly learn about this in the [itex]|\Omega|=\infty[/itex] case is the book Information Geometry (2017) by Ay, Jost, Lê, and Schwachhöfer.
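
For finite [itex]\Omega[/itex] the pullback can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: measures are plain lists indexed by the points of [itex]\Omega[/itex], the Radon-Nikodym derivative is the pointwise ratio, the derivatives of [itex]p[/itex] are taken by central differences, and the names `fisher_form`, `pullback_metric` and `categorical` are made up for this sketch.

```python
def fisher_form(mu, sigma1, sigma2):
    """k = 2 tensor at mu: sum over Omega of (dsigma1/dmu)(dsigma2/dmu) mu."""
    return sum((s1 / m) * (s2 / m) * m for s1, s2, m in zip(sigma1, sigma2, mu))


def pullback_metric(p, theta, h=1e-6):
    """Pull the form back through p, using central differences for dp/dtheta^i."""
    mu = p(theta)
    n = len(theta)
    tangents = []
    for i in range(n):
        up, down = list(theta), list(theta)
        up[i] += h
        down[i] -= h
        tangents.append([(a - b) / (2 * h) for a, b in zip(p(up), p(down))])
    return [[fisher_form(mu, tangents[i], tangents[j]) for j in range(n)]
            for i in range(n)]


def categorical(theta):
    """Hypothetical model on a three-point Omega: (t1, t2, 1 - t1 - t2)."""
    t1, t2 = theta
    return [t1, t2, 1.0 - t1 - t2]


g = pullback_metric(categorical, [0.2, 0.3])
print(g)
```

Analytically this model has [itex]g_{11}=1/\theta^1+1/\theta^3[/itex], [itex]g_{22}=1/\theta^2+1/\theta^3[/itex] and [itex]g_{12}=1/\theta^3[/itex] with [itex]\theta^3=1-\theta^1-\theta^2[/itex], so the off-diagonal term does not vanish in these coordinates.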

So back to your question, which actually has nothing to do with the specifics of how the Fisher metric arises: around every point [itex]x[/itex] of a Riemannian manifold [itex](U,g)[/itex] there is a coordinate system in which the metric is diagonal at [itex]x[/itex]. This is easy to see once you know that the exponential map is a local diffeomorphism: just pick an orthonormal basis [itex](V_1,\ldots, V_m)[/itex] of [itex]T_xU[/itex] and define coordinates [itex](u^1,\ldots,u^m)[/itex] through the map [itex](u^1,\ldots,u^m)\mapsto \mathrm{exp}_x(\sum_i u^iV_i)[/itex]. In these coordinates [itex]g_{ij}(x)=\delta_{ij}[/itex].
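
At the single point [itex]x[/itex] this is just linear algebra, which the following Python sketch tries to make concrete. It assumes a hypothetical 2×2 metric matrix `G` at [itex]x[/itex] and uses Gram-Schmidt with respect to `G` to build the orthonormal basis; the linear coordinates [itex]\theta=x+\sum_i u^iV_i[/itex] are used as a first-order stand-in for the exponential map, since both have the same coordinate vectors at [itex]x[/itex]. The function names are my own.

```python
import math


def inner(G, u, v):
    """Inner product of tangent vectors u, v with respect to the metric matrix G."""
    n = len(G)
    return sum(u[i] * G[i][j] * v[j] for i in range(n) for j in range(n))


def g_orthonormal_basis(G):
    """Gram-Schmidt on the standard basis, using G instead of the dot product."""
    n = len(G)
    basis = []
    for k in range(n):
        v = [1.0 if i == k else 0.0 for i in range(n)]
        for b in basis:
            c = inner(G, v, b)
            v = [vi - c * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(inner(G, v, v))
        basis.append([vi / norm for vi in v])
    return basis


# Hypothetical metric components at x, with a nonzero off-diagonal entry.
G = [[2.0, 0.6], [0.6, 1.0]]
V = g_orthonormal_basis(G)

# Components of the same metric in the new coordinates u^i at x.
gram = [[inner(G, V[i], V[j]) for j in range(2)] for i in range(2)]
print(gram)
```

The matrix `gram` should come out as the identity up to rounding, which is the statement that the off-diagonal terms can always be removed at one chosen point. Nothing here says they stay zero away from [itex]x[/itex].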

It is certainly not always true that there exist coordinates in which the metric is diagonal on a whole neighborhood. When such coordinates exist and the diagonal entries are moreover all equal to one positive function, the metric is called conformally flat (or simply flat if the diagonal entries are actually constant). As far as information geometry is concerned, one of the more endearing results is that, for the most important probability distribution in statistics (the normal distribution), the Fisher metric is (up to a scaling factor) the hyperbolic metric on the upper half plane.
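
The normal family can be checked numerically. Below is a minimal Python sketch, assuming the parametrization [itex]\theta=(\mu,\sigma)[/itex], finite-difference scores, and trapezoidal integration over [itex]\mu\pm 12\sigma[/itex]; the helper names `log_pdf`, `score` and `fisher_matrix` are illustrative only.

```python
import math


def log_pdf(x, mu, sigma):
    """Log-density of the normal distribution N(mu, sigma^2)."""
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * ((x - mu) / sigma) ** 2


def score(x, theta, h=1e-5):
    """Central-difference gradient of the log-density with respect to theta."""
    grads = []
    for i in range(len(theta)):
        up, down = list(theta), list(theta)
        up[i] += h
        down[i] -= h
        grads.append((log_pdf(x, *up) - log_pdf(x, *down)) / (2 * h))
    return grads


def fisher_matrix(theta, half_width=12.0, n=4000):
    """Trapezoidal estimate of E[score_i * score_j] for theta = (mu, sigma)."""
    mu, sigma = theta
    a = mu - half_width * sigma
    dx = 2 * half_width * sigma / n
    g = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n + 1):
        x = a + k * dx
        weight = 0.5 if k in (0, n) else 1.0
        p = math.exp(log_pdf(x, mu, sigma))
        s = score(x, theta)
        for i in range(2):
            for j in range(2):
                g[i][j] += weight * s[i] * s[j] * p * dx
    return g


G = fisher_matrix((1.0, 2.0))
print(G)
```

In these coordinates the exact matrix is [itex]\mathrm{diag}(1/\sigma^2,\,2/\sigma^2)[/itex], so the off-diagonal terms vanish identically and [itex]g=\sigma^{-2}(d\mu^2+2\,d\sigma^2)[/itex]. Substituting [itex]\mu=\sqrt{2}\,x,\ \sigma=y[/itex] gives [itex]g=2(dx^2+dy^2)/y^2[/itex], which is twice the hyperbolic metric.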
 

1. What is the Riemannian Fisher-Rao metric?

The Riemannian Fisher-Rao metric is a Riemannian metric on the parameter manifold of a family of probability distributions; the geodesic distance it induces measures how far apart two distributions in the family are. It is defined by the Fisher information matrix, which measures the amount of information that a random variable carries about an unknown parameter.

2. How is the Riemannian Fisher-Rao metric related to orthogonal parameter space?

Two parameters are called orthogonal when the corresponding off-diagonal entry of the Fisher information matrix vanishes, that is, when their coordinate directions are orthogonal with respect to the Fisher-Rao metric. In an orthogonal parametrization the metric is diagonal, which simplifies the analysis of how the parameters relate to one another.
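
As a worked illustration that is not taken from the thread, consider the gamma family with shape [itex]\alpha[/itex] and rate [itex]\beta[/itex], whose log-density is [itex]\alpha\ln\beta-\ln\Gamma(\alpha)+(\alpha-1)\ln x-\beta x[/itex]. Taking minus the expected second derivatives, and writing [itex]\psi'[/itex] for the trigamma function, gives a Fisher matrix with a nonzero off-diagonal entry:
$$
(g_{ij})_{(\alpha,\beta)}=\begin{pmatrix}\psi'(\alpha) & -1/\beta\\ -1/\beta & \alpha/\beta^{2}\end{pmatrix}.
$$
Replacing the rate by the mean [itex]m=\alpha/\beta[/itex] turns the log-density into [itex]\alpha\ln\alpha-\alpha\ln m-\ln\Gamma(\alpha)+(\alpha-1)\ln x-\alpha x/m[/itex]. The mixed derivative is [itex]-1/m+x/m^{2}[/itex], whose expectation is zero because [itex]E[x]=m[/itex], so that
$$
(g_{ij})_{(\alpha,m)}=\begin{pmatrix}\psi'(\alpha)-1/\alpha & 0\\ 0 & \alpha/m^{2}\end{pmatrix}.
$$
In this sense [itex](\alpha,m)[/itex] is an orthogonal parametrization of the same family, while [itex](\alpha,\beta)[/itex] is not.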

3. What is the significance of the Riemannian Fisher-Rao metric in statistics?

The Riemannian Fisher-Rao metric plays a crucial role in statistics as it provides a way to measure the distance between probability distributions, which is essential in many statistical analyses such as hypothesis testing, parameter estimation, and model selection.

4. How is the Riemannian Fisher-Rao metric calculated?

The Riemannian Fisher-Rao metric is calculated from the Fisher information matrix, which is the negative expected value of the matrix of second-order derivatives of the log-likelihood function (equivalently, the expected outer product of its first derivatives). This matrix supplies the components of a metric tensor, which is then used to calculate the distance between probability distributions on a manifold.

5. What are some applications of the Riemannian Fisher-Rao metric?

The Riemannian Fisher-Rao metric has many applications in various fields, including machine learning, image processing, and bioinformatics. It is also used in statistical analyses such as principal component analysis, multidimensional scaling, and canonical correlation analysis.
