Fisher matrix for multivariate normal distribution


Discussion Overview

The discussion centers on the derivation of the Fisher information matrix (FIM) for multivariate normal distributions, particularly focusing on the simplification of the matrix and the conditions under which it holds. Participants explore both specific cases and the general scenario where the covariance matrix may depend on parameters.

Discussion Character

  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant references a formula for the Fisher information matrix and seeks derivation assistance, noting difficulties in finding references.
  • Another participant provides a derivation using matrix derivatives, indicating that the formula holds under the condition that the covariance matrix does not depend on the parameter being estimated.
  • A different participant suggests that a general proof exists in a specific academic paper, indicating that the topic has been explored in the literature.
  • There is a mention of issues with LaTeX code formatting, with one participant pointing out a potential error in the syntax used.
  • One participant shares an additional reference that offers a direct derivation of the Fisher information matrix, suggesting it may be easier to interpret than previous sources mentioned.

Areas of Agreement / Disagreement

Participants agree on the restricted case, where the covariance matrix does not depend on the parameters, via the derivation posted in the thread. The general case is settled only by pointers to the literature; no in-thread derivation of it is given, and participants differ on which reference is easiest to follow.

Contextual Notes

Participants note limitations in their derivations, particularly concerning the dependency of the covariance matrix on the parameters, which remains unresolved in the discussion.

hdb
The Fisher information matrix for the multivariate normal distribution is said in many places to simplify to
[tex]\mathcal{I}_{m,n} = \frac{\partial \mu^\mathrm{T}}{\partial \theta_m} \Sigma^{-1} \frac{\partial \mu}{\partial \theta_n},[/tex]
even on
http://en.wikipedia.org/wiki/Fisher_information#Multivariate_normal_distribution
I am trying to come up with the derivation, but no luck so far. Does anyone have any ideas / hints / references on how to do this?

Thank you
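To see what the quoted formula gives in a concrete case, here is a minimal numerical sketch. It assumes a hypothetical linear-Gaussian model z ~ N(A·θ, Σ), so the Jacobian of the mean is just A and the formula reduces to I = AᵀΣ⁻¹A; the matrices A and Σ below are made up for illustration.

```python
import numpy as np

# Hypothetical linear-Gaussian model: z ~ N(A @ theta, Sigma), so
# d(mu)/d(theta) = A and the quoted formula gives I = A^T Sigma^{-1} A.
A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])          # Jacobian of the mean w.r.t. theta
Sigma = np.diag([1.0, 2.0, 4.0])    # known covariance, independent of theta

fim = A.T @ np.linalg.inv(Sigma) @ A
print(fim)                          # 2x2 Fisher information matrix
```

Note that the result does not depend on θ itself here, only on A and Σ, which is why the linear-Gaussian case is the standard sanity check for this formula.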
 
Using matrix derivatives one has
[tex]D_x(x^\mathrm{T} A x) = x^\mathrm{T}(A + A^\mathrm{T}),[/tex]
from which it follows that
[tex]D_{\theta} \log p(z ; \mu(\theta), \Sigma) = (z-\mu(\theta))^\mathrm{T} \Sigma^{-1} D_{\theta} \mu(\theta).[/tex]
For simplicity let's write [tex]D_{\theta} \mu(\theta) = H[/tex]. The FIM is then found as
[tex]J = E[(D_{\theta} \log p)^\mathrm{T}\, D_{\theta} \log p] = E[H^\mathrm{T} \Sigma^{-1} (z-\mu(\theta))(z-\mu(\theta))^\mathrm{T} \Sigma^{-1} H] = H^\mathrm{T} \Sigma^{-1} \Sigma \Sigma^{-1} H = H^\mathrm{T} \Sigma^{-1} H,[/tex]
which is equivalent to the given formula. Notice that this formula is only valid as long as [tex]\Sigma[/tex] does not depend on [tex]\theta[/tex]. I'm still struggling to find a derivation of the more general case where [tex]\Sigma[/tex] also depends on [tex]\theta[/tex].

For some reason my tex code is not correctly parsed. I cannot understand why.
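The identity derived above, E[scoreᵀ score] = HᵀΣ⁻¹H, can be checked by Monte Carlo. The sketch below again assumes a hypothetical linear mean μ(θ) = H·θ (evaluated at θ = 0) with a fixed, made-up Σ, draws samples, and compares the empirical outer product of the scores with the closed form.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sketch, assuming mu(theta) = H @ theta with fixed Sigma: the score is
# (z - mu)^T Sigma^{-1} H, and the FIM E[score^T score] should equal
# H^T Sigma^{-1} H. We estimate the expectation by Monte Carlo.
H = np.array([[1.0, 0.5],
              [0.0, 1.0],
              [2.0, 1.0]])
Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 1.5]])
Sigma_inv = np.linalg.inv(Sigma)
mu = np.zeros(3)                     # evaluate at theta = 0 for simplicity

z = rng.multivariate_normal(mu, Sigma, size=200_000)
scores = (z - mu) @ Sigma_inv @ H    # one row per sample
fim_mc = scores.T @ scores / len(z)  # empirical E[score^T score]
fim_exact = H.T @ Sigma_inv @ H

print(np.max(np.abs(fim_mc - fim_exact)))  # small Monte Carlo error
```

With enough samples the two matrices agree to a few decimal places, which is consistent with the derivation in the post above.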
 
Actually the general proof can apparently be found in Porat & Friedlander: Computation of the Exact Information Matrix of Gaussian Time Series with Stationary Random Components, IEEE Transactions on Acoustics, Speech and Signal Processing, Vol ASSP-34, No. 1, Feb. 1986.
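For reference, the general-case expression, as stated in the Wikipedia section linked in the first post (the references above give proofs), extends the quoted formula with a trace term capturing the dependence of [tex]\Sigma[/tex] on [tex]\theta[/tex]:

[tex]\mathcal{I}_{m,n} = \frac{\partial \mu^\mathrm{T}}{\partial \theta_m} \Sigma^{-1} \frac{\partial \mu}{\partial \theta_n} + \frac{1}{2}\operatorname{tr}\!\left( \Sigma^{-1} \frac{\partial \Sigma}{\partial \theta_m}\, \Sigma^{-1} \frac{\partial \Sigma}{\partial \theta_n} \right).[/tex]

When [tex]\Sigma[/tex] does not depend on [tex]\theta[/tex], the trace term vanishes and the formula from the first post is recovered.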
 
edmundfo said:
R^{-1} H] = H^T R^{-1} R R^{-1} H = H^T R^{-1} H [\tex]

For some reason my tex code is not correctly parsed. I cannot understand why.

For one thing, you're using the back slash [\tex] instead of the forward slash [/tex] at the end of your code.
 
edmundfo said:
Actually the general proof can apparently be found in Porat & Friedlander: Computation of the Exact Information Matrix of Gaussian Time Series with Stationary Random Components, IEEE Transactions on Acoustics, Speech and Signal Processing, Vol ASSP-34, No. 1, Feb. 1986.
Thank you for the answers. In the meantime I have found another reference, which gives a direct derivation of the same result; to me this one seems easier to interpret:

Klein, A., and H. Neudecker. “A direct derivation of the exact Fisher information matrix of Gaussian vector state space models.” Linear Algebra and its Applications 321, no. 1-3
 
