Relation with Hessian and Log-likelihood


Discussion Overview

The discussion revolves around the relationship between the Hessian matrix and the log-likelihood function in statistical estimation. Participants explore the mathematical derivation of an equation relating the expectation of the product of first derivatives of the log-likelihood to the expectation of the negative second derivative. The scope includes theoretical aspects of likelihood functions and their derivatives.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant presents an equation involving the expectation of the product of first derivatives of the log-likelihood and the negative second derivative, seeking to demonstrate its validity.
  • There is a calculation involving the second derivative of the log-likelihood, leading to an expression that participants are trying to equate with another derived expression.
  • Another participant questions the integration over parameters without a prior joint distribution, emphasizing the need for clarity on the integration process with respect to the parameters.
  • A later reply suggests that for different indices in the sums, the corresponding random variables may be independent, raising the question of whether their expected values are zero.
  • Participants discuss the implications of integrating over the parameters versus the observed values, noting potential errors in earlier calculations.

Areas of Agreement / Disagreement

Participants express differing views on the correct approach to integrating over parameters and the assumptions required for the independence of certain variables. The discussion remains unresolved regarding the validity of the proposed equations and the integration methods.

Contextual Notes

There are limitations related to the assumptions made about the joint distribution of parameters and the independence of variables, which remain unresolved in the discussion.

fab13
TL;DR
I would like to get a rigorous demonstration of relation (1) given at the beginning of my post, involving the Hessian and the general expression of the log-likelihood.
I would like to demonstrate equation (1) below for the general form of the log-likelihood:

##E\Big[\frac{\partial \mathcal{L}}{\partial \theta} \frac{\partial \mathcal{L}^{\prime}}{\partial \theta}\Big]=E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]\quad(1)##

with the log-likelihood ##\mathcal{L}## defined by ##\mathcal{L} = \log\big(\Pi_{i} f(x_{i})\big)##, where the ##x_{i}## are all the experimental/observed values.

For the moment, if I start from the second derivative (left member of (1)), I can get:

##\dfrac{\partial \mathcal{L}}{\partial \theta_{i}} = \dfrac{\partial \log\big(\Pi_{k} f(x_{k})\big)}{\partial \theta_{i}} = \dfrac{\partial\big(\sum_{k} \log f(x_{k})\big)}{\partial \theta_{i}}
=\sum_{k} \dfrac{1}{f(x_{k})} \dfrac{\partial f(x_{k})}{\partial \theta_{i}}##

Now I have to compute: ##\dfrac{\partial^{2} \mathcal{L}}{\partial \theta_i \partial \theta_j}=\dfrac{\partial}{\partial \theta_j} \left(\sum_{k}\dfrac{1}{f(x_{k})}\,\dfrac{\partial f(x_{k})}{\partial \theta_{i}} \right)##
##= -\sum_{k} \bigg(\dfrac{1}{f(x_{k})^2} \dfrac{\partial f(x_{k})}{\partial \theta_{j}}\dfrac{\partial f(x_{k})}{\partial \theta_{i}}-\dfrac{1}{f(x_{k})}\dfrac{\partial^{2} f(x_{k})}{ \partial \theta_i \partial \theta_j}\bigg)##
##=-\sum_{k}\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}
\dfrac{\partial \log f(x_{k})}{\partial \theta_{j}}-
\dfrac{1}{f(x_{k})}
\dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg)##

As we compute an expectation on both sides, the second term vanishes under regularity conditions, i.e.: ##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int\sum_{k} f(x_k)\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}
\dfrac{\partial \log f(x_{k})}{\partial \theta_{j}}-\dfrac{1}{f(x_{k})}\dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg)\text{d}x_k\quad\quad(2)##

The second term can be expressed as:

##\int\dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\text{d}x_k =\dfrac{\partial^{2}}{\partial \theta_{i} \partial \theta_{j}}\int f(x_{k})\text{d}x_k=0##

since ##\int f(x_{k})\,\text{d}x_k = 1##.

Finally, I get the relation :

##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int\,\sum_{k} f(x_k)\bigg(\dfrac{1}{f(x_{k})^2} \dfrac{\partial f(x_{k})}{\partial \theta_{j}}\dfrac{\partial f(x_{k})}{\partial \theta_{i}}\bigg)\text{d}x_k##

##=\int \sum_{k}\,f(x_k) \bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{j}}\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\bigg)\text{d}x_k\quad\quad(3)##

But I don't know how to make equation (3) equal to:

##\int \sum_{k}\sum_{l}f(x_k)\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\bigg)\bigg(\dfrac{\partial \log f(x_{l})}{\partial \theta_{j}}\bigg)\text{d}x_k##

##=\int \sum_{k}f(x_k)\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\bigg)\sum_{l}\bigg(\dfrac{\partial \log f(x_{l})}{\partial \theta_{j}}\bigg)\text{d}x_k##

##=\int \sum_k f(x_k) \bigg(\dfrac{\partial \log \Pi_{k}f(x_{k})}{\partial \theta_{i}}\bigg)\bigg(\dfrac{\partial \log \Pi_{l}f(x_{l})}{\partial \theta_{j}}\bigg)\,\text{d}x_k\quad\quad(4)##
##=E\Big[\dfrac{\partial \mathcal{L}}{\partial \theta_i} \dfrac{\partial \mathcal{L}}{\partial \theta_j}\Big]##

I just want to prove the equality between (3) and (4): where is my error?

IMPORTANT UPDATE: I realized that I made an error in the calculation of the expectation in equation (2), when I write:

##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int\sum_{k} f(x_k)\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}
\dfrac{\partial \log f(x_{k})}{\partial \theta_{j}}-\dfrac{1}{f(x_{k})}\dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg)\text{d}x_k\quad\quad(2)##

Indeed, I should not integrate over ##\text{d}x_{k}## but rather over the variables ##(\theta_i, \theta_j)##.

But if I do this, to compute the expectation, it seems that I need the joint distribution ##f(x_k) = f(x_k, \theta_i, \theta_j)##.

So I guess we can rewrite this joint distribution as if ##\theta_i## and ##\theta_j## were independent, i.e.:

##f(x_k, \theta_i, \theta_j)= f_1(x_k, \theta_i)\, f_2(x_k, \theta_j)##


This way, I could have :

##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int\int \sum_{k} f_1(x_k,\theta_i)\, f_2(x_k,\theta_j)\bigg(\dfrac{1}{f(x_{k})^2} \dfrac{\partial f_1(x_{k})}{\partial \theta_{j}}\dfrac{\partial f_2(x_{k})}{\partial \theta_{i}}\bigg)\text{d}\theta_i\,\text{d}\theta_j##

instead of :

##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int\sum_{k} f(x_k)\bigg(\dfrac{1}{f(x_{k})^2} \dfrac{\partial f(x_{k})}{\partial \theta_{j}}\dfrac{\partial f(x_{k})}{\partial \theta_{i}}\bigg)\text{d}x_k##

So finally, I could obtain from equation (2) :

##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int \int \sum_{k} f_1(x_k,\theta_i)\,f_2(x_k,\theta_j)\bigg(\dfrac{\partial \log f_2(x_{k})}{\partial \theta_{i}}\,
\dfrac{\partial \log f_1(x_{k})}{\partial \theta_{j}}\bigg)\text{d}\theta_i \,\text{d}\theta_j##

##=\int \int f_1(x_k,\theta_i)\, f_2(x_l,\theta_j) \bigg(\dfrac{\partial \sum_{k} \log f_1(x_{k})}{\partial \theta_{i}}\,
\dfrac{\partial \sum_l \log f_2(x_{l})}{\partial \theta_{j}}\bigg)\text{d}\theta_i\, \text{d}\theta_j\quad(5)##

##= \int\int \sum_{k}\sum_{l}f(x_k)\bigg(\dfrac{\partial \log f_1(x_{k})}{\partial \theta_{i}}\bigg)\bigg(\dfrac{\partial \log f_2(x_{l})}{\partial \theta_{j}}\bigg)\text{d}\theta_i\,\text{d}\theta_j##

##=\int\int f_1(x_k, \theta_i)\,f_2(x_k,\theta_j) \bigg(\dfrac{\partial \log \Pi_k f_1(x_{k})}{\partial \theta_{i}}\bigg)\bigg(\dfrac{\partial \log \Pi_l f_2(x_{l})}{\partial \theta_{j}}\bigg)\text{d}\theta_i\,\text{d}\theta_j##

##=E\Big[\dfrac{\partial \mathcal{L}}{\partial \theta_i} \dfrac{\partial \mathcal{L}}{\partial \theta_j}\Big]##

But I have difficulties with the step involving the two sums in equation (5): ##\sum_k## and ##\sum_l##, which I introduced above without justification. Moreover, are my calculations of the expectation correct (I mean, integrating over ##\theta_i## and ##\theta_j##)?

If someone could help me, that would be nice.

Regards
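Relation (1) can at least be checked numerically when the expectation is taken over the observations. Here is a minimal Monte Carlo sketch, assuming a hypothetical one-parameter exponential model ##f(x;\theta)=\theta e^{-\theta x}## (my example, not from the thread):

```python
import random

# Monte Carlo sketch (assumed model, for illustration only):
# f(x; theta) = theta * exp(-theta * x), n i.i.d. observations.
# Per observation: d log f/d theta = 1/theta - x and
#                  d^2 log f/d theta^2 = -1/theta^2 (constant in x),
# so E[-d^2 L/d theta^2] = n/theta^2 exactly, and relation (1) claims
# E[(dL/d theta)^2] equals the same number.

random.seed(0)
theta, n, trials = 2.0, 5, 200_000

acc = 0.0
for _ in range(trials):
    xs = [random.expovariate(theta) for _ in range(n)]
    score = sum(1.0 / theta - x for x in xs)  # dL/d theta for this sample
    acc += score * score

lhs = acc / trials       # Monte Carlo estimate of E[(dL/d theta)^2]
rhs = n / theta ** 2     # exact E[-d^2 L/d theta^2]
print(lhs, rhs)
```

With these parameters the two sides agree to a few decimal places, which is what (1) predicts for this model.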
 
fab13 said:
Indeed, I should not integrate over ##\text{d}x_{k}## but rather over the variables ##(\theta_i, \theta_j)##.
How can we integrate with respect to the ##\theta_i## without having a prior joint distribution for them?

The notes: https://mervyn.public.iastate.edu/stat580/Notes/s09mle.pdf indicate that the expectation is taken with respect to the variables representing the observations.
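To illustrate the point that the expectation is over the observations: with ##\theta## held fixed, the expected score of a single observation is zero. A deterministic sketch under an assumed exponential density (my example, not from the linked notes):

```python
import math

# Deterministic sketch (assumed exponential density, for illustration):
# with theta held FIXED, integrate the single-observation score
#   d log f/d theta = 1/theta - x
# against f(x; theta) = theta*exp(-theta*x) over x, by the midpoint rule.
# The result should be ~0: the expectation in (1) is over the observations.

theta = 2.0
N, hi = 400_000, 40.0        # [0, 40] captures essentially all the mass
h = hi / N
total = 0.0
for i in range(N):
    x = (i + 0.5) * h
    f = theta * math.exp(-theta * x)
    total += f * (1.0 / theta - x) * h
print(total)
```

No prior on ##\theta## is needed anywhere: ##\theta## enters only as a fixed parameter of the density.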
 
fab13 said:


I would like to demonstrate the equation (1) below in the general form of the Log-likelihood :

##E\Big[\frac{\partial \mathcal{L}}{\partial \theta} \frac{\partial \mathcal{L}^{\prime}}{\partial \theta}\Big]=E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]\quad(1)##

with the \log of Likelihood \mathcal{L} defined by \mathcal{L} = \log\bigg(\Pi_{i} f(x_{i})\bigg) with x_{i} all experimental/observed values.

For the instant, if I start from the second derivative (left member of (1)), I can get :

You mean "the right member" of (1).

But I don't know how to make equation (3) equal to:

##\int \sum_{k}\sum_{l}f(x_k)\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\bigg)\bigg(\dfrac{\partial \log f(x_{l})}{\partial \theta_{j}}\bigg)\text{d}x_k##

One thought is that for ##k \ne l##, the random variables ##\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}## and ##\dfrac{\partial \log f(x_{l})}{\partial \theta_{j}}## are independent. So the expected value of their product is the product of their expected values. Is each expected value equal to zero?
 
@Stephen Tashi Yes, sorry, I meant the "right member" when I talked about the second derivative.

Don't forget that I have added an important UPDATE above: I have to integrate over ##\theta_i## and ##\theta_j## and not over ##x_k##.

Have you got a clue or suggestion? Regards
 
fab13 said:
Don't forget that I have added an important UPDATE above: I have to integrate over ##\theta_i## and ##\theta_j## and not over ##x_k##.
I'm curious why you think eq. 1 is correct when you interpret it as asking for taking the expectation with respect to ##\theta_i, \theta_j##.
 
@Stephen Tashi

Do you mean that I confused (##\theta,\theta'##) with (##\theta_i,\theta_j##)? Or maybe that equation (1) is false?

If this is the case, what approach do you suggest to conclude about this relation (1)? As I said above, I don't know how to justify the passage from step (5) to the next step: there is something wrong in my attempted demonstration and I don't know where it comes from.

Any help would be welcome; I am beginning to despair... Regards
 
fab13 said:
@Stephen Tashi

Do you mean that I confused (##\theta,\theta'##) with (##\theta_i,\theta_j##)? Or maybe that equation (1) is false?

I mean that equations like eq. 1 that I've seen (for the Fisher Information Matrix) say that the expectation is taken with respect to the observations ##x_i##. The expectations are not taken with respect to the parameters ##\theta_i##. Where did you see eq. 1?
fab13 said:
##E\Big[\frac{\partial \mathcal{L}}{\partial \theta} \frac{\partial \mathcal{L}^{\prime}}{\partial \theta}\Big]=E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]\quad(1)##

? Is that notation equivalent to:
##E\Big[\frac{\partial \mathcal{L}}{\partial \theta_r} \frac{\partial \mathcal{L}}{\partial \theta_s}\Big]=E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta_r \partial \theta_s}\Big]\quad(1)##

If so, I doubt eq. 1 is always true when expectations are taken with respect to ##\theta_r, \theta_s##.

But I don't know how to make equation (3) equal to:

##\int \sum_{k}\sum_{l}f(x_k)\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\bigg)\bigg(\dfrac{\partial \log f(x_{l})}{\partial \theta_{j}}\bigg)\text{d}x_k##

I don't understand your notation for taking the expected value of a function. The expected value of ##w(x_1,x_2,...,x_n)## with respect to the joint density ##p(x_1,x_2,...x_n) = \Pi_{k} f(x_k) ## should be a multiple integral: ##\int \int ...\int p(x_1,x_2,...x_n) w(x_1,x_2,...x_n) dx_1 dx_2 ...dx_n ##

How did you get an expression involving only ##dx_k##?

It seems to me that the terms involved have a pattern like ##\int \int ...\int f(x_1) f(x_2) .. f(x_n) w(x_k) h(x_j) dx_1 dx_2 ...dx_n = ##
## \int \int f(x_k) w(x_k) f(x_j) h(x_j) dx_k dx_j = (\int f(x_k) w(x_k) dx_k) ( \int f(x_j) h(x_j) dx_j)##
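This factorization pattern can be illustrated numerically; here is a sketch assuming an exponential density and hypothetical test functions ##w(x)=x##, ##h(x)=x^2## (my choices, purely for illustration):

```python
import math

# Numerical illustration of the factorization pattern above, under an
# assumed exponential density f(x) = theta*exp(-theta*x) and hypothetical
# test functions w(x) = x, h(x) = x**2:
#   integral of f(x1) f(x2) w(x1) h(x2) dx1 dx2
#     = (integral of f*w) * (integral of f*h)
# computed on a midpoint grid.

theta = 2.0
N, hi = 500, 30.0
step = hi / N
xs = [(i + 0.5) * step for i in range(N)]
fs = [theta * math.exp(-theta * x) for x in xs]

int_fw = sum(f * x * step for f, x in zip(fs, xs))        # ~ E[x]   = 1/theta
int_fh = sum(f * x * x * step for f, x in zip(fs, xs))    # ~ E[x^2] = 2/theta^2
double = sum(fs[i] * xs[i] * fs[j] * xs[j] ** 2 * step * step
             for i in range(N) for j in range(N))         # 2-D grid sum
print(double, int_fw * int_fh)
```

The double integral separates exactly because the joint density is the product of the marginals; that is the whole content of the pattern.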
 
@Stephen Tashi

Thanks for your answer. So from what I understand, the quantity ##w(x_k)## would correspond to:

##w(x_{k,i})=\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}##

and if I assume that : ##f(x_1,x_2,.. x_n) = f(x_1) f(x_2) .. f(x_n)##, I get for example :

##E[w_i] = \int \int ...\int f(x_1) f(x_2) .. f(x_n) w(x_{k,i}) dx_1 dx_2 ...dx_n##

which can be applied by adding a second quantity ##h_{l,j}##.

However, I have two last requests: in my first post, I calculated that

##\dfrac{\partial^{2} \mathcal{L}}{\partial \theta_i \partial \theta_j}=-\sum_{k}\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}} \dfrac{\partial \log f(x_{k})}{\partial \theta_{j}}- \dfrac{1}{f(x_{k})} \dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg)\quad(6)##

1) If I follow your reasoning, I should write for the first term of ##(6)##:

##E[\dfrac{\partial^{2} \mathcal{L}}{\partial \theta_i \partial \theta_j}] = -\int \int ...\int f(x_1) f(x_2) .. f(x_n) \sum_{k}\bigg(\dfrac{\partial \log(f(x_{k}))}{\partial \theta_{i}} \dfrac{\partial \log(f(x_{k}))}{\partial \theta_{j}}\,\bigg)\,dx_1 dx_2 ...dx_n##

I don't know how to deal with the summation ##\sum_{k}## on the ##\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\dfrac{\partial \log f(x_{k})}{\partial \theta_{j}}\,\bigg)## terms, in order to convert it into:

##\bigg(\dfrac{\partial \log \Pi_k f(x_{k})}{\partial \theta_{i}}\bigg)\,\bigg(\dfrac{\partial \log \Pi_l f(x_{l})}{\partial \theta_{j}}\bigg)##

?

Sorry if this is evident for some of you ...

2) Moreover, to make the second term of ##(6)## vanish, I divide the total joint probability density by ##f(x_k)## for each ##k##-th term of the sum: is it enough to justify that:

##E\bigg[\sum_{k}\dfrac{1}{f(x_{k})} \dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg]=\int \int ...\int f(x_1) f(x_{2}) .. f(x_n)\,\bigg(\sum_{k}\dfrac{1}{f(x_{k})}\dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg)\,dx_1 dx_2 ...dx_n = \sum_{k} \dfrac{\partial^{2}}{\partial \theta_{i} \partial \theta_{j}}\bigg(\int \int ...\int f(x_1) f(x_{2}) .. f(x_n)\,dx_1 dx_2 ...dx_n\bigg)##

##=\sum_{k} \dfrac{\partial^{2} 1}{\partial \theta_{i} \partial \theta_{j}}=0##

The division by ##f(x_k)## is compensated by the multiplication by ##f(x_k)##; this way, we keep the relation:

##\int \int ...\int f(x_1) f(x_{2}) .. f(x_n)\,dx_1 dx_2 ...dx_n=1##

don't we ?

Is this reasoning correct ?

Regards
 
fab13 said:
I don't know how to deal with the summation ##\sum_{k}## on the ##\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\dfrac{\partial \log f(x_{k})}{\partial \theta_{j}}\,\bigg)## terms, in order to convert it into:

##\bigg(\dfrac{\partial \log \Pi_k f(x_{k})}{\partial \theta_{i}}\bigg)\,\bigg(\dfrac{\partial \log \Pi_l f(x_{l})}{\partial \theta_{j}}\bigg)##

?

To do that directly would involve introducing terms that are zero so that ##\sum_k h(k)w(k) = (\sum_r h(r))(\sum_s w(s))## where terms of the form ##h(r)w(s)## are zero except when ##r = s##.

It's more natural to begin with the left side of 1) where the pattern ## (\sum_r h(r))(\sum_s w(s))## appears and show that the terms in that expression are zero when ##r \ne s##.

The basic ideas are that
##E (\ \dfrac{\partial log( f(x_k))}{\partial \theta_i}\ ) = 0##

For ##r \ne s##, ##E( \dfrac{\partial log(f(x_r))}{\partial \theta_i} \dfrac{\partial log(f(x_s))}{\partial \theta_j}) = E( \dfrac{\partial log(f(x_r))}{\partial \theta_i})E( \dfrac{\partial log(f(x_s))}{\partial \theta_j})## since ##x_r, x_s## are independent random variables.

So the only nonzero terms on the left side of 1) are those of the above form when ##r = s##.

2) Moreover, to make the second term of ##(6)## vanish, I divide the total joint probability density by ##f(x_k)## for each ##k##-th term of the sum: is it enough to justify that:

##E\bigg[\sum_{k}\dfrac{1}{f(x_{k})} \dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg]=\int \int ...\int f(x_1) f(x_{2}) .. f(x_n)\,\bigg(\sum_{k}\dfrac{1}{f(x_{k})}\dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg)\,dx_1 dx_2 ...dx_n = \sum_{k} \dfrac{\partial^{2}}{\partial \theta_{i} \partial \theta_{j}}\bigg(\int \int ...\int f(x_1) f(x_{2}) .. f(x_n)\,dx_1 dx_2 ...dx_n\bigg)##

##=\sum_{k} \dfrac{\partial^{2} 1}{\partial \theta_{i} \partial \theta_{j}}=0##

The division by ##f(x_k)## is compensated by the multiplication by ##f(x_k)##; this way, we keep the relation:

##\int \int ...\int f(x_1) f(x_{2}) .. f(x_n)\,dx_1 dx_2 ...dx_n=1##

don't we ?

Is this reasoning correct ?

Yes, I agree. However, it never hurts to check abstract arguments about summations by writing out a particular case such as ##n = 2##
 
1)
Stephen Tashi said:
To do that directly would involve introducing terms that are zero so that ##\sum_k h(k)w(k) = (\sum_r h(r))(\sum_s w(s))## where terms of the form ##h(r)w(s)## are zero except when ##r = s##.

It's more natural to begin with the left side of 1) where the pattern ## (\sum_r h(r))(\sum_s w(s))## appears and show that the terms in that expression are zero when ##r \ne s##.

The basic ideas are that
##E (\ \dfrac{\partial log( f(x_k))}{\partial \theta_i}\ ) = 0##

I don't understand the last part, i.e. when you say:

The basic ideas are that
##E (\ \dfrac{\partial log( f(x_k))}{\partial \theta_i}\ ) = 0##

Under which conditions do we have this expression?

Moreover, when you talk about the "left side of 1)", do you mean the left side of this relation:

##E\Big[\frac{\partial \mathcal{L}}{\partial \theta} \frac{\partial \mathcal{L}^{\prime}}{\partial \theta}\Big]=E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]\quad(1)##

i.e, ##E\Big[\frac{\partial \mathcal{L}}{\partial \theta} \frac{\partial \mathcal{L}^{\prime}}{\partial \theta}\Big]## ?

The ideal would be to introduce a Kronecker symbol ##\delta_{rs}## to get this expression:

##\sum_k h(k)w(k) = \sum_r\sum_s h(r)\,w(s)\,\delta_{rs}##

But I don't know how to justify that ##h(r)\,w(s)=0## with ##r\neq s##, or how to introduce the Kronecker symbol.

@Stephen Tashi: please, if you could develop your reasoning, that would be nice.

2)

On the other hand, I do understand the relation:

##E( \dfrac{\partial log(f(x_r))}{\partial \theta_i} \dfrac{\partial log(f(x_s))}{\partial \theta_j}) = E( \dfrac{\partial log(f(x_r))}{\partial \theta_i})E( \dfrac{\partial log(f(x_s))}{\partial \theta_j})##

since ##\dfrac{\partial \log f(x_r)}{\partial \theta_i}## and ##\dfrac{\partial \log f(x_s)}{\partial \theta_j}## are independent random variables.

3)

Just a little remark: why did I make things complicated in this demonstration?

##E\bigg[\sum_{k}\dfrac{1}{f(x_{k})} \dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg]=\int \int ...\int f(x_1) f(x_{2}) .. f(x_n)\,\bigg(\sum_{k}\dfrac{1}{f(x_{k})}\dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg)\,dx_1 dx_2 ...dx_n = \sum_{k} \dfrac{\partial^{2}}{\partial \theta_{i} \partial \theta_{j}}\bigg(\int \int ...\int f(x_1) f(x_{2}) .. f(x_n)\,dx_1 dx_2 ...dx_n\bigg)##

I should have directly swapped ##\sum_k## and ##E\bigg[...\bigg]##; this way, I could write directly:

##E\bigg[\sum_{k}\dfrac{1}{f(x_{k})} \dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg]=\sum_k\,E\bigg[...\bigg] =\sum_{k} \dfrac{\partial^{2} 1}{\partial \theta_{i} \partial \theta_{j}}=0##

Regards
 
fab13 said:
Moreover, when you talk about the "left side of 1)", do you mean the left side of this relation:

##E\Big[\frac{\partial \mathcal{L}}{\partial \theta} \frac{\partial \mathcal{L}^{\prime}}{\partial \theta}\Big]=E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]\quad(1)##

i.e, ##E\Big[\frac{\partial \mathcal{L}}{\partial \theta} \frac{\partial \mathcal{L}^{\prime}}{\partial \theta}\Big]## ?

Yes.

I suggest you take the case of two observations ##x_1, x_2## and write out the left hand side of 1).

fab13 said:
I should have directly swapped ##\sum_k## and ##E##

Yes.

Apply that idea to the left hand side of 1). You will be taking the expectation of sums that involve terms like:

##E(\frac{\partial log(f(x_1))}{\partial \theta_i} \frac{\partial log(f(x_2))}{\partial \theta_j} )##
## = E(\frac{\partial log(f(x_1))}{\partial \theta_i}) E( \frac{\partial log(f(x_2))}{\partial \theta_j} )##
## = (0)(0) = 0 ##
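This ##(0)(0)=0## argument can be illustrated numerically; here is a sketch assuming a normal location model with known unit variance (my example), where ##\partial \log f/\partial\mu = x-\mu##:

```python
import random

# Numeric sketch of the (0)(0) = 0 argument, assuming a normal location
# model with known sigma = 1: here dlogf/dmu = x - mu, so for two
# independent observations x1, x2 the cross term
#   E[ dlogf(x1)/dmu * dlogf(x2)/dmu ]
# factorizes as E[x1 - mu] * E[x2 - mu] = 0.

random.seed(1)
mu, trials = 0.5, 500_000
cross = 0.0
for _ in range(trials):
    x1 = random.gauss(mu, 1.0)
    x2 = random.gauss(mu, 1.0)
    cross += (x1 - mu) * (x2 - mu)
cross /= trials
print(cross)
```

The Monte Carlo average of the cross term hovers around zero, exactly as the independence argument predicts.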
 
@Stephen Tashi

I think I have finally understood, maybe by writing:

Identifying ##X_k## and ##Y_l## with:

##X_k=\dfrac{\partial log(f(x_k))}{\partial \theta_i}##
##Y_l=\dfrac{\partial log(f(x_l))}{\partial \theta_j}##

##\big(E[X_r]=0\ \text{and}\ E[Y_s]=0\big)\implies\big(E[\sum_k X_k]=0\ \text{and}\ E[\sum_l Y_l]=0\big)\implies\big(E[\sum_k\sum_l\,X_k Y_l]=\sum_k\sum_l\,E[X_k Y_l]=\sum_k\,E[X_k Y_k] = E[\sum_k X_k Y_k]\big),##

since ##E[X_k\,Y_k]\neq E[X_k]\,E[Y_k]## (there is no longer independence between ##X_k## and ##Y_k##).

Is this correct ?

thanks
 
fab13 said:
since ##E[X_k\,Y_k]\neq E[X_k]\,E[Y_k]## (there is no longer independence between ##X_k## and ##Y_k##).

Is this correct ?

Yes.
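The confirmed reduction ##\sum_k\sum_l E[X_k Y_l]=\sum_k E[X_k Y_k]## can also be checked numerically. A sketch assuming ##X_k = Y_k = x_k - \mu## for a normal model with ##\sigma = 1## (my example):

```python
import random

# Numeric check of the confirmed reduction
#   E[ sum_k sum_l X_k Y_l ] = sum_k E[X_k Y_k]
# for X_k = Y_k = x_k - mu (assumed normal model, sigma = 1): cross terms
# vanish since E[X_k] = 0 and the x_k are independent, so both quantities
# should be close to n.

random.seed(2)
mu, n, trials = 1.0, 4, 200_000
double_sum = 0.0   # estimates E[(sum_k X_k)^2] = E[sum_k sum_l X_k X_l]
diag_sum = 0.0     # estimates E[sum_k X_k^2]
for _ in range(trials):
    xs = [random.gauss(mu, 1.0) - mu for _ in range(n)]
    s = sum(xs)
    double_sum += s * s
    diag_sum += sum(x * x for x in xs)
double_sum /= trials
diag_sum /= trials
print(double_sum, diag_sum)
```

Both averages come out close to ##n##, matching the Fisher information of ##n## observations for this model.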
 
