
## Summary:

- I would like to get a rigorous demonstration of relation (1) given at the beginning of my post, involving the Hessian and the general expression of the log-likelihood.

## Main Question or Discussion Point

I would like to demonstrate equation [itex](1)[/itex] below for the general form of the log-likelihood:

##E\Big[\frac{\partial \mathcal{L}}{\partial \theta} \frac{\partial \mathcal{L}}{\partial \theta^{\prime}}\Big]=E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]\quad(1)##

with the [itex]\log[/itex]-likelihood [itex]\mathcal{L}[/itex] defined by [itex]\mathcal{L} = \log\big(\Pi_{i} f(x_{i})\big)[/itex], where the [itex]x_{i}[/itex] are all the experimental/observed values.

For the moment, if I start from the second derivative (right-hand member of [itex](1)[/itex]), I can get:

##\dfrac{\partial \mathcal{L}}{\partial \theta_{i}} = \dfrac{\partial \log\big(\Pi_{k} f(x_{k})\big)}{\partial \theta_{i}} = \dfrac{\partial \sum_{k} \log f(x_{k})}{\partial \theta_{i}} = \sum_{k} \dfrac{1}{f(x_{k})} \dfrac{\partial f(x_{k})}{\partial \theta_{i}}##
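To make the score concrete, here is a small numerical sketch with an assumed density of my own choosing (not from the post): for [itex]f(x;\theta) = \theta e^{-\theta x}[/itex], the sum [itex]\sum_k \frac{1}{f(x_k)}\frac{\partial f(x_k)}{\partial \theta}[/itex] agrees with a finite-difference derivative of the log-likelihood:

```python
import numpy as np

# Assumed example (not from the post): f(x; theta) = theta * exp(-theta * x),
# an exponential density with a single scalar parameter theta.
# The log-likelihood is L(theta) = sum_k log f(x_k; theta).
def log_likelihood(theta, x):
    return np.sum(np.log(theta) - theta * x)

# Score dL/dtheta: here sum_k (1/f) df/dtheta simplifies to n/theta - sum_k x_k.
def score_analytic(theta, x):
    return len(x) / theta - np.sum(x)

# Same quantity by a central finite difference, as a sanity check.
def score_numeric(theta, x, h=1e-6):
    return (log_likelihood(theta + h, x) - log_likelihood(theta - h, x)) / (2 * h)

rng = np.random.default_rng(0)
x = rng.exponential(scale=0.5, size=100)  # a sample with true theta = 2
print(score_analytic(2.0, x))
print(score_numeric(2.0, x))  # the two values agree closely
```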

Now I have to compute: ##\dfrac{\partial^{2} \mathcal{L}}{\partial \theta_i \partial \theta_j}=\dfrac{\partial}{\partial \theta_j} \left(\sum_{k}\dfrac{1}{f(x_{k})}\,\dfrac{\partial f(x_{k})}{\partial \theta_{i}} \right)##

##= \sum_{k} \bigg(-\dfrac{1}{f(x_{k})^2} \dfrac{\partial f(x_{k})}{\partial \theta_{j}}\,\dfrac{\partial f(x_{k})}{\partial \theta_{i}}+\dfrac{1}{f(x_{k})}\,\dfrac{\partial^{2} f(x_{k})}{\partial \theta_i \partial \theta_j}\bigg)##

##=-\sum_{k}\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\,\dfrac{\partial \log f(x_{k})}{\partial \theta_{j}}-\dfrac{1}{f(x_{k})}\,\dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg)##

When we take the expectation on both sides, the second term vanishes under the regularity conditions, i.e.:

##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int\sum_{k} f(x_k)\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\,\dfrac{\partial \log f(x_{k})}{\partial \theta_{j}}-\dfrac{1}{f(x_{k})}\,\dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg)\text{d}x_k\quad\quad(2)##

The second term can be expressed as:

##\int\dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\text{d}x_k =\dfrac{\partial^{2}}{\partial \theta_{i} \partial \theta_{j}}\int f(x_{k})\text{d}x_k=0##

since [itex]\int f(x_{k})\,\text{d}x_k = 1[/itex] for any value of the parameters [itex]\theta[/itex].
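The interchange of [itex]\partial^2/\partial\theta_i\partial\theta_j[/itex] and the integral is exactly what the regularity conditions permit. A quick numerical check with my assumed density [itex]f(x;\theta)=\theta e^{-\theta x}[/itex] (again, an example of my choosing): the integral of the second [itex]\theta[/itex]-derivative of [itex]f[/itex] is indeed zero, because [itex]\int f\,\text{d}x = 1[/itex] for every [itex]\theta[/itex]:

```python
import numpy as np

# Assumed example: f(x; theta) = theta * exp(-theta * x) on [0, inf).
# Since int f dx = 1 for every theta, differentiating under the integral
# sign gives int (d^2 f / d theta^2) dx = 0.
def f(x, theta):
    return theta * np.exp(-theta * x)

x = np.linspace(0.0, 50.0, 100_001)  # grid; the tail beyond 50 is negligible
dx = x[1] - x[0]
h = 1e-3                             # step for the second finite difference

# d^2 f / d theta^2 at theta = 2, by a central second difference in theta
d2f = (f(x, 2.0 + h) - 2.0 * f(x, 2.0) + f(x, 2.0 - h)) / h**2

print(np.sum(d2f) * dx)  # close to 0
```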

Finally, I get the relation:

##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int\,\sum_{k} f(x_k)\bigg(\dfrac{1}{f(x_{k})^2} \dfrac{\partial f(x_{k})}{\partial \theta_{j}}\dfrac{\partial f(x_{k})}{\partial \theta_{i}}\bigg)\text{d}x_k##

##=\int \sum_{k}\,f(x_k) \bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{j}}\,\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\bigg)\text{d}x_k\quad\quad(3)##
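Equation [itex](3)[/itex], in its scalar one-observation form, can be checked by Monte Carlo. A sketch with the same assumed density [itex]f(x;\theta)=\theta e^{-\theta x}[/itex], for which [itex]\partial \log f/\partial\theta = 1/\theta - x[/itex] and [itex]\partial^2 \log f/\partial\theta^2 = -1/\theta^2[/itex]:

```python
import numpy as np

# Monte Carlo check of the scalar form of (1)/(3),
#   E[(d log f / d theta)^2] = -E[d^2 log f / d theta^2],
# for the assumed density f(x; theta) = theta * exp(-theta * x):
#   d log f / d theta     = 1/theta - x
#   d^2 log f / d theta^2 = -1/theta^2   (deterministic here)
rng = np.random.default_rng(42)
theta = 2.0
x = rng.exponential(scale=1.0 / theta, size=1_000_000)

lhs = np.mean((1.0 / theta - x) ** 2)  # E[(d log f / d theta)^2]
rhs = 1.0 / theta**2                   # -E[d^2 log f / d theta^2]
print(lhs, rhs)  # both close to 0.25
```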

But I don't know how to make equation [itex](3)[/itex] equal to:

##\int \sum_{k}\sum_{l}f(x_k)\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\bigg)\bigg(\dfrac{\partial \log f(x_{l})}{\partial \theta_{j}}\bigg)\text{d}x_k##

##=\int \sum_{k}f(x_k)\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\bigg)\sum_{l}\bigg(\dfrac{\partial \log f(x_{l})}{\partial \theta_{j}}\bigg)\text{d}x_k##

##=\int \sum_k f(x_k) \bigg(\dfrac{\partial \log\big(\Pi_{k}f(x_{k})\big)}{\partial \theta_{i}}\bigg)\bigg(\dfrac{\partial \log\big(\Pi_{l}f(x_{l})\big)}{\partial \theta_{j}}\bigg)\,\text{d}x_k\quad\quad(4)##

##=E\Big[\dfrac{\partial \mathcal{L}}{\partial \theta_i} \dfrac{\partial \mathcal{L}}{\partial \theta_j}\Big]##

I just want to prove the equality between (3) and (4): where is my error?
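For what it is worth, the [itex]k \neq l[/itex] cross terms are where the two expressions reconcile: by the same regularity argument as above, [itex]E\big[\partial \log f(x_k)/\partial\theta\big] = \int \partial f(x_k)/\partial\theta \,\text{d}x_k = 0[/itex], so for independent observations the [itex]k \neq l[/itex] terms of [itex]\sum_k\sum_l[/itex] have zero expectation. A numerical sketch with the same assumed exponential density:

```python
import numpy as np

# Sketch (same assumed density f(x; theta) = theta * exp(-theta * x)):
# the per-observation score s_k = d log f(x_k)/d theta has mean zero,
# so independent cross terms s_k * s_l (k != l) average to zero as well.
rng = np.random.default_rng(1)
theta = 2.0
x = rng.exponential(scale=1.0 / theta, size=(1_000_000, 2))  # iid pairs (x_k, x_l)

s = 1.0 / theta - x                # per-observation scores d log f / d theta
print(np.mean(s[:, 0]))            # E[s_k]      ~ 0
print(np.mean(s[:, 0] * s[:, 1]))  # E[s_k s_l]  ~ 0 for k != l
```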



**IMPORTANT UPDATE: I realized that I made an error in the calculation of the expectation in equation [itex](2)[/itex], when I write:**

##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int\sum_{k} f(x_k)\bigg(\dfrac{\partial \log f(x_{k})}{\partial \theta_{i}}\,\dfrac{\partial \log f(x_{k})}{\partial \theta_{j}}-\dfrac{1}{f(x_{k})}\,\dfrac{\partial^{2} f(x_{k})}{\partial \theta_{i} \partial \theta_{j}}\bigg)\text{d}x_k\quad\quad(2)##

**Indeed, I should not integrate over [itex]\text{d}x_{k}[/itex] but rather over the variables [itex](\theta_i, \theta_j)[/itex].**

But if I do this, to compute the expectation, it seems that I need the joint distribution [itex]f(x_k, \theta_i, \theta_j)[/itex].

So I guess we can rewrite this joint distribution as if [itex]\theta_i[/itex] and [itex]\theta_j[/itex] were independent, i.e.:

##f(x_k, \theta_i, \theta_j)= f_1(x_k, \theta_i)\, f_2(x_k, \theta_j)##

This way, I could have:

##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int\int \sum_{k} f_1(x_k,\theta_i)\, f_2(x_k,\theta_j)\bigg(\dfrac{1}{f(x_{k})^2} \dfrac{\partial f_1(x_{k})}{\partial \theta_{i}}\,\dfrac{\partial f_2(x_{k})}{\partial \theta_{j}}\bigg)\text{d}\theta_i\,\text{d}\theta_j##

instead of:

##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int\sum_{k} f(x_k)\bigg(\dfrac{1}{f(x_{k})^2} \dfrac{\partial f(x_{k})}{\partial \theta_{j}}\dfrac{\partial f(x_{k})}{\partial \theta_{i}}\bigg)\text{d}x_k##

So finally, I could obtain from equation (2):

##E\Big[\frac{-\partial^{2} \mathcal{L}}{\partial \theta \partial \theta^{\prime}}\Big]=\int \int \sum_{k} f_1(x_k,\theta_i)\, f_2(x_k,\theta_j)\bigg(\dfrac{\partial \log f_1(x_{k})}{\partial \theta_{i}}\,\dfrac{\partial \log f_2(x_{k})}{\partial \theta_{j}}\bigg)\text{d}\theta_i \,\text{d}\theta_j##

##=\int \int f_1(x_k,\theta_i)\, f_2(x_l,\theta_j) \bigg(\dfrac{\partial \sum_{k} \log f_1(x_{k})}{\partial \theta_{i}}\,\dfrac{\partial \sum_l \log f_2(x_{l})}{\partial \theta_{j}}\bigg)\text{d}\theta_i \,\text{d}\theta_j\quad(5)##

##= \int\int \sum_{k}\sum_{l}f(x_k)\bigg(\dfrac{\partial \log f_1(x_{k})}{\partial \theta_{i}}\bigg)\text{d}\theta_i \bigg(\dfrac{\partial \log f_2(x_{l})}{\partial \theta_{j}}\bigg)\text{d}\theta_j##

##=\int\int f_1(x_k, \theta_i)\, f_2(x_l, \theta_j) \bigg(\dfrac{\partial \log\big(\Pi_k f_1(x_{k})\big)}{\partial \theta_{i}}\bigg)\text{d}\theta_i \bigg(\dfrac{\partial \log\big(\Pi_l f_2(x_{l})\big)}{\partial \theta_{j}}\bigg)\text{d}\theta_j##

##=E\Big[\dfrac{\partial \mathcal{L}}{\partial \theta_i} \dfrac{\partial \mathcal{L}}{\partial \theta_j}\Big]##

**But I have difficulties with the step involving the two sums in equation [itex](5)[/itex]: [itex]\sum_k[/itex] and [itex]\sum_l[/itex], which I introduce above without justification. Moreover, are my calculations of the expectation correct (I mean, integrating over [itex]\theta_i[/itex] and [itex]\theta_j[/itex])?**

If someone could help me, this would be nice.

Regards