Derivative of Log Likelihood Function

1. Nov 15, 2015

NATURE.M

Looking through my notes, I can't understand how to get from one step to the next. I have attached a screenshot of the 2 lines I'm confused about. Thanks.

BTW: The equations are for the log likelihood in a mixture of gaussians model

EDIT: To elaborate, I am particularly confused about how they get the numerator term $\pi_k \mathscr{N}(x_n|\mu_k,\Sigma)$. I can't see how differentiating produces it. I understand how the denominator term comes from differentiating the log, but that's about all. To differentiate the multivariate gaussian I would think the log function needs to be used to break up the internal terms, although I can't put this intuition together.

Attached Files:

• Screen Shot 2015-11-15 at 5.37.12 PM.png
Last edited: Nov 15, 2015
2. Nov 15, 2015

andrewkirk

I think it's because $\Sigma_k$ appears both inside and outside (as an inverse) the exponent in the density function $\mathscr{N}$.
So
$$\frac{\partial}{\partial \Sigma_k}\mathscr{N}(\mu,\Sigma_k)= \frac{\partial}{\partial \Sigma_k}\left[C\Sigma_k{}^{-1}\exp[f(\mu,\Sigma_k)]\right]$$
for known constant $C$ and function $f$.
By the product rule, this is then equal to
$$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}\Sigma_k{}^{-1}+\Sigma_k{}^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$

There will be some messy algebra involved.

You might find it easier to first work through the univariate case, differentiating wrt $\sigma$ and seeing if you can obtain an analogous expression. If that works out, it shouldn't be too hard to extend it to the multivar case.
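As a rough sketch of that univariate exercise, assuming the standard density $\mathscr{N}(x|\mu,\sigma^2)=(2\pi)^{-1/2}\sigma^{-1}\exp[f(\mu,\sigma)]$ with $f(\mu,\sigma)=-(x-\mu)^2/(2\sigma^2)$ (this form is not written out in the thread), the product rule gives

$$\frac{\partial}{\partial \sigma}\mathscr{N}(x|\mu,\sigma^2)=(2\pi)^{-1/2}\exp[f(\mu,\sigma)]\left[-\sigma^{-2}+\sigma^{-1}\frac{(x-\mu)^2}{\sigma^3}\right]=\mathscr{N}(x|\mu,\sigma^2)\left[-\frac{1}{\sigma}+\frac{(x-\mu)^2}{\sigma^3}\right]$$

so the density itself comes back out as a common factor, multiplied by the derivative of its own log.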

3. Nov 15, 2015

NATURE.M

I don't understand how you got $C\Sigma_{k}^{-1}$. In the multivariate gaussian we have $\frac{1}{|\Sigma_{k}|}$. How did you convert that determinant into an inverse? Maybe you meant the same thing but forgot the determinant sign?

Last edited: Nov 15, 2015
4. Nov 15, 2015

andrewkirk

I didn't. What I wrote is only broadly indicative of the structure. I didn't look up the multivariate Gaussian formula. With your correction that line becomes:

$$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}|\Sigma_k|^{-1}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$
which is
$$C\exp[f(\mu,\Sigma_k)]\left[-|\Sigma_k|^{-2}\frac{\partial |\Sigma_k|}{\partial \Sigma_k}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$

I think if you work through the univariate case first it'll become much clearer.

5. Nov 15, 2015

NATURE.M

Okay so rewriting with exponents of -1/2 (for the gaussian) and repeating the operation we would have:
$$C\exp[f(\mu,\Sigma_k)]\left[-\frac{1}{2}|\Sigma_k|^{-\frac{3}{2}}\frac{\partial |\Sigma_k|}{\partial \Sigma_k}+|\Sigma_k|^{-\frac{1}{2}}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$
So the problem becomes the extra $|\Sigma_k|^{-1}$ that gets left over after we factor out $|\Sigma_k|^{-\frac{1}{2}}$. Any ideas?
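One way to probe the leftover factor is numerically. The sketch below assumes Jacobi's formula, $\frac{\partial |\Sigma|}{\partial \Sigma}=|\Sigma|\,\Sigma^{-T}$, which would make the first term collapse to $-\frac{1}{2}|\Sigma|^{-\frac{1}{2}}\Sigma^{-T}$, and compares that against central finite differences. The 2x2 test matrix and the helper names (`det2`, `inv2`, `numeric_grad`, `analytic_grad`) are made up for illustration, and all four entries are treated as independent variables.

```python
def det2(S):
    """Determinant of a 2x2 matrix given as nested lists."""
    return S[0][0] * S[1][1] - S[0][1] * S[1][0]


def inv2(S):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    d = det2(S)
    return [[S[1][1] / d, -S[0][1] / d],
            [-S[1][0] / d, S[0][0] / d]]


def g(S):
    """The normalising factor |S|^(-1/2) from the Gaussian density."""
    return det2(S) ** -0.5


def numeric_grad(func, S, h=1e-6):
    """Central finite differences of func with respect to each entry of S."""
    grad = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            up = [row[:] for row in S]
            dn = [row[:] for row in S]
            up[i][j] += h
            dn[i][j] -= h
            grad[i][j] = (func(up) - func(dn)) / (2 * h)
    return grad


def analytic_grad(S):
    """Candidate closed form: -1/2 * |S|^(-1/2) * transpose(inverse(S))."""
    Sinv = inv2(S)
    c = -0.5 * g(S)
    return [[c * Sinv[j][i] for j in range(2)] for i in range(2)]


# Hypothetical symmetric positive-definite covariance, chosen arbitrarily.
Sigma = [[2.0, 0.3], [0.3, 1.0]]
num = numeric_grad(g, Sigma)
ana = analytic_grad(Sigma)
max_err = max(abs(num[i][j] - ana[i][j]) for i in range(2) for j in range(2))
print("largest entrywise discrepancy:", max_err)
```

If the discrepancy is tiny, the extra $|\Sigma_k|^{-1}$ is being cancelled by the $|\Sigma_k|$ that comes out of differentiating the determinant.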

6. Nov 16, 2015

NATURE.M

So I think I resolved my troubles using a few properties outlined in the Matrix Cookbook.
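For anyone reading later, here is a sketch of how I believe it goes (my own reconstruction, so treat it with caution). It treats the entries of $\Sigma_k$ as independent, uses $\Sigma_k^{-T}=\Sigma_k^{-1}$ by symmetry, and takes $f(\mu_k,\Sigma_k)=-\frac{1}{2}(x_n-\mu_k)^T\Sigma_k^{-1}(x_n-\mu_k)$. The two properties are

$$\frac{\partial |\Sigma_k|}{\partial \Sigma_k}=|\Sigma_k|\,\Sigma_k^{-T},\qquad \frac{\partial}{\partial \Sigma_k}\,a^T\Sigma_k^{-1}a=-\Sigma_k^{-T}a\,a^T\Sigma_k^{-T}$$

The first one supplies a factor $|\Sigma_k|$ that cancels the leftover $|\Sigma_k|^{-1}$, so

$$\frac{\partial}{\partial \Sigma_k}\mathscr{N}(x_n|\mu_k,\Sigma_k)=\mathscr{N}(x_n|\mu_k,\Sigma_k)\left[-\frac{1}{2}\Sigma_k^{-1}+\frac{1}{2}\Sigma_k^{-1}(x_n-\mu_k)(x_n-\mu_k)^T\Sigma_k^{-1}\right]$$

The gaussian reappears as a factor, which is where the numerator comes from. For a single data point the chain rule through the log then gives

$$\frac{\partial}{\partial \Sigma_k}\ln\sum_j \pi_j \mathscr{N}(x_n|\mu_j,\Sigma_j)=\frac{\pi_k \mathscr{N}(x_n|\mu_k,\Sigma_k)}{\sum_j \pi_j \mathscr{N}(x_n|\mu_j,\Sigma_j)}\left[-\frac{1}{2}\Sigma_k^{-1}+\frac{1}{2}\Sigma_k^{-1}(x_n-\mu_k)(x_n-\mu_k)^T\Sigma_k^{-1}\right]$$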