Derivative of Log Likelihood Function

NATURE.M · Nov 15, 2015

So looking through my notes I can't seem to understand how to get from one step to the next. I have attached a screenshot of the 2 lines I'm very confused about. Thanks.

BTW: The equations are for the log likelihood in a mixture of gaussians model

EDIT: To elaborate I am particularly confused about how they get numerator term π_{k} N(x_{n}|μ_{k}, Σ). I can't seem to understand how they are differentiating this to obtain that. I understand how they obtain the denominator term from differentiating the log but that's about all. To differentiate the multivariate gaussian I would think the log function needs to be used to break up the internal terms. Although I can't put this intuition together.

andrewkirk · Nov 15, 2015

I think it's because ##\Sigma_k## appears both inside and outside (as an inverse) the exponent in the cdf function ##\mathscr{N}##.
So
$$\frac{\partial}{\partial \Sigma_k}\mathscr{N}(\mu,\Sigma_k)=
\frac{\partial}{\partial \Sigma_k}\left[C\Sigma_k{}^{-1}\exp[f(\mu,\Sigma_k)]\right]$$
for known constant ##C## and function ##f##.
By the product rule, this is then equal to
$$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}\Sigma_k{}^{-1}+\Sigma_k{}^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)]\right]$$

There will be some messy algebra involved.

You might find it easier to first work through the univariate case, differentiating wrt ##\sigma## and seeing if you can obtain an analogous expression. If that works out, it shouldn't be too hard to extend it to the multivar case.

NATURE.M · Nov 15, 2015

andrewkirk said:

I think it's because ##\Sigma_k## appears both inside and outside (as an inverse) the exponent in the cdf function ##\mathscr{N}##.
So
$$\frac{\partial}{\partial \Sigma_k}\mathscr{N}(\mu,\Sigma_k)=
\frac{\partial}{\partial \Sigma_k}\left[C\Sigma_k{}^{-1}\exp[f(\mu,\Sigma_k)]\right]$$
for known constant ##C## and function ##f##.
By the product rule, this is then equal to
$$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}\Sigma_k{}^{-1}+\Sigma_k{}^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)]\right]$$

There will be some messy algebra involved.

You might find it easier to first work through the univariate case, differentiating wrt ##\sigma## and seeing if you can obtain an analogous expression. If that works out, it shouldn't be too hard to extend it to the multivar case.

I don't understand how you got $$C\Sigma_{k}^{-1}$$ In the multivariate gaussian we have $$\frac{1}{|\Sigma_{k}|}$$ How did you convert that determinant into an inverse ? Maybe you meant the same thing but forgot the determinant sign ?

andrewkirk · Nov 15, 2015

NATURE.M said:

I don't understand how you got $$C\Sigma_{k}^{-1}$$ In the multivariate gaussian we have $$\frac{1}{|\Sigma_{k}|}$$ How did you convert that determinant into an inverse ?

I didn't. What I wrote is only broadly indicative of the structure. I didn't look up the multivariate Gaussian formula. With your correction that line becomes:

$$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}|\Sigma_k|^{-1}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)]\right]$$
which is
$$C\exp[f(\mu,\Sigma_k)]\left[-|\Sigma_k|^{-2}\frac{\partial |\Sigma_k|}{\partial \Sigma_k}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)]\right]$$

I think if you work through the univariate case first it'll become much clearer.

NATURE.M · Nov 15, 2015

andrewkirk said:

I didn't. What I wrote is only broadly indicative of the structure. I didn't look up the multivariate Gaussian formula. With your correction that line becomes:

$$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}|\Sigma_k|^{-1}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)]\right]$$
which is
$$C\exp[f(\mu,\Sigma_k)]\left[-|\Sigma_k|^{-2}\frac{\partial |\Sigma_k|}{\partial \Sigma_k}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)]\right]$$

I think if you work through the univariate case first it'll become much clearer.

Okay so rewriting with exponents of -1/2 (for the gaussian) and repeating the operation we would have:
$$C\exp[f(\mu,\Sigma_k)]\left[-\frac{1}{2}|\Sigma_k|^{\frac{-3}{2}}\frac{\partial |\Sigma_k|}{\partial \Sigma_k}+|\Sigma_k|^{\frac{-1}{2}}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)]\right]$$
So the problem becomes the extra $$|\Sigma_k|^{-1}$$ that gets left over after we factor out $$|\Sigma_k|^{\frac{-1}{2}}$$ Any ideas ?

NATURE.M · Nov 16, 2015

So I think I resolved my troubles using a few properties outlined in the matrix cookbook.

Derivative of Log Likelihood Function

Attachments

Graduate Expected numbers of cards of a last color remaining

Undergrad The problem of points

Graduate Probability puzzle

Undergrad The countability paradox of computable numbers

Undergrad How does axiom of foundation prevent infinite sequence of elements?

Insights Revisiting the Velocity-Time Function

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Derivative of Log Likelihood Function

Attachments

Similar threads