Derivative of Log Likelihood Function

  1. Nov 15, 2015 #1
    So looking through my notes, I can't seem to understand how to get from one step to the next. I have attached a screenshot of the two lines I'm very confused about. Thanks.

    BTW: The equations are for the log likelihood in a mixture of Gaussians model.

    EDIT: To elaborate, I am particularly confused about how they get the numerator term π_{k} N(x_{n}|μ_{k}, Σ). I can't seem to understand how they are differentiating this to obtain that. I understand how they obtain the denominator term from differentiating the log, but that's about all. To differentiate the multivariate Gaussian, I would think the log function needs to be used to break up the internal terms, although I can't put this intuition together.

    Last edited: Nov 15, 2015
  2. jcsd
  3. Nov 15, 2015 #2



    I think it's because ##\Sigma_k## appears both inside the exponent and outside it (as an inverse) in the density function ##\mathscr{N}##.
    $$\frac{\partial}{\partial \Sigma_k}\mathscr{N}(\mu,\Sigma_k)=
    \frac{\partial}{\partial \Sigma_k}\left[C\Sigma_k{}^{-1}\exp[f(\mu,\Sigma_k)]\right]$$
    for known constant ##C## and function ##f##.
    By the product rule, this is then equal to
    $$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}\Sigma_k{}^{-1}+\Sigma_k{}^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$

    There will be some messy algebra involved.

    You might find it easier to first work through the univariate case, differentiating with respect to ##\sigma## and seeing if you can obtain an analogous expression. If that works out, it shouldn't be too hard to extend it to the multivariate case.
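
    A rough sketch of that univariate case, assuming the density is written as ##\mathscr{N}(x|\mu,\sigma)## with standard deviation ##\sigma## (this notation is an assumption, not taken from the attached notes). The product rule gives
    $$\frac{\partial}{\partial \sigma}\left[\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)\right]
    =\mathscr{N}(x|\mu,\sigma)\left[-\frac{1}{\sigma}+\frac{(x-\mu)^2}{\sigma^3}\right]$$
    The density comes back out as a common factor, which is the same as writing ##\partial\mathscr{N}/\partial\sigma=\mathscr{N}\,\partial\ln\mathscr{N}/\partial\sigma##. That is presumably why a term of the form ##\pi_k\mathscr{N}(x_n|\mu_k,\Sigma_k)## ends up in the numerator of the mixture derivative.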
  4. Nov 15, 2015 #3
    I don't understand how you got ##C\Sigma_{k}^{-1}##. In the multivariate Gaussian we have ##\frac{1}{|\Sigma_{k}|}##. How did you convert that determinant into an inverse? Maybe you meant the same thing but forgot the determinant sign?
    Last edited: Nov 15, 2015
  5. Nov 15, 2015 #4



    I didn't. What I wrote is only broadly indicative of the structure. I didn't look up the multivariate Gaussian formula. With your correction that line becomes:

    $$C\exp[f(\mu,\Sigma_k)]\left[\frac{\partial}{\partial \Sigma_k}|\Sigma_k|^{-1}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$
    which is
    $$C\exp[f(\mu,\Sigma_k)]\left[-|\Sigma_k|^{-2}\frac{\partial |\Sigma_k|}{\partial \Sigma_k}+|\Sigma_k|^{-1}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$

    I think if you work through the univariate case first it'll become much clearer.
  6. Nov 15, 2015 #5
    Okay, so rewriting with an exponent of ##-\frac{1}{2}## on the determinant (as in the Gaussian) and repeating the operation, we would have:
    $$C\exp[f(\mu,\Sigma_k)]\left[-\frac{1}{2}|\Sigma_k|^{-\frac{3}{2}}\frac{\partial |\Sigma_k|}{\partial \Sigma_k}+|\Sigma_k|^{-\frac{1}{2}}\frac{\partial}{\partial \Sigma_k}f(\mu,\Sigma_k)\right]$$
    So the problem becomes the extra ##|\Sigma_k|^{-1}## that gets left over after we factor out ##|\Sigma_k|^{-\frac{1}{2}}##. Any ideas?
  7. Nov 16, 2015 #6
    So I think I resolved my troubles using a few properties outlined in The Matrix Cookbook.
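
    For anyone following along, here is a sketch of how this probably goes. The post does not say which properties were used, so the two identities below are an assumption. They treat the entries of ##\Sigma_k## as independent, use the symmetry ##\Sigma_k^{-T}=\Sigma_k^{-1}##, and hold for any fixed vector ##a##:
    $$\frac{\partial |\Sigma_k|}{\partial \Sigma_k}=|\Sigma_k|\,\Sigma_k^{-1},\qquad
    \frac{\partial}{\partial \Sigma_k}\left(a^T\Sigma_k^{-1}a\right)=-\Sigma_k^{-1}aa^T\Sigma_k^{-1}$$
    The first identity supplies a factor ##|\Sigma_k|## that cancels the leftover ##|\Sigma_k|^{-1}## from post #5. Taking ##f=-\frac{1}{2}(x_n-\mu_k)^T\Sigma_k^{-1}(x_n-\mu_k)## and applying the second identity then gives
    $$\frac{\partial}{\partial \Sigma_k}\mathscr{N}(x_n|\mu_k,\Sigma_k)=\mathscr{N}(x_n|\mu_k,\Sigma_k)\left[-\frac{1}{2}\Sigma_k^{-1}+\frac{1}{2}\Sigma_k^{-1}(x_n-\mu_k)(x_n-\mu_k)^T\Sigma_k^{-1}\right]$$
    Differentiating the log of the mixture with the chain rule puts the mixture sum in the denominator and this derivative, weighted by ##\pi_k##, in the numerator:
    $$\frac{\partial}{\partial \Sigma_k}\sum_n\ln\sum_j\pi_j\mathscr{N}(x_n|\mu_j,\Sigma_j)=\sum_n\frac{\pi_k\mathscr{N}(x_n|\mu_k,\Sigma_k)}{\sum_j\pi_j\mathscr{N}(x_n|\mu_j,\Sigma_j)}\left[-\frac{1}{2}\Sigma_k^{-1}+\frac{1}{2}\Sigma_k^{-1}(x_n-\mu_k)(x_n-\mu_k)^T\Sigma_k^{-1}\right]$$
    Whether this matches the attached screenshot line for line cannot be checked here, since the attachment is not shown.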