Maximizing the Log-Likelihood Function

shotputer
Lagrange multipliers: finding the max

Homework Statement

The probability mass function is given by

p(x_1, ..., x_k; n, p_1, ..., p_k) := (n! / (x_1! ... x_k!)) p_1^{x_1} p_2^{x_2} ... p_k^{x_k}

Here, n is a fixed strictly positive integer, x_i \in Z_+ for 1 \le i \le k, \Sigma_i x_i = n, 0 < p_i < 1, and \Sigma_i p_i = 1.

The maximum log-likelihood estimation problem is to find:

arg max log p(x_1, ..., x_k; n, p_1, ..., p_k)

over all possible choices of (p_1, ..., p_k) \in R^k such that \Sigma_i p_i = 1.

(Hint: You have no control over x_1, ..., x_k or n and may regard them as given.)

Homework Equations





The Attempt at a Solution


Well, I know that I need to find the first derivative of the function p and of the constraint function g, but I don't know where or how to actually start...

Please help, thank you
 


You want to use Stirling's approximation to replace the factorials with functions you can differentiate. Yeah, then just do Lagrange multipliers with the constraint that the sum of the probabilities is 1.
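
As a minimal sketch of that setup (in the thread's notation, writing g(p_1, ..., p_k) = \Sigma_i p_i - 1 for the constraint the way the first post does), the Lagrangian would be

L(p_1, ..., p_k, \lambda) = log p(x_1, ..., x_k; n, p_1, ..., p_k) - \lambda (\Sigma_i p_i - 1)

and the candidate maximizers come from setting \partial L / \partial p_i = 0 for every i, together with the constraint \Sigma_i p_i = 1.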
 


ok, so this is what I have so far:

on the right-hand side, when I differentiate the constraint I just get \lambda for every p.

But I still haven't figured out how to differentiate the function. So I have something like

log ((n! / (x_1! ... x_k!)) p_1^{x_1} ... p_k^{x_k}), which is the same as
log (n! / (x_1! ... x_k!)) + log p_1^{x_1} + ... + log p_k^{x_k}

And now I am stuck again with that first log with the factorials! Do I need to differentiate that at all? Because the hint says: You have no control over x_1, ..., x_k or n and may regard them as given...

thank you
 


shotputer said:
ok, so this is what I have so far:

on the right-hand side, when I differentiate the constraint I just get \lambda for every p.

But I still haven't figured out how to differentiate the function. So I have something like

log ((n! / (x_1! ... x_k!)) p_1^{x_1} ... p_k^{x_k}), which is the same as
log (n! / (x_1! ... x_k!)) + log p_1^{x_1} + ... + log p_k^{x_k}

And now I am stuck again with that first log with the factorials! Do I need to differentiate that at all? Because the hint says: You have no control over x_1, ..., x_k or n and may regard them as given...

thank you

Right. Apparently, you can take the first term as a constant. Forget the Stirling's formula thing, I was thinking of a different problem.
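
Spelled out (a sketch in the thread's notation, writing C for the constant term as a shorthand that is not in the original problem):

log p(x_1, ..., x_k; n, p_1, ..., p_k) = C + log p_1^{x_1} + ... + log p_k^{x_k}, where C = log (n! / (x_1! ... x_k!))

Since C does not involve any p_i, its derivative with respect to each p_i is zero, so it drops out of the stationarity conditions.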
 


So,

this is what I'm getting:
1 / p_1^{x_1} = \lambda
1 / p_k^{x_k} = \lambda


So, p_1^{x_1} = p_k^{x_k}

I don't know, it doesn't seem right, huh?
 


Sorry, a little mistake in the last reply:

1 / (p_1^{x_1} ln 10) = \lambda
1 / (p_k^{x_k} ln 10) = \lambda

But it doesn't make any difference...
 


shotputer said:
So,

this is what I'm getting:
1 / p_1^{x_1} = \lambda
1 / p_k^{x_k} = \lambda


So, p_1^{x_1} = p_k^{x_k}

I don't know, it doesn't seem right, huh?

You are right. It's wrong. d/dp(log(p^x)) is NOT 1/p^x. It's 1/(p^x)*d(p^x)/dp. You need to use the chain rule. But it's actually easier to use rules of logs first. log(p^x)=x*log(p), right?
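
Worked out both ways (a sketch, keeping the base-10 log the earlier posts use; with natural log the 1/ln 10 factor simply disappears, and it cancels from the final answer anyway):

Chain rule: d/dp_i [log(p_i^{x_i})] = 1/(p_i^{x_i} ln 10) * d(p_i^{x_i})/dp_i = 1/(p_i^{x_i} ln 10) * x_i p_i^{x_i - 1} = x_i / (p_i ln 10)

Log rule first: log(p_i^{x_i}) = x_i log(p_i), so d/dp_i [x_i log(p_i)] = x_i / (p_i ln 10)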
 


Yep, absolutely,

So then:

x_1 / (p_1 ln 10) = \lambda
x_k / (p_k ln 10) = \lambda

and x_1/p_1 = x_k/p_k

so p_k/p_1 = x_k/x_1

or p_k = p_1 (x_k/x_1)

that seems a little bit better?
 


shotputer said:
Yep, absolutely,

So then:

x_1 / (p_1 ln 10) = \lambda
x_k / (p_k ln 10) = \lambda

and x_1/p_1 = x_k/p_k

so p_k/p_1 = x_k/x_1

or p_k = p_1 (x_k/x_1)

that seems a little bit better?

LOTS better.
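
(And the same condition holds for every index, not just 1 and k: a quick sketch gives x_i / (p_i ln 10) = \lambda for all i, i.e. every x_i equals the same constant times p_i, which is where the next step picks up.)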
 


Thanks a lot!

But, hm, what now :confused:, can I just leave it like this, and say that when p_k = p_1 (x_k/x_1) this function is maximized?
 


You have x_i=constant*p_i, right? Sum both sides over i to determine the constant.
 


:confused:
Sorry, but I don't understand. What are you trying to say?
 


x1=constant*p1, x2=constant*p2, etc. Add them all up. You get that the sum of the x_i's is the constant times sum of the p_i's. Sum of the x_i's is n. Sum of the p_i's is 1. Hence n=constant*1. So constant=n. So x_i=n*p_i. That's what I was trying to say in way too many words.
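
Compactly (a sketch in the thread's notation, where the "constant" is \lambda ln 10 from the base-10 logs above, though its exact value never matters):

x_i = c p_i for all i  =>  n = \Sigma_i x_i = c \Sigma_i p_i = c * 1  =>  c = n  =>  p_i = x_i / n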
 


great, thanks, thanks a LOT!
 


I thought that I got it, but hm, I didn't. Well, I thought that I am supposed to find the p_i's at which this function is maximized, because the problem says that I don't have control over the x's or n?

So I understand what you said in your last reply, but is that my final answer then?
 


Well, p_i = x_i/n, right? If you know n and the x_i's then you know the p_i's that make for an extremum. The solution is actually telling you something amazingly obvious. For example, if there are only x_1 and x_2, and x_1/n = 1/3 and x_2/n = 2/3, then the probability associated with x_1 is likely 1/3 and the probability associated with x_2 is likely 2/3. You probably would have guessed that, right?
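
If you want to convince yourself numerically, here is a minimal check (not part of the original problem; the counts x = [3, 6, 1] are made up, and natural log is used since the base does not change where the maximum sits). It compares the log-likelihood at p_i = x_i/n against random points on the probability simplex.

```python
import math
import random

def log_likelihood(x, p):
    """Multinomial log-likelihood: log(n!/(x_1!...x_k!)) + sum_i x_i*log(p_i)."""
    n = sum(x)
    const = math.lgamma(n + 1) - sum(math.lgamma(xi + 1) for xi in x)
    return const + sum(xi * math.log(pi) for xi, pi in zip(x, p))

x = [3, 6, 1]                      # hypothetical observed counts, so n = 10
p_hat = [xi / sum(x) for xi in x]  # candidate maximizer p_i = x_i / n
best = log_likelihood(x, p_hat)

# No random probability vector on the simplex should beat p_hat.
random.seed(0)
for _ in range(10000):
    w = [random.random() for _ in x]
    p = [wi / sum(w) for wi in w]
    assert log_likelihood(x, p) <= best + 1e-12

print("p_hat =", p_hat, "log-likelihood at p_hat =", best)
```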
 


ugh, yeah that was obvious! :) well thanks again! it helped me a lot! i can already see that i'll have more questions for this class, well let's wait for the next hw...
 
