Maximizing the Log-Likelihood Function

In summary, the homework asks you to solve the maximum log-likelihood estimation problem: find arg max log p(x1,...,xk; n, p1,...,pk) over all possible choices of (p1,...,pk) ∈ R^k such that \Sigma p_i = 1.
  • #1
shotputer
Lagrange multipliers --- finding the maximum

Homework Statement

The probability mass function is given by
p(x1,...,xk; n, p1,...,pk) := (n!/(x1!...xk!)) p1^x1 ... pk^xk

Here, n is a fixed strictly positive integer, x_i ∈ Z_+ for 1 ≤ i ≤ k, [tex]\Sigma[/tex] x_i = n, 0 < p_i < 1, and [tex]\Sigma[/tex] p_i = 1.

The maximum log-likelihood estimation problem is to find:

arg max log p(x1,...,xk; n, p1,...,pk)

over all possible choices of (p1,...,pk) ∈ R^k such that
[tex]\Sigma[/tex] p_i = 1.
(Hint: You have no control over x1,...,xk
or n and may regard them as given.)
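For concreteness, the objective being maximized can be evaluated numerically. A minimal Python sketch (the counts x and the candidate vector p below are illustrative test values, not part of the assignment; `math.lgamma` gives the log-factorials):

```python
import math

def log_pmf(x, p):
    """Log of the multinomial pmf: log(n!/(x1!...xk!)) + sum_i x_i*log(p_i)."""
    n = sum(x)
    log_coeff = math.lgamma(n + 1) - sum(math.lgamma(xi + 1) for xi in x)
    return log_coeff + sum(xi * math.log(pi) for xi, pi in zip(x, p))

x = [1, 4, 5]        # given counts, summing to n = 10
p = [0.2, 0.3, 0.5]  # one candidate probability vector, summing to 1
print(log_pmf(x, p))
```

The goal of the exercise is then to choose p, for fixed x and n, to make this number as large as possible.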

Homework Equations





The Attempt at a Solution


Well I know that I need to find first derivative of function p, and of function "g"- constraint, but I don't know where, or how to actually start...

Please help, thank you
 
  • #2


You want to use Stirling's approximation to replace the factorials with functions you can differentiate. Then just do Lagrange multipliers with the constraint that the probabilities sum to 1.
 
  • #3


ok, so this is what I have so far:

On the right-hand side, when I differentiate the constraint I get just [tex]\lambda[/tex] for every p.

But I still haven't figured out how to differentiate the function. So I have something like

log (n!/x1!...xk!) p1^x1 ... pk^xk, which is the same as
log (n!/x1!...xk!) + log p1^x1 + ... + log pk^xk

And now I am stuck again on that first log with the factorials! Do I need to differentiate that at all? Because the hint says: you have no control over x1,...,xk or n and may regard them as given...

thank you
 
  • #4


shotputer said:
ok, so this is what I have so far:

On the right-hand side, when I differentiate the constraint I get just [tex]\lambda[/tex] for every p.

But I still haven't figured out how to differentiate the function. So I have something like

log (n!/x1!...xk!) p1^x1 ... pk^xk, which is the same as
log (n!/x1!...xk!) + log p1^x1 + ... + log pk^xk

And now I am stuck again on that first log with the factorials! Do I need to differentiate that at all? Because the hint says: you have no control over x1,...,xk or n and may regard them as given...

thank you

Right. Apparently, you can take the first term as a constant. Forget the Stirling's formula thing; I was thinking of a different problem.
 
  • #5


So,

this is what I'm getting:
1/ p1^x1 = [tex]\lambda[/tex]
1/ p2^xk = [tex]\lambda[/tex]


So, p1^x1=p2^xk

I don't know, it doesn't seem right, huh?
 
  • #6


Sorry, a little mistake in my last reply:

1/( p1^x1 ln10) = [tex]\lambda[/tex]
1/ ( p2^xk ln10) = [tex]\lambda[/tex]

But it doesn't make any difference...
 
  • #7


shotputer said:
So,

this is what I'm getting:
1/ p1^x1 = [tex]\lambda[/tex]
1/ p2^xk = [tex]\lambda[/tex]


So, p1^x1=p2^xk

I don't know, it doesn't seem right, huh?

You are right. It's wrong. d/dp(log(p^x)) is NOT 1/p^x; it's 1/(p^x) * d(p^x)/dp. You need to use the chain rule. But it's actually easier to use the rules of logs first: log(p^x) = x*log(p), right?
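A quick finite-difference check of that corrected derivative, d/dp log(p^x) = x/p (natural log here; the test point p = 0.3, x = 5 is arbitrary):

```python
import math

p0, x0 = 0.3, 5.0  # arbitrary test point with 0 < p0 < 1

def f(p):
    return math.log(p ** x0)  # same as x0 * math.log(p)

h = 1e-6
numeric = (f(p0 + h) - f(p0 - h)) / (2 * h)  # central difference
analytic = x0 / p0                           # chain-rule result x/p
print(numeric, analytic)                     # the two should agree closely
```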
 
  • #8


Yep, absolutely,

So then:

x1/ (p1 ln10) =[tex]\lambda[/tex]
xk/ (pk ln10) =[tex]\lambda[/tex]

and x1/p1=xk/pk

so pk/p1=xk/x1

or pk= p1 (xk/x1)

that seems a little bit better?
 
  • #9


shotputer said:
Yep, absolutely,

So then:

x1/ (p1 ln10) =[tex]\lambda[/tex]
xk/ (pk ln10) =[tex]\lambda[/tex]

and x1/p1=xk/pk

so pk/p1=xk/x1

or pk= p1 (xk/x1)

that seems a little bit better?

LOTS better.
 
  • #10


Thanks a lot!

But, hm, what now :confused:, can I just leave it like this and say that the function is maximized when pk = p1 (xk/x1)?
 
  • #11


You have x_i=constant*p_i, right? Sum both sides over i to determine the constant.
 
  • #12


:confused:
sorry, but I don't understand what you are trying to say?
 
  • #13


x1=constant*p1, x2=constant*p2, etc. Add them all up. You get that the sum of the x_i's is the constant times sum of the p_i's. Sum of the x_i's is n. Sum of the p_i's is 1. Hence n=constant*1. So constant=n. So x_i=n*p_i. That's what I was trying to say in way too many words.
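That summing argument in a few lines of Python (the counts are made up for illustration):

```python
x = [2, 3, 5]  # illustrative counts; n = sum(x) = 10
n = sum(x)

# Stationarity gives x_i = c * p_i for every i. Summing over i:
#   sum(x) = c * sum(p)  =>  n = c * 1  =>  c = n,  so  p_i = x_i / n.
p = [xi / n for xi in x]
print(p)  # -> [0.2, 0.3, 0.5]
```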
 
  • #14


great, thanks thanks a LOT!
 
  • #15


I thought I got it, but hm, I didn't. I thought I was supposed to find the p_i's at which this function is maximized, since the problem says I don't have control over the x's or n?

So I understand what you said in your last reply, but is that my final answer then?
 
  • #16


Well, p_i = x_i/n, right? If you know n and the x_i's, then you know the p_i's that give an extremum. The solution is actually telling you something amazingly obvious. For example, if there are only x1 and x2, with x1/n = 1/3 and x2/n = 2/3, then the probability associated with x1 is likely 1/3 and the probability associated with x2 is likely 2/3. You probably would have guessed that, right?
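As a numerical cross-check that p_i = x_i/n really is the maximizer, one can compare the log-likelihood at that point against many random probability vectors (the counts and the random sampling below are illustrative, not part of the assignment):

```python
import math, random

def log_likelihood(x, p):
    """Multinomial log-likelihood: log(n!/(x1!...xk!)) + sum_i x_i*log(p_i)."""
    n = sum(x)
    const = math.lgamma(n + 1) - sum(math.lgamma(xi + 1) for xi in x)
    return const + sum(xi * math.log(pi) for xi, pi in zip(x, p))

x = [2, 3, 5]
n = sum(x)
p_hat = [xi / n for xi in x]  # candidate maximizer p_i = x_i / n
best = log_likelihood(x, p_hat)

random.seed(0)
for _ in range(1000):
    w = [random.random() for _ in x]
    s = sum(w)
    p = [wi / s for wi in w]  # a random point on the probability simplex
    # no random candidate should beat p_hat
    assert log_likelihood(x, p) <= best + 1e-12

print("p_hat =", p_hat)
```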
 
  • #17


ugh, yeah, that was obvious! :) well, thanks again! it helped me a lot! i can already see that i'll have more questions for this class; well, let's wait for the next hw...
 

FAQ: Maximizing the Log-Likelihood Function

What is the Lagrange multiplier method?

The Lagrange multiplier method is a mathematical technique used to find the maximum or minimum value of a function subject to a set of constraints. It involves using a parameter, known as the Lagrange multiplier, to incorporate the constraints into the objective function.

Why is the Lagrange multiplier method used to find maximum values?

The Lagrange multiplier method is used to find maximum values because it takes into account any constraints that may affect the maximum value of a function. This allows for a more accurate and precise calculation of the maximum value.

What is the formula for the Lagrange multiplier method?

The formula for the Lagrange multiplier method is: ∇f(x,y) = λ∇g(x,y), where ∇f(x,y) represents the gradient of the objective function, ∇g(x,y) represents the gradient of the constraint function, and λ is the Lagrange multiplier.

How do you find the Lagrange multiplier?

The Lagrange multiplier is found by setting the gradient of the objective function equal to λ times the gradient of the constraint function, ∇f = λ∇g, and solving this system together with the constraint itself for the variables and λ. The resulting stationary points are the candidates for the maximum of the function.

What are the limitations of the Lagrange multiplier method?

The Lagrange multiplier method may not work for all types of constraints, such as non-differentiable ones. It also identifies stationary points rather than guaranteeing a maximum, so there can be multiple candidate solutions that must be checked individually. Additionally, the method can become computationally complex for higher-dimensional problems.
