Maximizing the Log-Likelihood Function

In summary, the homework asks you to solve the maximum log-likelihood estimation problem: find arg max log p(x1,...,xk; n, p1,...,pk) over all possible choices of (p1,...,pk) ∈ R^k such that \Sigma p_i = 1.
  • #1
shotputer
Lagrange multipliers --- finding the maximum

Homework Statement

The probability mass function is given by
p(x1,...,xk; n, p1,...,pk) := (n!/(x1!...xk!)) p1^x1 ... pk^xk

Here, n is a fixed strictly positive integer, x_i ∈ Z_+ for 1 ≤ i ≤ k, [tex]\Sigma[/tex] x_i = n, 0 < p_i < 1, and [tex]\Sigma[/tex] p_i = 1.

The maximum log-likelihood estimation problem is to find:

arg max log p(x1,...,xk; n, p1,...,pk)

over all possible choices of (p1,...,pk) ∈ R^k such that
[tex]\Sigma[/tex] p_i = 1.
(Hint: You have no control over x1,...,xk
or n and may regard them as given.)
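For concreteness, the objective being maximized can be evaluated numerically. A minimal Python sketch (the counts x and the candidate vector p below are illustrative test values, not part of the assignment; `math.lgamma` gives the log-factorials):

```python
import math

def log_pmf(x, p):
    """Log of the multinomial pmf: log(n!/(x1!...xk!)) + sum_i x_i*log(p_i)."""
    n = sum(x)
    log_coeff = math.lgamma(n + 1) - sum(math.lgamma(xi + 1) for xi in x)
    return log_coeff + sum(xi * math.log(pi) for xi, pi in zip(x, p))

x = [1, 4, 5]        # given counts, summing to n = 10
p = [0.2, 0.3, 0.5]  # one candidate probability vector, summing to 1
print(log_pmf(x, p))
```

The goal of the exercise is then to choose p, for fixed x and n, to make this number as large as possible.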

Homework Equations





The Attempt at a Solution


Well I know that I need to find first derivative of function p, and of function "g"- constraint, but I don't know where, or how to actually start...

Please help, thank you
 
  • #2


You want to use Stirling's approximation to replace the factorials with functions you can differentiate. Then just do Lagrange multipliers with the constraint that the probabilities sum to 1.
 
  • #3


ok, so this is what I have so far:

On the right-hand side, when I differentiate the constraint I get just [tex]\lambda[/tex] for every p.

But I still haven't figured out how to differentiate the function. So I have something like

log (n!/x1!...xk!) p1^x1 ... pk^xk, which is the same as
log (n!/x1!...xk!) + log p1^x1 + ... + log pk^xk

And now I am stuck again on that first log with the factorials! Do I need to differentiate that at all? Because the hint says: you have no control over x1,...,xk or n and may regard them as given...

thank you
 
  • #4


shotputer said:
ok, so this is what I have so far:

On the right-hand side, when I differentiate the constraint I get just [tex]\lambda[/tex] for every p.

But I still haven't figured out how to differentiate the function. So I have something like

log (n!/x1!...xk!) p1^x1 ... pk^xk, which is the same as
log (n!/x1!...xk!) + log p1^x1 + ... + log pk^xk

And now I am stuck again on that first log with the factorials! Do I need to differentiate that at all? Because the hint says: you have no control over x1,...,xk or n and may regard them as given...

thank you

Right. Apparently, you can take the first term as a constant. Forget the Stirling's formula thing; I was thinking of a different problem.
 
  • #5


So,

this is what I'm getting:
1/ p1^x1 = [tex]\lambda[/tex]
1/ p2^xk = [tex]\lambda[/tex]


So, p1^x1=p2^xk

I don't know, it doesn't seem right, huh?
 
  • #6


Sorry, a little mistake in my last reply:

1/( p1^x1 ln10) = [tex]\lambda[/tex]
1/ ( p2^xk ln10) = [tex]\lambda[/tex]

But it doesn't make any difference...
 
  • #7


shotputer said:
So,

this is what I'm getting:
1/ p1^x1 = [tex]\lambda[/tex]
1/ p2^xk = [tex]\lambda[/tex]


So, p1^x1=p2^xk

I don't know, it doesn't seem right, huh?

You are right. It's wrong. d/dp(log(p^x)) is NOT 1/p^x; it's 1/(p^x) * d(p^x)/dp. You need to use the chain rule. But it's actually easier to use the rules of logs first: log(p^x) = x*log(p), right?
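A quick finite-difference check of that corrected derivative, d/dp log(p^x) = x/p (natural log here; the test point p = 0.3, x = 5 is arbitrary):

```python
import math

p0, x0 = 0.3, 5.0  # arbitrary test point with 0 < p0 < 1

def f(p):
    return math.log(p ** x0)  # same as x0 * math.log(p)

h = 1e-6
numeric = (f(p0 + h) - f(p0 - h)) / (2 * h)  # central difference
analytic = x0 / p0                           # chain-rule result x/p
print(numeric, analytic)                     # the two should agree closely
```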
 
  • #8


Yep, absolutely,

So then:

x1/ (p1 ln10) =[tex]\lambda[/tex]
xk/ (pk ln10) =[tex]\lambda[/tex]

and x1/p1=xk/pk

so pk/p1=xk/x1

or pk= p1 (xk/x1)

that seems a little bit better?
 
  • #9


shotputer said:
Yep, absolutely,

So then:

x1/ (p1 ln10) =[tex]\lambda[/tex]
xk/ (pk ln10) =[tex]\lambda[/tex]

and x1/p1=xk/pk

so pk/p1=xk/x1

or pk= p1 (xk/x1)

that seems a little bit better?

LOTS better.
 
  • #10


Thanks a lot!

But, hm, what now :confused:, can I just leave it like this and say that the function is maximized when pk = p1 (xk/x1)?
 
  • #11


You have x_i=constant*p_i, right? Sum both sides over i to determine the constant.
 
  • #12


:confused:
sorry, but I don't understand what you are trying to say?
 
  • #13


x1=constant*p1, x2=constant*p2, etc. Add them all up. You get that the sum of the x_i's is the constant times sum of the p_i's. Sum of the x_i's is n. Sum of the p_i's is 1. Hence n=constant*1. So constant=n. So x_i=n*p_i. That's what I was trying to say in way too many words.
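That summing argument in a few lines of Python (the counts are made up for illustration):

```python
x = [2, 3, 5]  # illustrative counts; n = sum(x) = 10
n = sum(x)

# Stationarity gives x_i = c * p_i for every i. Summing over i:
#   sum(x) = c * sum(p)  =>  n = c * 1  =>  c = n,  so  p_i = x_i / n.
p = [xi / n for xi in x]
print(p)  # -> [0.2, 0.3, 0.5]
```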
 
  • #14


great, thanks thanks a LOT!
 
  • #15


I thought I got it, but hm, I didn't. I thought I was supposed to find the p_i's at which this function is maximized, since the problem says I don't have control over the x's or n?

So I understand what you said in your last reply, but is that my final answer then?
 
  • #16


Well, p_i = x_i/n, right? If you know n and the x_i's, then you know the p_i's that give an extremum. The solution is actually telling you something amazingly obvious. For example, if there are only x1 and x2, with x1/n = 1/3 and x2/n = 2/3, then the probability associated with x1 is likely 1/3 and the probability associated with x2 is likely 2/3. You probably would have guessed that, right?
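As a numerical cross-check that p_i = x_i/n really is the maximizer, one can compare the log-likelihood at that point against many random probability vectors (the counts and the random sampling below are illustrative, not part of the assignment):

```python
import math, random

def log_likelihood(x, p):
    """Multinomial log-likelihood: log(n!/(x1!...xk!)) + sum_i x_i*log(p_i)."""
    n = sum(x)
    const = math.lgamma(n + 1) - sum(math.lgamma(xi + 1) for xi in x)
    return const + sum(xi * math.log(pi) for xi, pi in zip(x, p))

x = [2, 3, 5]
n = sum(x)
p_hat = [xi / n for xi in x]  # candidate maximizer p_i = x_i / n
best = log_likelihood(x, p_hat)

random.seed(0)
for _ in range(1000):
    w = [random.random() for _ in x]
    s = sum(w)
    p = [wi / s for wi in w]  # a random point on the probability simplex
    # no random candidate should beat p_hat
    assert log_likelihood(x, p) <= best + 1e-12

print("p_hat =", p_hat)
```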
 
  • #17


ugh, yeah, that was obvious! :) well, thanks again! it helped me a lot! i can already see that i'll have more questions for this class; well, let's wait for the next hw...
 

FAQ: Maximizing the Log-Likelihood Function

What is the Lagrange multiplier method?

The Lagrange multiplier method is a mathematical technique used to find the maximum or minimum value of a function subject to a set of constraints. It involves using a parameter, known as the Lagrange multiplier, to incorporate the constraints into the objective function.

Why is the Lagrange multiplier method used to find maximum values?

The Lagrange multiplier method is used to find maximum values because it takes into account any constraints that may affect the maximum value of a function. This allows for a more accurate and precise calculation of the maximum value.

What is the formula for the Lagrange multiplier method?

The formula for the Lagrange multiplier method is: ∇f(x,y) = λ∇g(x,y), where ∇f(x,y) represents the gradient of the objective function, ∇g(x,y) represents the gradient of the constraint function, and λ is the Lagrange multiplier.

How do you find the Lagrange multiplier?

The Lagrange multiplier is found by setting the gradient of the objective function equal to λ times the gradient of the constraint function, ∇f = λ∇g, and solving this system together with the constraint itself for the variables and λ. The resulting stationary points are the candidates for the maximum of the function.

What are the limitations of the Lagrange multiplier method?

The Lagrange multiplier method may not work for all types of constraints, such as non-differentiable ones. It also identifies stationary points rather than guaranteeing a maximum, so there can be multiple candidate solutions that must be checked individually. Additionally, the method can become computationally complex for higher-dimensional problems.
