Maximum likelihood estimator (MLE)

AI Thread Summary
The discussion focuses on understanding the calculations involved in the Maximum Likelihood Estimator (MLE) and the likelihood function. The likelihood function is derived from the product of the probability density functions evaluated at multiple data points. Participants express confusion over the transition from individual likelihoods to the overall likelihood, particularly regarding the introduction of indices and the simplification of terms. Clarifications are provided on how to handle products and sums in the context of likelihood calculations, including the transformation to the logarithmic form of the likelihood. The conversation highlights the importance of recognizing patterns in mathematical expressions to grasp the underlying concepts of MLE.
semidevil
so by definition, the likelihood function L(\theta) is the product of the pdf f_W(w;\theta) evaluated at the n data points: L(\theta) = \prod_{i=1}^n f_W(w_i;\theta).

but I don't know how they do those calculations...

so for example:

f_Y(y;\theta) = \frac{1}{\theta^2}\, y\, e^{-y/\theta}

L(\theta) = \theta^{-2n}\left(\prod_{i=1}^n y_i\right) e^{-\frac{1}{\theta}\sum_{i=1}^n y_i}

so first of all, I'm looking at this but I don't know how they went from this to that... I looked at another problem:

how did they go from e^{-(y-\theta)} to \prod_i e^{-(y_i-\theta)}? I don't see a pattern... I compared it with the definition, but I just don't get it... I mean, when they did L(\theta) it seems that they added some n's and i's somewhere, and I don't know where they added these things.
 
semidevil said:
f_Y(y;\theta) = \frac{1}{\theta^2}\, y\, e^{-y/\theta}

L(\theta) = \theta^{-2n}\left(\prod_{i=1}^n y_i\right) e^{-\frac{1}{\theta}\sum_{i=1}^n y_i}

I don't know how they went from this to that... it seems that they added some n's and i's somewhere, and I don't know where they added these things.
It looks like there are some typos in your expression.

You are trying to estimate \theta using n data points, labelled y_i. The single likelihood for \theta given one data point y_i is:

\frac{1}{\theta^2}\, y_i\, e^{-y_i/\theta}

In order to get the likelihood of \theta for all n data points, you need to multiply the single likelihoods together. And that's just a simple matter of multiplying terms and summing up what's in the exponential term.
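As a concrete illustration (not spelled out in the original reply, this is just the n = 2 case written out):

L(\theta) = f(y_1;\theta)\, f(y_2;\theta) = \frac{1}{\theta^2}\, y_1 e^{-y_1/\theta} \cdot \frac{1}{\theta^2}\, y_2 e^{-y_2/\theta} = \theta^{-4}\, y_1 y_2\, e^{-(y_1+y_2)/\theta}

The two \theta^{-2} factors multiply to \theta^{-4}, and the exponents add.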
 
The indices are a shorthand for arbitrarily long products. For example, given your definition of L, we evaluate the pdf at n values of y, \{y_1, y_2, \ldots, y_n\}, and take the product. So if
f(y) = \frac{1}{\theta^2} ye^{-\frac{y}{\theta}}
we get
f(y_1)f(y_2)\cdots f(y_n) = \prod_{i=1}^n f(y_i)

= \prod_{i=1}^n \frac{1}{\theta^2} y_i e^{-\frac{y_i}{\theta}}

= \frac{1}{\theta^{2n}} \prod_{i=1}^n y_i e^{-\frac{y_i}{\theta}}

= \frac{1}{\theta^{2n}} \left(y_1 e^{-\frac{y_1}{\theta}} \cdots y_n e^{-\frac{y_n}{\theta}}\right)

= \frac{1}{\theta^{2n}} \left(y_1 \cdots y_n\right) e^{-\frac{y_1+y_2+\cdots+y_n}{\theta}}

= \frac{1}{\theta^{2n}} \left(\prod_{i=1}^n y_i\right) e^{-\frac{1}{\theta}\sum_{i=1}^n y_i}
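Not from the thread, but here is a minimal numerical sanity check of that factorization in Python (the data values and theta below are made up for illustration):

import numpy as np

# Made-up parameter value and data, purely to check the algebra numerically.
theta = 2.0
y = np.array([1.3, 0.7, 2.9, 1.1, 4.2])
n = len(y)

def f(y, theta):
    # pdf from the thread: f(y; theta) = (1/theta^2) * y * exp(-y/theta)
    return (1.0 / theta**2) * y * np.exp(-y / theta)

# Direct product of the n single likelihoods.
L_direct = np.prod(f(y, theta))

# Factored form: theta^(-2n) * (product of y_i) * exp(-(1/theta) * sum of y_i).
L_factored = theta**(-2 * n) * np.prod(y) * np.exp(-np.sum(y) / theta)

print(L_direct, L_factored)  # agree up to floating-point rounding

Both expressions print the same value, which is a quick way to convince yourself the algebra is right.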
 
ok, thanks, now that makes a little more sense, but I'll think about it some more... still a bit confusing.

what about getting ln L(\theta)? the book does some weird stuff and I don't know what it did.

\ln L(\theta) = -2n\ln\theta + \ln\prod_{i=1}^n y_i - \frac{1}{\theta}\sum_{i=1}^n y_i


how did that happen?

and also, I'm not understanding where they put the n's. maybe I'm having trouble with the definition. like, on the first problem, why did it become \theta^{-2n}?

and for this problem,

we have f_Y(y;\theta) = e^{-(y-\theta)}, so L(\theta) = e^{-\sum y_i + n\theta}
 
semidevil said:
and also, I'm not understanding where they put the n's. maybe I'm having trouble with the definition. like, on the first problem, why did it become \theta^{-2n}?
Because \theta^{-2} multiplied by itself n times is \theta^{-2n}. They just took it out of the n-product by associativity.

semidevil said:
we have f_Y(y;\theta) = e^{-(y-\theta)}, so L(\theta) = e^{-\sum y_i + n\theta}
What happens when you multiply e^a with e^b? That's all that's going on here. :)
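For completeness (these steps aren't spelled out in the thread), taking logs turns the product into a sum, using \ln(ab) = \ln a + \ln b and \ln e^x = x:

\ln L(\theta) = \ln\left[\theta^{-2n}\left(\prod_{i=1}^n y_i\right) e^{-\frac{1}{\theta}\sum_{i=1}^n y_i}\right] = -2n\ln\theta + \ln\prod_{i=1}^n y_i - \frac{1}{\theta}\sum_{i=1}^n y_i

And for the second pdf, the same exponent-adding rule gives

L(\theta) = \prod_{i=1}^n e^{-(y_i-\theta)} = e^{-\sum_{i=1}^n (y_i-\theta)} = e^{-\sum_{i=1}^n y_i + n\theta}

since the n copies of \theta in the exponent add up to n\theta.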
 