- #1
pluviosilla
Why is the Maximum Likelihood function a product?
Explanations of how the Maximum Likelihood function is constructed usually just mention that events are independent, so the probability of several such events is the product of the separate probabilities. I get the logic w.r.t. independent events. What I don't get is how we can just assume events are independent, or why (as it appears to me in my confusion) combinations are disregarded in calculating the probability.
For example, if I want to know the probability of 4 heads and 7 tails in 11 coin flips, I may NOT exclude the binomial coefficient from my calculation.
Correct: [itex]P(4, 7) = \binom{N}{k}\cdot p^kq^{N-k} = \binom{11}{4}\cdot p^4q^{7} [/itex]
Incorrect: [itex]P(4, 7) = p^4q^{7}[/itex] <<< gives probability of a rigid sequence of 4 heads 7 tails
The reasoning, I take it, is that we take the product of the respective probabilities of each data point in the sample, then use calculus to determine which population parameters maximize this probability, ignoring combinations because they just contribute a constant coefficient that does not affect the location of the maximum.
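To spell that reasoning out for the coin example (a sketch, assuming [itex]q = 1 - p[/itex], with [itex]N[/itex] and [itex]k[/itex] fixed by the data):
[itex]\ln L(p) = \ln\binom{N}{k} + k\ln p + (N-k)\ln(1-p)[/itex]
[itex]\frac{d}{dp}\ln L(p) = \frac{k}{p} - \frac{N-k}{1-p} = 0 \;\Rightarrow\; \hat{p} = \frac{k}{N} = \frac{4}{11}[/itex]
The term [itex]\ln\binom{N}{k}[/itex] does not depend on [itex]p[/itex], so it vanishes on differentiation and the maximizer is the same with or without it.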
QUESTION: How can we simply assume events are independent?
QUESTION: Am I right to assume we ignore combinations because they only add a constant coefficient which does not affect the calculation of the maximum?
QUESTION: Are there no practical consequences (problems) arising from dismissing the combinations? With a large sample size we get a probability that is extremely (and artificially) small.
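As a quick numerical sanity check on the last two questions, here is a minimal Python sketch. The function names and the grid search are my own illustrative choices, not anything taken from the lectures. It compares the maximizer of the likelihood with and without the binomial coefficient for the 4-heads, 7-tails example, and then shows what happens to the raw product for a much larger (made-up) sample.

```python
import math

def sequence_likelihood(p, k, n):
    # Probability of one specific sequence with k heads and n - k tails.
    return p**k * (1 - p)**(n - k)

def binomial_likelihood(p, k, n):
    # Same quantity multiplied by the number of such sequences.
    return math.comb(n, k) * sequence_likelihood(p, k, n)

def log_sequence_likelihood(p, k, n):
    # Logarithm of the sequence likelihood; a sum instead of a product.
    return k * math.log(p) + (n - k) * math.log(1 - p)

def argmax_on_grid(f, k, n, steps=9999):
    # Crude grid search over p in (0, 1); illustrative only.
    grid = [i / (steps + 1) for i in range(1, steps + 1)]
    return max(grid, key=lambda p: f(p, k, n))

k, n = 4, 11
p_seq = argmax_on_grid(sequence_likelihood, k, n)
p_bin = argmax_on_grid(binomial_likelihood, k, n)
p_log = argmax_on_grid(log_sequence_likelihood, k, n)
print(p_seq, p_bin, p_log, k / n)

# Hypothetical larger sample: the raw product underflows to 0.0 in
# double precision, while the log-likelihood stays finite.
big_k, big_n = 4000, 11000
tiny = sequence_likelihood(0.36, big_k, big_n)
log_value = log_sequence_likelihood(0.36, big_k, big_n)
print(tiny, log_value)
```

All three versions pick the same grid point, close to 4/11, which is consistent with the coefficient being irrelevant to where the maximum sits. The second part suggests that the tiny magnitude of the product is a numerical nuisance rather than a conceptual problem, since the logarithm has its maximum at the same parameter value.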
Examples of MLE discussion include the **marvelous** on-line lectures in econometrics by Ben Lambert here and here. He gives very accessible, very lucid lectures, but I am still puzzled by the points above.
Disclaimer: this is NOT homework and I am NOT enrolled in anyone's class.