Why is the likelihood function defined as such?

AI Thread Summary
The discussion centers on the definition of the likelihood function in maximum likelihood estimation, questioning why it is expressed as f(x|θ) instead of f(θ|x). The argument highlights that, in frequentist statistics, parameters are treated as fixed but unknown, which complicates the direct computation of f(θ|x) without a prior distribution. It emphasizes that frequentist methods provide probabilities based on assumed models rather than directly answering the probability of hypotheses given data. The conversation also suggests that to compute f(θ|x), one must adopt a Bayesian approach, which aligns more closely with intuitive reasoning about data and hypotheses. Overall, the thread critiques the frequentist framework for its backward approach to statistical inference.
CantorSet
Hi everyone,

This is not a homework question but something I thought of while reading.

In the method of maximum likelihood estimation, they're trying to maximize the likelihood function
f(\vec{x}| \theta ) with respect to \theta. But shouldn't the likelihood function be defined as f(\theta| \vec{x} ) since we are GIVEN the data vector \vec{x} while \theta is the unknown parameter?
 
CantorSet said:
But shouldn't the likelihood function be defined as f(\theta| \vec{x} ) since we are GIVEN the data vector \vec{x} while \theta is the unknown parameter?

Philosophically, I think you're right. But in frequentist statistics (the usual kind taught in introductory statistics courses) no probability distribution is ever assumed for the parameters of a distribution. When we don't know these parameters, we say they have "definite but unknown values". If you don't have a "prior" probability distribution for the parameters, you can't compute the "posterior" distribution f(\theta|\vec{x}).
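To spell out why the prior is needed, here is Bayes' theorem written for this setting. The symbol \pi(\theta) for the prior density is my own notation, not something from the posts above, and the integral becomes a sum if \theta is discrete.

```latex
f(\theta \mid \vec{x})
  = \frac{f(\vec{x} \mid \theta)\,\pi(\theta)}
         {\int f(\vec{x} \mid \theta')\,\pi(\theta')\,d\theta'}
```

The likelihood f(\vec{x}|\theta) appears in the numerator, but without some choice of \pi(\theta) the right-hand side is undefined.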

My personal view of frequentist statistics is this: The commonsense person's outlook is "I have certain ideas. Given the data, what is the probability that my ideas are true?" Frequentist statistics answers "I assumed certain ideas and calculated the probability of the data. Based on that probability, I will stipulate some decisions." This evades the question! Things are done backwards in comparison to what people naturally want to know.

The authoritative terminology used ("statistical significance", "rejection" of a hypothesis, "confidence" intervals) makes many laymen think that they are getting information about the probability that some idea is true given the data. But if you look under the hood at what's happening, what you are getting is a quantification of the probability of the data based on the assumption that certain ideas are true.

I'm not trying to say that frequentist statistics isn't effective. But knowing how to apply it takes empirical trial and error. People observe that certain methods work in certain types of situations.

If you want to compute f(\theta|\vec{x}) you must become a Bayesian. That's my favorite kind of statistics.
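As a rough illustration of what "becoming a Bayesian" involves, here is a minimal Python sketch for coin-flip data. The grid approximation, the function names, the data (7 heads in 10 flips), and both priors are assumptions made up for this example, not anything from the thread.

```python
def bernoulli_likelihood(theta, heads, n):
    """f(x | theta) for one specific sequence with `heads` successes in n flips."""
    return theta ** heads * (1.0 - theta) ** (n - heads)

def grid_posterior(heads, n, prior, grid):
    """Discretized Bayes' rule: posterior weight is proportional to likelihood * prior."""
    unnormalized = [bernoulli_likelihood(t, heads, n) * prior(t) for t in grid]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical data and a grid of candidate parameter values.
heads, n = 7, 10
grid = [i / 100 for i in range(101)]

# Maximum likelihood: no prior needed, just maximize f(x | theta) over theta.
mle = max(grid, key=lambda t: bernoulli_likelihood(t, heads, n))

# Posterior under a flat prior (an assumption): same shape as the likelihood.
flat_posterior = grid_posterior(heads, n, lambda t: 1.0, grid)
flat_mode = grid[flat_posterior.index(max(flat_posterior))]

# Posterior under a prior that favors a fair coin (also an assumption).
fair_prior = lambda t: t ** 10 * (1.0 - t) ** 10
fair_posterior = grid_posterior(heads, n, fair_prior, grid)
fair_mode = grid[fair_posterior.index(max(fair_posterior))]

print(mle, flat_mode, fair_mode)
```

With the flat prior the posterior mode coincides with the maximum likelihood estimate. With the prior that favors a fair coin, the mode is pulled toward 0.5. Either way the posterior only exists once a prior has been chosen.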
 
CantorSet said:
Hi everyone,

This is not a homework question but something I thought of while reading.

In the method of maximum likelihood estimation, they're trying to maximize the likelihood function
f(\vec{x}| \theta ) with respect to \theta. But shouldn't the likelihood function be defined as f(\theta| \vec{x} ) since we are GIVEN the data vector \vec{x} while \theta is the unknown parameter?

Generally that would be written: L(\theta|\vec{x}) = f(\vec{x}|\theta).

The expression f(\vec{x}|\theta) is just a conditional probability when f is a probability mass function P, i.e. when the data are discrete.
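One way to see the difference between the two readings of f(\vec{x}|\theta) is a small numerical check. This is only a sketch: the binomial model, the numbers, and the midpoint-rule integration are assumptions chosen for illustration.

```python
from math import comb

def binom_pmf(k, n, theta):
    """P(K = k | theta) for a binomial count with n trials."""
    return comb(n, k) * theta ** k * (1.0 - theta) ** (n - k)

n, theta, k_observed = 10, 0.3, 7

# Fixed theta, summed over every possible data value:
# this is a genuine probability distribution over the data.
total_over_data = sum(binom_pmf(k, n, theta) for k in range(n + 1))

# Fixed data, integrated over theta with the midpoint rule:
# this is the likelihood function, and nothing forces its area to be 1.
steps = 10_000
area_over_theta = sum(
    binom_pmf(k_observed, n, (i + 0.5) / steps) for i in range(steps)
) / steps

print(total_over_data, area_over_theta)
```

The first quantity is 1 up to rounding. The second is about 1/11 here, since the integral of the binomial likelihood over \theta works out to 1/(n+1). So the same expression is a probability distribution in \vec{x} but not in \theta, which is why the likelihood cannot simply be relabeled as f(\theta|\vec{x}).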
 
Thanks for the responses, guys.
 