Why is the likelihood function defined as such?

  • Context: Graduate
  • Thread starter: CantorSet
  • Tags: Function, Likelihood

Discussion Overview

The discussion revolves around the definition and interpretation of the likelihood function in the context of maximum likelihood estimation. Participants explore the philosophical and statistical implications of defining the likelihood function as f(𝑥|θ) versus f(θ|𝑥), considering frequentist and Bayesian perspectives.

Discussion Character

  • Debate/contested

Main Points Raised

  • Some participants question why the likelihood function is defined as f(𝑥|θ) instead of f(θ|𝑥), arguing that since the data vector 𝑥 is given, it seems more logical to express the likelihood in terms of the unknown parameter θ.
  • One participant explains that in frequentist statistics, parameters are treated as having "definite but unknown values," and no prior distribution for parameters is assumed, which complicates the computation of f(θ|𝑥).
  • Another participant expresses a philosophical view that frequentist statistics may not align with how people intuitively think about probability, suggesting that it answers the question in a way that feels backward.
  • There is a mention that to compute f(θ|𝑥), one must adopt a Bayesian approach, which is favored by some participants.
  • One participant notes that f(𝑥|θ) is just a conditional probability (when f = P) and that the likelihood is conventionally written L(θ|𝑥) = f(𝑥|θ), i.e., the same expression read as a function of θ with the data held fixed.

Areas of Agreement / Disagreement

Participants express differing views on the definition and interpretation of the likelihood function, with no consensus reached on the preferred approach or its implications.

Contextual Notes

Participants highlight the philosophical differences between frequentist and Bayesian statistics, noting that the lack of a prior distribution in frequentist methods affects the interpretation of likelihood.

CantorSet
Hi everyone,

This is not a homework question but something I thought of while reading.

In the method of maximum likelihood estimation, one maximizes the likelihood function $f(\vec{x} \mid \theta)$ with respect to $\theta$. But shouldn't the likelihood function be defined as $f(\theta \mid \vec{x})$, since we are GIVEN the data vector $\vec{x}$ while $\theta$ is the unknown parameter?
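To make concrete what maximizing $f(\vec{x} \mid \theta)$ over $\theta$ means while the data stay fixed, here is a minimal numerical sketch (not part of the original post), assuming normally distributed data with unknown mean and standard deviation; the data are simulated purely for illustration:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

# Made-up data vector x, held fixed throughout the optimization.
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=200)

def neg_log_likelihood(theta, data):
    """Negative log of f(data | theta) for theta = (mu, log_sigma)."""
    mu, log_sigma = theta
    return -np.sum(norm.logpdf(data, loc=mu, scale=np.exp(log_sigma)))

# Maximizing the likelihood in theta = minimizing the negative log-likelihood.
result = minimize(neg_log_likelihood, x0=np.array([0.0, 0.0]), args=(x,))
mu_hat, sigma_hat = result.x[0], np.exp(result.x[1])
print(f"MLE: mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}")
```

The data vector never changes during the search; only $\theta = (\mu, \sigma)$ does, which is exactly the sense in which the likelihood is treated as a function of $\theta$.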
 
CantorSet said:
But shouldn't the likelihood function be defined as $f(\theta \mid \vec{x})$, since we are GIVEN the data vector $\vec{x}$ while $\theta$ is the unknown parameter?

Philosophically, I think you're right. But in frequentist statistics (the usual kind taught in introductory statistics courses) no probability distribution is ever assumed for the parameters of a distribution. When we don't know these parameters, we say they have "definite but unknown values". If you don't have a "prior" probability distribution for the parameters, you can't compute the "posterior" distribution $f(\theta \mid \vec{x})$.
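To spell out where the prior enters (a standard statement, not part of the original reply), Bayes' theorem in the thread's notation reads

$$f(\theta \mid \vec{x}) = \frac{f(\vec{x} \mid \theta)\, f(\theta)}{\int f(\vec{x} \mid \theta')\, f(\theta')\, d\theta'},$$

so without a prior $f(\theta)$ the right-hand side, and hence the posterior, cannot be evaluated.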

My personal view of frequentist statistics is this: The commonsense person's outlook is "I have certain ideas. Given the data, what is the probability that my ideas are true?" Frequentist statistics answers "I assumed certain ideas and calculated the probability of the data. Based on that probability, I will stipulate some decisions." This evades the question! Things are done backwards compared to what people naturally want to know.

The authoritative terminology used ("statistical significance", "rejection" of a hypothesis, "confidence" intervals) makes many laymen think that they are getting information about the probability that some idea is true given the data. But if you look under the hood at what's happening, what you are getting is a quantification of the probability of the data under the assumption that certain ideas are true.

I'm not trying to say that frequentist statistics isn't effective. But knowing how to apply it requires empirical trial and error. People observe that certain methods work in certain types of situations.

If you want to compute $f(\theta \mid \vec{x})$ you must become a Bayesian. That's my favorite kind of statistics.
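As an illustration (not from the thread) of how a prior makes $f(\theta \mid \vec{x})$ computable, here is a minimal Beta-Bernoulli sketch in Python; the coin-flip data and the Beta(1, 1) prior are assumptions made up for the example:

```python
import numpy as np
from scipy import stats

# Made-up coin-flip data: 1 = heads, 0 = tails.
x = np.array([1, 0, 1, 1, 0, 1, 1, 1, 0, 1])

# Prior Beta(a, b) on theta; Beta is conjugate to the Bernoulli likelihood,
# so the posterior f(theta | x) is again a Beta distribution.
a, b = 1.0, 1.0                          # uniform prior (an assumption)
a_post = a + x.sum()                     # a + number of heads
b_post = b + len(x) - x.sum()            # b + number of tails
posterior = stats.beta(a_post, b_post)

print(f"Posterior mean of theta: {posterior.mean():.3f}")
print(f"95% credible interval:   {posterior.interval(0.95)}")
print(f"MLE for comparison:      {x.mean():.3f}")
```

With a flat prior the posterior mode coincides with the MLE here, but the posterior is a full distribution over $\theta$, which is what the expression $f(\theta \mid \vec{x})$ asks for.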
 
CantorSet said:
But shouldn't the likelihood function be defined as $f(\theta \mid \vec{x})$, since we are GIVEN the data vector $\vec{x}$ while $\theta$ is the unknown parameter?

Generally that would be written $L(\theta \mid \vec{x}) = f(\vec{x} \mid \theta)$: the same expression, but read as a function of $\theta$ with the data $\vec{x}$ held fixed.

The expression $f(\vec{x} \mid \theta)$ is just a conditional probability when $f = P$.
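As a concrete instance (not in the original reply): for $n$ independent Bernoulli($\theta$) observations, the same product serves as the conditional probability of the data given $\theta$ and, viewed as a function of $\theta$ with the data fixed, as the likelihood:

$$L(\theta \mid \vec{x}) = f(\vec{x} \mid \theta) = \prod_{i=1}^{n} \theta^{x_i} (1 - \theta)^{1 - x_i}, \qquad x_i \in \{0, 1\}.$$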
 
Thanks for the responses, guys.
 
