Comparing Least Squares and Maximum Likelihood?


Discussion Overview

The discussion focuses on comparing the Least Squares Estimation (LSE) and Maximum Likelihood Estimation (MLE) methods, examining their properties, advantages, and disadvantages. Participants explore the mathematical and practical implications of each method in statistical estimation.

Discussion Character

  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant outlines several properties of MLE, noting its accuracy for large sample sizes, the necessity of knowing the probability density function (pdf) in advance, and its complexity in mathematical resolution.
  • Another participant contrasts LSE, stating it is accurate for smaller sample sizes but biased, and highlights its simplicity in finding suitable linear models.
  • Some participants propose that the choice between MLE and LSE may depend on the cost function associated with decision errors, suggesting that different contexts may favor one method over the other.
  • One participant mentions the invariance principle as a beneficial property of MLE, allowing for estimates of functions of parameters, which may not apply to other estimators.
  • Concerns are raised about the lack of precise mathematical definitions in the initial claims, with a request for clearer terminology in describing the properties of both methods.

Areas of Agreement / Disagreement

Participants express varying opinions on the accuracy and applicability of the properties outlined for both MLE and LSE. There is no consensus on the completeness or correctness of the properties, and some participants challenge the clarity and precision of the initial claims.

Contextual Notes

Some statements lack precise mathematical interpretation, and terms such as "very accurate" and "errors of estimation" are not clearly defined. The discussion reflects a range of assumptions and interpretations regarding the methods.

peripatein
Hi,
Below is my attempt at a comparison between the two above-mentioned methods of estimation. Does anything in the table lack validity and/or accuracy? Should any properties or advantages/disadvantages be added? Any suggestions/comments would be most appreciated!

MLE:

(1) Very accurate for large N, as the pdf of the estimator â would be unbiased

(2) No loss of information; all data are represented

(3) Quite complicated to solve mathematically

(4) Applicable for varied models, even non-linear

(5) Errors of estimation could be readily found: the 1σ
error bars are the points at which the log-likelihood falls by 0.5 from
its maximum

(6) Pdf must be known in advance

(7) If the assumed pdf is wrong, goodness of fit may not be determined
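As a rough illustration of points (3) and (5) above, here is a minimal sketch (toy data, with an exponential model assumed purely for illustration) of finding an MLE and reading off the 1σ error bars from the half-unit drop in the log-likelihood:

```python
import math

# Hypothetical sample, assumed exponential with unknown rate lambda.
data = [0.8, 1.3, 0.4, 2.1, 0.9, 1.7, 0.6, 1.2]

def log_likelihood(lam, xs):
    # log L(lambda) = n*log(lambda) - lambda * sum(xs)
    return len(xs) * math.log(lam) - lam * sum(xs)

# Closed-form MLE for the exponential rate: lambda_hat = n / sum(x_i)
lam_hat = len(data) / sum(data)
ll_max = log_likelihood(lam_hat, data)

# Point (5): scan for the values where log L falls by 0.5 from its
# maximum; these mark the approximate 1-sigma error bars.
grid = [lam_hat * (0.2 + 0.001 * i) for i in range(2000)]
inside = [l for l in grid if log_likelihood(l, data) >= ll_max - 0.5]
lo, hi = min(inside), max(inside)
print(f"lambda_hat = {lam_hat:.3f}, 1-sigma interval ~ [{lo:.3f}, {hi:.3f}]")
```

For this particular model the maximization has a closed form; in general one would maximize numerically, which is where point (3) bites.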

LSE:

(1) Very accurate for a relatively small N as estimators would be biased

(2) -

(3) Finding the suitable linear model is quite simple mathematically

(4) Very convenient to use for linear models; very intricate for
non-linear ones

(5) -

(6) Variance and mean must be known in advance

(7) Method is very sensitive to unusual data values, but
goodness of fit may be determined, e.g. through a
chi-squared test
 
In my view, one should always ask how the conclusions will be used, i.e. what is the cost function for the decision error? E.g. if you're trying to estimate a mean, and the cost of getting it wrong varies as the square of the error, I suspect least squares is going to be the ideal (but that's an off-the-cuff guess, so don't quote me). But if you need to be within some tight range of accuracy and anything beyond that is simply a miss then MLE may be more appropriate.
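The squared-error guess above can be checked with a quick Monte Carlo sketch (illustrative numbers only): for Gaussian data under squared loss, the sample mean, which is the least squares location estimate, does on average beat an alternative such as the sample median:

```python
import random

# Monte Carlo comparison of two location estimators under squared
# error loss, for Gaussian data; all numbers are illustrative.
random.seed(0)
true_mu, n, trials = 5.0, 25, 2000
se_mean = se_median = 0.0
for _ in range(trials):
    xs = [random.gauss(true_mu, 1.0) for _ in range(n)]
    mean = sum(xs) / n
    median = sorted(xs)[n // 2]
    se_mean += (mean - true_mu) ** 2
    se_median += (median - true_mu) ** 2
print(f"mean MSE = {se_mean / trials:.4f}, "
      f"median MSE = {se_median / trials:.4f}")
```

The mean's MSE comes out near the theoretical 1/n, while the median's is larger by roughly a factor of π/2, in line with the idea that the cost function should drive the choice of estimator.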
 
Hey peripatein.

One of the best properties of the MLE is that it can be used with the invariance principle.

This lets you obtain estimates of functions of a parameter; it is useful, for example, when you use the Wald test statistic with the parameter to get a standard error term as a function of your estimate of the proportion, which gives you the Normal approximation.

Other estimators aren't guaranteed to have this property and it is a nice property.
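A small sketch of the invariance principle with toy numbers: the MLE of a binomial proportion, the MLE of the odds obtained by simply plugging in, and the Wald standard error that gives the Normal approximation mentioned above:

```python
import math

# Invariance principle illustration (toy numbers): if p_hat is the
# MLE of a binomial proportion p, then g(p_hat) is the MLE of g(p).
successes, n = 36, 100
p_hat = successes / n              # MLE of the proportion p

odds_hat = p_hat / (1 - p_hat)     # by invariance, the MLE of the odds

# Wald standard error of p_hat, plugging the estimate into the
# variance formula; this underlies the Normal-approximation interval.
se_p = math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - 1.96 * se_p, p_hat + 1.96 * se_p)
print(f"p_hat = {p_hat}, odds_hat = {odds_hat:.4f}, "
      f"95% Wald CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```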

Both methods make assumptions about the underlying distribution in some form (the non-additive linear models) so I don't think you can really use this as a way to differentiate the two approaches.

You can always transform data with the Linear Models approach and link functions provide a way of systematizing this.

You can also estimate quantities in the LSE formulation, just as you can with the MLE approach.
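The "transform the data" point can be illustrated with a toy example: noiseless exponential data become linear under a log link, so a plain least squares line recovers the parameters:

```python
import math

# Transform-then-fit sketch: y = A * e^(b*x) becomes linear after a
# log transform (the log link), so ordinary least squares applies.
# Toy data generated with A = 2, b = 0.5 and no noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

zs = [math.log(y) for y in ys]     # log link: log y = log A + b*x
n = len(xs)
sx, sz = sum(xs), sum(zs)
sxx = sum(x * x for x in xs)
sxz = sum(x * z for x, z in zip(xs, zs))

b = (n * sxz - sx * sz) / (n * sxx - sx * sx)  # slope -> b
A = math.exp((sz - b * sx) / n)                # exp(intercept) -> A
print(f"A = {A:.3f}, b = {b:.3f}")
```

With noisy data the transform also reshapes the error distribution, which is exactly the kind of assumption link functions make systematic.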
 
Thank you for your replies! So much appreciated :-).
However, are there any cardinal inaccuracies? Anything of true significance which ought to be added and was left out?
 
What do you mean?
 
I mean, are the properties for each method, as presented in my initial post, accurate? Are any additional, significant properties missing? What may I write for numbers 2 and 5 under LSE (i.e. what will be the equivalent LSE properties for numbers 2 and 5 under MLE)?
Again, any comments whatsoever would be highly appreciated!
 
peripatein said:
I mean, are the properties for each method, as presented in my initial post, accurate?

Most of your statements have no precise mathematical interpretation. You haven't defined what phrases like "very accurate" and "errors of estimation" mean. Can you rephrase your statements using the standard terminology of mathematical statistics? If not, I think you are asking for "rules of thumb" which are empirical or subjective.
 
