# Comparing Least Squares and Maximum Likelihood?

1. Oct 18, 2012

### peripatein

Hi,
Below is my attempt at a comparison between the two above-mentioned methods of estimation. Does anything in the table lack validity and/or accuracy? Should any properties or advantages/disadvantages be added? Any suggestions/comments would be most appreciated!

MLE:

(1) Very accurate for large N, as the estimator a^ is asymptotically unbiased and efficient

(2) No loss of information; all data are represented

(3) Often complicated to solve analytically; the maximization usually has to be done numerically

(4) Applicable for varied models, even non-linear

(5) Errors of estimation are readily found: the 1-sigma error bars are the parameter values at which the log-likelihood falls by 0.5 from its maximum

(6) Pdf must be known in advance

(7) If the assumed pdf is wrong, the goodness of fit cannot be assessed
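To make property (5) concrete, here is a minimal sketch (assuming Gaussian data with a known sigma; the true mean, sigma, N, and seed are all illustrative, not from the thread) showing that the interval where the log-likelihood drops by 0.5 from its maximum reproduces the familiar sigma/sqrt(N) error bar:

```python
import numpy as np

# Illustrative data: N draws from a Normal with assumed true mean 5.0 and known sigma.
rng = np.random.default_rng(0)
sigma, n = 2.0, 1000
data = rng.normal(5.0, sigma, size=n)

def log_likelihood(mu):
    """Gaussian log-likelihood of the data as a function of the mean mu."""
    return (-0.5 * np.sum((data - mu) ** 2) / sigma**2
            - n * np.log(sigma * np.sqrt(2 * np.pi)))

# The MLE of mu for a Gaussian is the sample mean.
mu_hat = data.mean()

# Property (5): the 1-sigma error bar is where log L falls by 0.5 from its maximum.
grid = np.linspace(mu_hat - 0.5, mu_hat + 0.5, 10001)
ll = np.array([log_likelihood(m) for m in grid])
inside = grid[ll >= ll.max() - 0.5]       # parameter values within delta(log L) = 0.5
err_from_ll = 0.5 * (inside[-1] - inside[0])

# Analytic standard error for comparison: sigma / sqrt(N). The two agree closely.
print(err_from_ll, sigma / np.sqrt(n))
```

For this Gaussian case the agreement is exact up to the grid spacing, since the log-likelihood is an exact parabola in mu.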

LSE:

(1) Accurate even for a relatively small N, as the estimators are unbiased under the usual (Gauss-Markov) assumptions

(2) -

(3) Finding a suitable linear model is mathematically quite simple

(4) Very convenient to use for linear models; very intricate for
non-linear ones

(5) -

(6) Variance and mean must be known in advance

(7) The method is very sensitive to unusual data values (outliers), but goodness of fit can be assessed, e.g. with a chi-squared test

Last edited: Oct 18, 2012
2. Oct 18, 2012

### haruspex

In my view, one should always ask how the conclusions will be used, i.e. what is the cost function for the decision error? E.g. if you're trying to estimate a mean, and the cost of getting it wrong varies as the square of the error, I suspect least squares is going to be the ideal (but that's an off-the-cuff guess, so don't quote me). But if you need to be within some tight range of accuracy and anything beyond that is simply a miss then MLE may be more appropriate.
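One way to see that the choice of estimator interacts with the error distribution and the cost function: for Laplace (double-exponential) noise the sample median is the MLE of the location parameter, while the sample mean is the least-squares estimate. A quick Monte Carlo sketch (all parameters and the seed are illustrative) compares the two under squared-error cost:

```python
import numpy as np

# Monte Carlo comparison of the least-squares estimate (sample mean) and the
# MLE (sample median) of the location of Laplace-distributed data.
rng = np.random.default_rng(2)
true_mu, n, trials = 0.0, 25, 20000

samples = rng.laplace(true_mu, 1.0, size=(trials, n))
means = samples.mean(axis=1)          # least-squares estimate of the location
medians = np.median(samples, axis=1)  # maximum-likelihood estimate for Laplace noise

mse_mean = np.mean((means - true_mu) ** 2)
mse_median = np.mean((medians - true_mu) ** 2)
print(mse_mean, mse_median)  # for Laplace noise the median comes out lower
```

For Gaussian noise the two estimators coincide, so a comparison like this only discriminates once the noise departs from normality or the cost function departs from squared error.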

3. Oct 19, 2012

### chiro

Hey peripatein.

One of the best properties of the MLE is that it can be used with the invariance principle.

This lets you obtain estimates of functions of a parameter. It's useful, for example, when you use the Wald test statistic: the standard error is a function of your estimate of the proportion, which gives you the Normal approximation.

Other estimators aren't guaranteed to have this property and it is a nice property.
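A small sketch of the invariance/Wald idea for a binomial proportion (the counts below are made up for illustration):

```python
import numpy as np

# The MLE of a binomial proportion is p_hat = k/n. By the invariance property,
# the MLE of any function g(p) is simply g(p_hat). The Wald standard error
# plugs p_hat into the variance formula, giving the usual Normal approximation.
k, n = 37, 100          # illustrative: k successes out of n Bernoulli trials
p_hat = k / n           # MLE of p

se_wald = np.sqrt(p_hat * (1 - p_hat) / n)              # Wald standard error
ci = (p_hat - 1.96 * se_wald, p_hat + 1.96 * se_wald)   # approximate 95% interval

# Invariance: the MLE of the odds p/(1-p) is just p_hat/(1-p_hat).
odds_mle = p_hat / (1 - p_hat)
print(p_hat, se_wald, ci, odds_mle)
```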

Both methods make assumptions about the underlying distribution in some form (the non-additive linear models included), so I don't think you can really use this as a way to differentiate the two approaches.

You can always transform data with the Linear Models approach and link functions provide a way of systematizing this.

You can also estimate quantities in the LSE formulation just like you can in the MLE approach.

4. Oct 19, 2012

### peripatein

Thank you for your replies! So much appreciated :-).
However, are there any major inaccuracies? Anything of true significance that ought to be added and was left out?

5. Oct 19, 2012

### chiro

What do you mean?

6. Oct 19, 2012

### peripatein

I mean, are the properties for each method, as presented in my initial post, accurate? Are any additional, significant properties missing? What may I write for numbers 2 and 5 under LSE (i.e. what will be the equivalent LSE properties for numbers 2 and 5 under MLE)?
Again, any comments whatsoever would be highly appreciated!

7. Oct 19, 2012

### Stephen Tashi

Most of your statements have no precise mathematical interpretation. You haven't defined what phrases like "very accurate" and "errors of estimation" mean. Can you rephrase your statements using the standard terminology of mathematical statistics? If not, I think you are asking for "rules of thumb" which are empirical or subjective.