We all know the least squares method for finding the best-fit line through a collection of random data. But I wonder if it is really the best method. Suppose we have two random variables y and x that appear to have a linear relation of the type y = ax + b. What we want is, given the next x signal, to predict the value of the y signal as closely as possible.

The well-known method tells us to take our experimental readings and minimize the sum of squared residuals, so the values of a and b are easily computed. That is, we seek the minimum of F = sum of [ (Yi - a*Xi - b)^2 ], which works out easily - see your maths book. But what if I go for the minimum of F1 = sum of [ |Yi - a*Xi - b| ] instead? This one has no analytical solution, but that doesn't matter, because it is very easy to work out with any crude numerical approach.

So in general we get two different - or somewhat different - best-fit lines. Which one is the best "best"? After all, if we go back to a zero-mean normal distribution, the second moment E[X^2] is the variance sigma^2, and the first absolute moment E|X| is also proportional to the standard deviation (E|X| = sigma * sqrt(2/pi)). So both functionals measure spread; they just weight large deviations differently. In a real problem, using one method and then the other, the results are not likely to be identical. In terms of probabilistic inference, which method is better? And I don't quite trust the calculus books, because maybe what they wanted was a result with an analytical solution they could print!
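To see the two criteria disagree, here is a small sketch: the closed-form least-squares fit next to the crude numerical minimization of F1 I described (this L1 criterion is sometimes called least absolute deviations). The data set, the outlier, and the shrinking-grid search are all my own illustrative choices, not anything from a textbook.

```python
def l2_fit(xs, ys):
    # Closed-form least squares: minimize F = sum (y - a*x - b)^2
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def l1_cost(a, b, xs, ys):
    # The functional F1 = sum |y - a*x - b|
    return sum(abs(y - a * x - b) for x, y in zip(xs, ys))

def l1_fit(xs, ys, iters=200):
    # Crude numerical search for the F1 minimum: start from the L2
    # solution, try the 8 neighbouring (a, b) points on a grid, and
    # halve the grid spacing whenever no neighbour improves the cost.
    a, b = l2_fit(xs, ys)
    step = 1.0
    for _ in range(iters):
        best = (l1_cost(a, b, xs, ys), a, b)
        for da in (-step, 0.0, step):
            for db in (-step, 0.0, step):
                c = l1_cost(a + da, b + db, xs, ys)
                if c < best[0]:
                    best = (c, a + da, b + db)
        if (best[1], best[2]) == (a, b):
            step /= 2.0  # stuck: refine the grid
        else:
            a, b = best[1], best[2]
    return a, b

# Made-up readings roughly on y = 2x + 1, with one gross outlier at x = 5
xs = [0, 1, 2, 3, 4, 5]
ys = [1.1, 2.9, 5.2, 7.0, 9.1, 30.0]

a2, b2 = l2_fit(xs, ys)
a1, b1 = l1_fit(xs, ys)
print(f"L2 fit: a = {a2:.3f}, b = {b2:.3f}")  # dragged toward the outlier
print(f"L1 fit: a = {a1:.3f}, b = {b1:.3f}")  # stays near the bulk of the data
```

On this data the two lines differ sharply: squaring the residual makes the outlier dominate F, while F1 only charges it linearly, so the L1 line ends up close to a = 2, b = 1. On clean Gaussian data the two fits would nearly coincide, which is exactly what makes the question interesting.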