# Least squares line - understanding formulas

In summary, the conversation discusses a lecture on correlation and regression, specifically the formulas for calculating slope and y-intercept in a simple linear relationship. The speaker expresses difficulty in grasping the formulas intuitively and asks for help understanding each part and its purpose. They also inquire about the use of the terms "margin of error" and "slope x" in the formulas. The expert suggests viewing the formulas as a whole rather than focusing on individual terms, and explains that the goal is to minimize error in the equation.
Vital
Hello.

I have listened to a great lecture that gave helpful intuitive insight into correlation and regression (basic stuff). But there are formulas I cannot grasp intuitively and whose origin I don't know. To remember them, I would like to understand, both mathematically and intuitively, what is happening in each part of the formula and why these mathematical combinations produce the desired result.
I will be grateful for your patience and your help.

The first formula is for the slope, and the second for the y-intercept
(both are used in the simple linear relationship
y = y-intercept + slope × x).

slope = [ n×Σxy - ΣxΣy ] / [ n×Σx^2 - (Σx)^2 ]

I have "whys" about each part of this formula:
numerator
- why we take the sum of xy
- why we then multiply that sum by n (the number of elements) and what is the meaning and role of the result
- why we subtract from the previous result the sum of x multiplied by the sum of y
denominator
- why we take the sum of x squared
- why we then multiply it by n
- why we take the sum of all x and then square the result
- why we subtract the first from the second
formula
- why we use [n×Σxy - ΣxΣy] for the numerator and [n×Σx^2 - (Σx)^2] for the denominator,
how they work together, and what the intuition behind the process is

The second one is for the y intercept:
intercept = [ Σy/n ] - slope x [ Σx / n]

Same questions here. And what is even more confusing is that [ Σx / n ] is called a margin of error. Why is it called a margin of error if it looks like the formula for the average value of x over n elements? Thank you.

The slope formula has been manipulated to be easier to calculate (one pass through the data rather than two). It is closely related to the Pearson correlation coefficient, which has a fairly intuitive initial definition that can then be manipulated into a form close to your slope formula.
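To make the "one pass" point concrete, here is a minimal Python sketch (the thread contains no code; the data values are invented for illustration). All four sums are accumulated in a single loop, with no need to compute the mean first:

```python
# One-pass slope from the thread's formula:
#   slope = [n*Σxy - Σx*Σy] / [n*Σx^2 - (Σx)^2]
def slope_one_pass(pairs):
    n = sx = sy = sxy = sxx = 0
    for x, y in pairs:          # single pass over the data
        n += 1
        sx += x                 # Σx
        sy += y                 # Σy
        sxy += x * y            # Σxy
        sxx += x * x            # Σx^2
    return (n * sxy - sx * sy) / (n * sxx - sx * sx)

data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]  # made-up sample
print(slope_one_pass(data))
```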

For the intercept, I don't know what "slope x" means. Whatever it is, I assume the same sort of manipulation has been done as for the slope formula.

I have not heard the sample average called a "margin of error" before, so I can't help you there. The usual use of the term "margin of error" in statistics does not have that definition.

Qualitatively, the slope is the covariance divided by the variance.

[ n×Σx^2 - (Σx)^2 ] is proportional to the variance of x (it equals n² times the population variance):

the covariance in the numerator is the same expression with xy in place of x². If x and y are perfectly correlated (and equally scaled), the covariance equals the variance and the slope is 1. If there is no correlation, the covariance is zero and so is the slope.
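A Python sketch of the covariance-over-variance view (data values invented for illustration). It gives the same number as the n×Σxy − ΣxΣy form quoted earlier, since the two differ only by the common factor n² in numerator and denominator:

```python
# Slope as covariance(x, y) / variance(x), computed from the means.
def slope_cov_var(pairs):
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n          # mean of x
    my = sum(y for _, y in pairs) / n          # mean of y
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    var = sum((x - mx) ** 2 for x, _ in pairs)
    return cov / var

data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]  # made-up sample
print(slope_cov_var(data))
```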

@Vital I think your approach here is not going to be fruitful. To my knowledge there is no "why" for the individual terms, only a "why" for the whole formula. The individual terms are there because together they achieve the goal of the overall formula; individually they have no particular importance.

The purpose of the overall formula is to calculate the ##m## and ##b## that minimize the error from ##y=mx +b##. Specifically, we want to find ##m## and ##b## such that ##\frac{\partial}{\partial m}\Sigma r^2=0## and ##\frac{\partial}{\partial b}\Sigma r^2=0## where ##r## is the residual error ##r=y-(mx+b)##. All of those formulas you are looking into are just what you get when you solve these equations.
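This can be checked numerically: compute ##m## and ##b## from the closed-form expressions in the thread, then verify that nudging either one only increases ##\Sigma r^2##. A minimal Python sketch with made-up data:

```python
# Sum of squared residuals for a candidate line y = m*x + b.
def sse(m, b, pairs):
    return sum((y - (m * x + b)) ** 2 for x, y in pairs)

data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]  # made-up sample
n = len(data)
sx = sum(x for x, _ in data)
sy = sum(y for _, y in data)
sxy = sum(x * y for x, y in data)
sxx = sum(x * x for x, _ in data)

m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
b = sy / n - m * (sx / n)      # intercept = mean(y) - slope * mean(x)
best = sse(m, b, data)

# Perturbing m or b in any direction can only increase the error.
for dm in (-0.1, 0.0, 0.1):
    for db in (-0.1, 0.0, 0.1):
        assert sse(m + dm, b + db, data) >= best
print(m, b, best)
```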

Dale said:
@Vital I think your approach here is not going to be fruitful. To my knowledge there is no “why” for the individual terms, there is only a “why” for the whole formula.
I agree. It started with a very intuitive formula, but it was then manipulated so that the parts are no longer intuitive. The reason for the manipulation was to make it a single-pass calculation through the data, which is easier than the original two-pass version (a first pass through the data to compute the average, followed by a second pass to total the deviations from that average).

Thank you very much for your answers and guidance.

## 1. What is the least squares line?

The least squares line is the line that best fits a set of data points. It is calculated by minimizing the sum of the squared vertical distances (residuals) between the line and each data point.

## 2. How is the least squares line calculated?

The least squares line is calculated by using the formula y = mx + b, where m is the slope of the line and b is the y-intercept. The values of m and b are determined by using the least squares method, which involves finding the values that minimize the sum of the squared distances between the line and each data point.
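A minimal sketch of that calculation in Python (the data points are illustrative only):

```python
# Fit y = m*x + b by least squares, using the closed-form sums.
def fit_line(pairs):
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxy = sum(x * y for x, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - m * sx) / n
    return m, b

# These points lie exactly on y = 2x + 1, so the fit recovers it.
m, b = fit_line([(0, 1.0), (1, 3.0), (2, 5.0)])
print(m, b)
```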

## 3. What is the purpose of the least squares line?

The purpose of the least squares line is to find a line that best represents the relationship between two variables in a set of data. It is often used in regression analysis to make predictions or to determine the strength of the relationship between the variables.

## 4. How is the least squares line used in statistics?

The least squares line is used in statistics to analyze the relationship between two variables and make predictions based on that relationship. It is commonly used in linear regression analysis to find the line of best fit for a set of data points.

## 5. Are there any limitations to using the least squares line?

Yes, there are limitations to using the least squares line. It assumes that the relationship between the variables is linear and that there are no outliers in the data. It also does not take into account any other factors that may affect the relationship between the variables.
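The outlier sensitivity is easy to demonstrate; here is a short Python sketch with made-up data showing how a single wild point drags the fitted slope:

```python
# Least-squares fit via the closed-form sums.
def fit_line(pairs):
    n = len(pairs)
    sx = sum(x for x, _ in pairs)
    sy = sum(y for _, y in pairs)
    sxy = sum(x * y for x, y in pairs)
    sxx = sum(x * x for x, _ in pairs)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return m, (sy - m * sx) / n

clean = [(x, 2 * x + 1) for x in range(10)]   # exactly y = 2x + 1
with_outlier = clean + [(10, 100.0)]          # one wild point

m_clean, _ = fit_line(clean)                  # slope is exactly 2
m_out, _ = fit_line(with_outlier)             # slope pulled well above 2
print(m_clean, m_out)
```

Because the residuals are squared, a single large deviation dominates the sum and shifts the whole line.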
