# Least squares estimator and distribution (simple linear regression)

• Jamie_H
In summary, the least squares estimator of B for the simple linear regression model Y = A + Bx + e is B̂ = (Ʃ(xi-xbar)Yi)/(Ʃ(xi-xbar)^2). This is also the maximum likelihood estimator under the normality assumption. The distribution of B̂ is normal with mean B and variance (σ^2)/(Ʃ(xi-xbar)^2).

## Homework Statement

Consider the simple linear regression model Y = A + Bx + e, where A is the intercept (a known constant), B is the slope parameter (unknown), and e is a random error term satisfying the normality assumption.
If (X1,Y1)...(Xn,Yn) are the n data points observed, find the least squares estimator of B.
Further, if the random error terms (e) satisfy the normality assumption with nuisance parameter σ, identify the distribution of the least squares estimator B.

## Homework Equations

If we wish to fit a line that minimizes the sum of squares of the vertical distances of our n data points, we have the equation Y = A + Bx, where B̂ = (Ʃ(xi-xbar)Yi)/(Ʃ(xi-xbar)^2). This is the same as the maximum likelihood estimator of B under the normality assumption.

Further, under the normality assumption, B̂ is normally distributed with mean B and variance (σ^2)/(Ʃ(xi-xbar)^2).
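The slope formula above comes from minimizing the sum of squared residuals; a brief derivation sketch in the same notation:

```latex
S(A,B) = \sum_{i=1}^{n} (Y_i - A - B x_i)^2
% Setting both partial derivatives to zero gives the normal equations:
\frac{\partial S}{\partial A} = -2\sum_i (Y_i - A - B x_i) = 0
  \;\Rightarrow\; \hat A = \bar Y - \hat B \bar x
\frac{\partial S}{\partial B} = -2\sum_i x_i (Y_i - A - B x_i) = 0
% Substituting \hat A, and using \sum_i x_i (Y_i - \bar Y) = \sum_i (x_i - \bar x) Y_i
% and \sum_i x_i (x_i - \bar x) = \sum_i (x_i - \bar x)^2:
\hat B = \frac{\sum_i (x_i - \bar x) Y_i}{\sum_i (x_i - \bar x)^2}
```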

## The Attempt at a Solution

I fear that I am confused by this problem! From what I can tell, the relevant equations posted above provide the appropriate answers. However, I've already proved those equations in a previous homework, so either it's just trivial or I'm missing something obvious. If the latter is the case, I would greatly appreciate someone letting me know (and perhaps pointing me in the right direction).


Thank you for your question. Your equations and understanding of the simple linear regression model are correct. The least squares estimator of B is B̂ = (Ʃ(xi-xbar)Yi)/(Ʃ(xi-xbar)^2), and this is also the maximum likelihood estimator under the normality assumption.

To find the distribution of the least squares estimator, we can use the fact that under the normality assumption, B̂ is normally distributed with mean B and variance (σ^2)/(Ʃ(xi-xbar)^2). This means that the probability density function of B̂ is f(b) = (1/√(2π(σ^2)/(Ʃ(xi-xbar)^2))) * e^(-(b-B)^2/(2(σ^2)/(Ʃ(xi-xbar)^2))).
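A quick simulation can make this concrete. In the sketch below, the parameter values (A = 2, B = 3, σ = 1.5) and the design points are my own illustrative choices, not part of the problem; it draws many samples from the model and compares the empirical mean and variance of B̂ with the theoretical values B and σ²/Ʃ(xi-xbar)²:

```python
import numpy as np

# Illustrative parameters (not from the problem statement).
rng = np.random.default_rng(0)
A, B, sigma = 2.0, 3.0, 1.5
x = np.linspace(0, 10, 20)                 # fixed design points
Sxx = np.sum((x - x.mean()) ** 2)          # Ʃ(xi - xbar)²

def bhat(y):
    """Least squares slope estimate: Ʃ(xi - xbar)Yi / Ʃ(xi - xbar)²."""
    return np.sum((x - x.mean()) * y) / Sxx

# Draw many samples Y = A + Bx + e with e ~ N(0, σ²) and collect B̂.
estimates = np.array([
    bhat(A + B * x + rng.normal(0.0, sigma, x.size))
    for _ in range(20000)
])

print(estimates.mean())   # should be close to B = 3
print(estimates.var())    # should be close to sigma**2 / Sxx
```

The empirical mean and variance should match the theoretical ones to within simulation noise.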

I hope this helps clarify any confusion you may have had. Please let me know if you have any further questions.

## 1. What is the least squares estimator in simple linear regression?

The least squares estimator is a statistical method used to find the line of best fit for a set of data points in a simple linear regression model. It minimizes the sum of the squared differences between the observed data points and the predicted values from the regression line.

## 2. How is the least squares estimator calculated?

The least squares estimator is calculated by minimizing the sum of the squared differences between the observed data points and the predicted values from the regression line. In the simple linear model this minimization has a closed-form solution, which can be evaluated directly or with statistical software.
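As a sketch of the closed-form calculation, the snippet below evaluates the slope and intercept formulas directly (the x and y values are made up for illustration):

```python
import numpy as np

# Illustrative data points (not from the original post).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form least squares estimates:
# slope B̂ = Ʃ(xi - xbar)yi / Ʃ(xi - xbar)², intercept Â = ybar - B̂·xbar
b = np.sum((x - x.mean()) * y) / np.sum((x - x.mean()) ** 2)
a = y.mean() - b * x.mean()
print(b, a)   # slope ≈ 1.99, intercept ≈ 0.05
```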

## 3. What is the distribution of the least squares estimator?

In simple linear regression, the least squares estimator of the slope follows a normal distribution with mean equal to the true slope B and variance σ²/Ʃ(xi-xbar)². In practice σ² is unknown and is estimated by the residual sum of squares divided by the degrees of freedom (n − 2), which is why confidence intervals and hypothesis tests on the estimated coefficients are based on the t distribution.
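A minimal sketch of a 95% confidence interval for the slope, using the t distribution from scipy and illustrative data (the x and y values are made up, not from the original post):

```python
import numpy as np
from scipy import stats

# Illustrative data (not from the original post).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = x.size

Sxx = np.sum((x - x.mean()) ** 2)
b = np.sum((x - x.mean()) * y) / Sxx       # slope estimate B̂
a = y.mean() - b * x.mean()                # intercept estimate
resid = y - (a + b * x)
s2 = np.sum(resid ** 2) / (n - 2)          # σ² estimated from residuals
se_b = np.sqrt(s2 / Sxx)                   # standard error of B̂

tcrit = stats.t.ppf(0.975, df=n - 2)       # t critical value, n - 2 df
ci = (b - tcrit * se_b, b + tcrit * se_b)  # 95% CI for the slope
print(ci)
```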

## 4. How is the distribution of the least squares estimator used in hypothesis testing?

The distribution of the least squares estimator is used to compute a test statistic and its p-value, which measures how compatible the data are with the null hypothesis (typically H0: B = 0). A p-value less than the chosen significance level (usually 0.05) indicates that the coefficient is statistically significantly different from zero, supporting a linear relationship between the independent and dependent variables.
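A sketch of the t-test for H0: B = 0 follows, again with illustrative data and scipy for the t distribution (these values are assumptions for the example, not from the original post):

```python
import numpy as np
from scipy import stats

# Illustrative data (not from the original post).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = x.size

Sxx = np.sum((x - x.mean()) ** 2)
b = np.sum((x - x.mean()) * y) / Sxx       # slope estimate B̂
a = y.mean() - b * x.mean()
resid = y - (a + b * x)
s2 = np.sum(resid ** 2) / (n - 2)          # estimate of σ²
se_b = np.sqrt(s2 / Sxx)                   # standard error of B̂

t = b / se_b                               # test statistic under H0: B = 0
p = 2 * stats.t.sf(abs(t), df=n - 2)       # two-sided p-value
print(t, p)
```

A large |t| (small p) leads to rejecting H0: B = 0 at the chosen significance level.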

## 5. What are the assumptions of the least squares estimator?

The least squares estimator assumes that the errors (the differences between observed and predicted values) are normally distributed with a mean of zero and constant variance. It also assumes that the errors are independent and that there is a linear relationship between the independent and dependent variables. Violations of these assumptions can lead to biased or unreliable estimates.
