What is the difference between random error and residual?

SUMMARY

The discussion clarifies the distinction between random error (εi) and residual (ei) in the context of simple linear regression. In the regression model Yi = β0 + β1Xi + εi, β0 and β1 represent the true parameters, while b0 and b1 are their estimated counterparts derived from sample data. Random error (εi) encompasses unobserved factors affecting the dependent variable, while residual (ei) quantifies the difference between observed values and predicted values (Yi - Yi hat). Understanding these concepts is crucial for effective regression analysis and model evaluation.

PREREQUISITES
  • Understanding of simple linear regression models
  • Familiarity with statistical notation (Greek and Latin letters)
  • Knowledge of fitted values and residuals
  • Basic concepts of error analysis in statistics
NEXT STEPS
  • Study the implications of random error in regression analysis
  • Learn about minimizing sum of squared residuals in model fitting
  • Explore the assumptions of linear regression, including independence of errors
  • Investigate advanced regression techniques and their error metrics
USEFUL FOR

Statisticians, data analysts, and researchers involved in regression modeling and analysis will benefit from this discussion, particularly those seeking to deepen their understanding of error components in statistical models.

kingwinner
1) "Simple linear regression model: Yi = β0 + β1Xi + εi , i=1,...,n where n is the number of data points, εi is random error
We want to estimate β0 and β1 based on our observed data. The estimates of β0 and β1 are denoted by b0 and b1, respectively."


I don't understand the difference between β0, β1 and b0, b1.
For example, when we see a scatter plot with a least-squares line of best fit, say, y = 8 + 5x, then β0 = 8 and β1 = 5, right? What are b0 and b1 all about? Why do we need to introduce b0 and b1?


2) "Simple linear regression model: Yi = β0 + β1Xi + εi , i=1,...,n where n is the number of data points, εi is random error
Fitted value of Yi for each Xi is: Yi hat = b0 + b1Xi
Residual = vertical deviations = Yi - Yi hat = ei
where Yi is the actual observed value of Y, and Yi hat is the value of Y predicted by the model"


Now I don't understand the difference between random error (εi) and residual (ei). What is the meaning of εi? How are εi and ei different?

Thanks for explaining!
 
The Greek letters denote the true values of the parameters; the Latin letters denote their estimated values. The values of the former do not depend on your sample, while the values of the latter are sample-specific.
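A small simulation sketch may make this concrete (Python/NumPy; the specific numbers and setup are my own illustration, not from the thread). The true parameters β0 and β1 are fixed once, but the least-squares estimates b0 and b1 change from sample to sample:

```python
import numpy as np

# True (population) parameters -- we choose them for the simulation,
# so they play the role of beta0 and beta1 in the model.
beta0, beta1 = 8.0, 5.0
rng = np.random.default_rng(0)

for sample in range(3):
    x = rng.uniform(0, 10, size=50)
    eps = rng.normal(0, 2, size=50)    # random error, unobservable in real data
    y = beta0 + beta1 * x + eps        # data generated by the true model

    # Least-squares estimates b0, b1 -- these depend on the particular sample.
    b1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)
    b0 = y.mean() - b1 * x.mean()
    print(f"sample {sample}: b0 = {b0:.3f}, b1 = {b1:.3f}")
```

Each run of the loop draws a new sample and produces slightly different b0 and b1, all scattered around the fixed true values 8 and 5.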
 
To answer the question on error and observables...

ε in this regression model refers to the unobserved error of the data. In the true model, it represents all the factors that the experimenter cannot see or account for. In many models, it is assumed that ε is uncorrelated with X and that E[ε] = 0.


A residual (e) is something different. It is exactly as you defined it: ei = Yi - Yi hat, the difference between the observed value of the dependent variable and the value predicted by the fitted model. The residuals are also central to how the regression is fit: least squares chooses b0 and b1 precisely to minimize the sum of squared residuals.
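To tie the two ideas together, here is another sketch (again Python/NumPy, my own illustrative setup). Because the data are simulated, we can look at both the random errors εi, which are normally hidden, and the residuals ei, which we compute from the fitted line:

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1 = 8.0, 5.0

x = rng.uniform(0, 10, size=100)
eps = rng.normal(0, 2, size=100)   # epsilon_i: known only because we simulated the data
y = beta0 + beta1 * x + eps

# Fit by ordinary least squares.
b1 = np.cov(x, y, bias=True)[0, 1] / np.var(x)
b0 = y.mean() - b1 * x.mean()

y_hat = b0 + b1 * x                # fitted values Yi hat = b0 + b1*Xi
e = y - y_hat                      # residuals ei = Yi - Yi hat

print("first few errors:   ", np.round(eps[:4], 3))
print("first few residuals:", np.round(e[:4], 3))
print("sum of residuals:   ", round(e.sum(), 10))
```

The residuals track the unobserved errors closely but are not equal to them, because they are measured from the estimated line rather than the true one; and unlike the errors, the residuals from a least-squares fit with an intercept always sum to zero.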
 
