Weighted least squares residuals

In summary, the thread discusses which definition of residuals to examine after a weighted least squares fit of a model to data: the weight-normalized form or the usual unweighted form, each scaled by the statistical leverages and the residual standard deviation. One reply suggests using the usual residual together with the leverage-adjusted residual, as in the bisquare weight method. Another advises stating the problem precisely and linking to the M-estimation example being followed.
  • #1
vibe3
Hello,

When doing a weighted least squares fit of a model to data, I want to examine the residuals to see if their histogram matches the expected probability distribution. Since I am minimizing
[tex]
\chi^2 = \sum_i w_i \left[ y_i - Y(x_i) \right]^2
[/tex]
would I define my (normalized/studentized) residuals as
[tex]
r_i = \frac{\sqrt{w_i} ( y_i - Y(x_i) )}{\sigma \sqrt{1 - h_i}}
[/tex]
or in the usual way as
[tex]
r_i = \frac{y_i - Y(x_i)}{\sigma \sqrt{1 - h_i}}
[/tex]
where [itex]h_i[/itex] are the statistical leverages and [itex]\sigma[/itex] is the residual standard deviation? I am using a robust (iterative) least squares procedure to determine the weights [itex]w_i[/itex] in order to detect and eliminate outliers, so the residual histogram should match the distribution expected for the M-estimator function I'm using (Huber in my case).
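
To compare the two candidate definitions numerically, here is a minimal Python sketch; it is not the poster's actual code. It assumes a straight-line model Y(x) = a + b x, Huber weights with the conventional tuning constant 1.345, a scale estimate of median(|e|)/0.6745, and a residual variance of sum(w e^2)/(n - 2). The function names and the sample data are hypothetical.

```python
import math
import statistics


def wls_line(x, y, w):
    """Weighted least squares fit of Y(x) = a + b*x.

    Returns (a, b, h), where h holds the leverages, i.e. the diagonal of
    the weighted hat matrix W^(1/2) X (X^T W X)^(-1) X^T W^(1/2).
    """
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    h = [wi * (1.0 / sw + (xi - xbar) ** 2 / sxx) for wi, xi in zip(w, x)]
    return a, b, h


def huber_weights(e, k=1.345):
    """Huber weights: 1 inside k*s, k*s/|e| outside (s is an assumed scale)."""
    s = statistics.median([abs(ei) for ei in e]) / 0.6745
    if s == 0.0:
        return [1.0] * len(e)
    return [1.0 if abs(ei) <= k * s else k * s / abs(ei) for ei in e]


def huber_irls(x, y, n_iter=20):
    """Iteratively reweighted least squares with Huber weights."""
    w = [1.0] * len(x)
    for _ in range(n_iter):
        a, b, _h = wls_line(x, y, w)
        e = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        w = huber_weights(e)
    a, b, h = wls_line(x, y, w)
    e = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    return a, b, w, h, e


# Hypothetical data: y = 1 + 2x plus small noise, with one outlier at x = 7.
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, -0.05, 0.1, -0.15, 0.2]
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi + ni for xi, ni in zip(x, noise)]
y[7] += 20.0

a, b, w, h, e = huber_irls(x, y)

# Assumed definition of sigma: weighted residual variance with n - 2 d.o.f.
sigma = math.sqrt(sum(wi * ei ** 2 for wi, ei in zip(w, e)) / (len(x) - 2))

# First candidate: residual scaled by sqrt(w_i).
r_weighted = [math.sqrt(wi) * ei / (sigma * math.sqrt(1.0 - hi))
              for wi, ei, hi in zip(w, e, h)]
# Second candidate: the usual studentized form without the weight.
r_plain = [ei / (sigma * math.sqrt(1.0 - hi)) for ei, hi in zip(e, h)]

for xi, wi, rw, rp in zip(x, w, r_weighted, r_plain):
    print(f"x={xi:4.1f}  w={wi:6.3f}  r_weighted={rw:8.3f}  r_plain={rp:8.3f}")
```

For points with w_i = 1 the two definitions coincide; they differ only for the down-weighted points, so a histogram of each makes the practical difference visible.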
 
  • #2
I don't know what a "statistical leverage" is.

If the weights are just something used in the fitting process and are not parameters of the predictive model, then, by the usual definition of "residual", the weights are not involved in calculating an individual residual of the model. You'll get better advice if you state the problem precisely. If you are using an example of M-estimation as a guide, can you give a link to it?
 
  • #3
It looks to me like you're using a bisquare weight method. You will want to use the normal definition of the residual, with the adjusted residual defined as [tex]r_{adj} = \frac{r_{i}}{\sqrt{1-h_i}}[/tex], where [itex]r_{i}[/itex] is your residual as defined normally and [itex]h_{i}[/itex] is your leverage. Matlab or SAS will produce this by default for weighted least squares.
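
As an illustration of the adjusted residual in this reply, here is a minimal Python sketch of how it can feed bisquare weights. The tuning constant 4.685 and the scale estimate median(|r|)/0.6745 are conventional choices assumed here, not stated in the thread, and the function name and input values are hypothetical.

```python
import math
import statistics


def bisquare_weights(residuals, leverages, tune=4.685):
    """Return (r_adj, weights) for the bisquare (Tukey biweight) scheme.

    r_adj[i] = r_i / sqrt(1 - h_i) is the leverage-adjusted residual.
    The scale s = median(|r|) / 0.6745 is an assumed robust estimate.
    """
    r_adj = [r / math.sqrt(1.0 - h) for r, h in zip(residuals, leverages)]
    s = statistics.median([abs(r) for r in residuals]) / 0.6745
    weights = []
    for ra in r_adj:
        u = ra / (tune * s)
        # Points with |u| >= 1 are treated as outliers and get zero weight.
        weights.append((1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0)
    return r_adj, weights


# Hypothetical ordinary residuals and leverages; the fourth point is an outlier.
residuals = [0.1, -0.2, 0.05, 5.0, -0.1]
leverages = [0.6, 0.3, 0.2, 0.3, 0.6]

r_adj, weights = bisquare_weights(residuals, leverages)
for r, ra, wt in zip(residuals, r_adj, weights):
    print(f"r={r:6.2f}  r_adj={ra:7.3f}  weight={wt:5.3f}")
```

Unlike Huber weights, which decay smoothly but never reach zero, the bisquare weight drops to exactly zero beyond the cutoff, so the outlier is removed from the next iteration of the fit.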
 

What is the purpose of weighted least squares residuals?

Weighted least squares accounts for heteroscedasticity, or unequal variance, in a dataset, and its residuals are examined to check whether that weighting was adequate. In traditional least squares regression, all data points are assumed to have equal variance. However, this is often not the case in real-world data. Weighted least squares weights each data point according to its variance, giving more weight to points with lower variance and less weight to points with higher variance.

How do you calculate weighted least squares residuals?

A weighted residual is the ordinary residual [itex]y_i - Y(x_i)[/itex] from the weighted fit multiplied by the square root of its weight, [itex]\sqrt{w_i}[/itex], so that the sum of the squared weighted residuals equals the minimized quantity [itex]\sum_i w_i \left[ y_i - Y(x_i) \right]^2[/itex]. The weights are typically based on the inverse of the variance of each data point. The regression coefficients are the values that minimize this weighted sum, and the weighted residuals are then used for diagnostics and other statistics.
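
The relationship between weighted residuals and the minimized sum can be checked with a few lines of Python. This is a minimal sketch with hypothetical data, assuming inverse-variance weights [itex]w_i = 1/\sigma_i^2[/itex] and fitted values that are already known.

```python
import math

# Hypothetical observations, fitted values, and per-point standard deviations.
y = [1.1, 1.9, 3.2, 3.8]
y_fit = [1.0, 2.0, 3.0, 4.0]
sigma = [0.1, 0.1, 0.2, 0.2]

# Inverse-variance weights (an assumption; other weighting schemes exist).
w = [1.0 / s ** 2 for s in sigma]

# Ordinary residuals and weighted residuals sqrt(w_i) * (y_i - Y(x_i)).
raw = [yi - fi for yi, fi in zip(y, y_fit)]
weighted = [math.sqrt(wi) * ri for wi, ri in zip(w, raw)]

# The weighted sum of squares computed two equivalent ways.
chi2 = sum(wi * ri ** 2 for wi, ri in zip(w, raw))
chi2_from_weighted = sum(r ** 2 for r in weighted)

print("raw residuals:     ", raw)
print("weighted residuals:", weighted)
print("chi2:", chi2, chi2_from_weighted)
```

Here the raw residuals differ in size by a factor of two, but after weighting each has a magnitude close to 1, which is what makes weighted residuals comparable across points with different variances.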

When should weighted least squares residuals be used?

Weighted least squares residuals should be used when there is evidence of heteroscedasticity in the data. This may be seen in a plot of the residuals against the predicted values, where the spread of the points increases or decreases as the predicted values increase. Outliers or influential data points may also warrant a robust variant in which the weights are re-estimated iteratively, as in the Huber fit discussed in the thread.

What are the advantages of using weighted least squares residuals?

The use of weighted least squares residuals can improve the accuracy and precision of regression estimates by accounting for unequal variance in the data. This can lead to more reliable and valid statistical inferences. Additionally, weighted least squares can help to reduce the impact of outliers and influential data points on the regression results.

Are there any potential drawbacks to using weighted least squares residuals?

One potential drawback to using weighted least squares residuals is that it requires the assumption of a specific weighting scheme, which may not always be known or accurate. Additionally, weighted least squares may not be appropriate for all types of data or models. It is important to carefully consider the data and research question before deciding to use weighted least squares.
