Optimizing H for Accurate Outlier Detection in Weighted Linear Least Squares

  • Level: Graduate
  • Thread starter: ilia1987
  • Tags: least squares
SUMMARY

This discussion focuses on optimizing the matrix H for accurate outlier detection in weighted linear least squares (WLS) regression. The key equation presented is the estimation of variables $\mathbf{x}$ using the formula $\mathbf{x} = (H^TWH)^{-1}H^TW\mathbf{\phi}$, where W is a diagonal matrix. The author emphasizes the need for the off-diagonal elements of the matrix $H(H^TWH)^{-1}H^TW$ to be smaller than the diagonal elements to effectively identify stray measurements. A correction is made to clarify that the absolute values of the diagonal elements must exceed those of the off-diagonal elements to ensure accurate residuals.

PREREQUISITES
  • Understanding of weighted linear least squares (WLS) regression
  • Familiarity with matrix operations and properties
  • Knowledge of residual analysis in statistical modeling
  • Experience with linear algebra concepts, particularly in relation to diagonal and off-diagonal matrix elements
NEXT STEPS
  • Research the properties of diagonal dominance in matrices
  • Explore advanced techniques in outlier detection within regression analysis
  • Study the implications of residuals in weighted linear least squares
  • Learn about matrix factorization methods that can enhance model robustness
USEFUL FOR

Data scientists, statisticians, and researchers involved in regression analysis and outlier detection will benefit from this discussion, particularly those working with weighted linear least squares methods.

ilia1987
I have the following problem:
I have a set of [tex]m[/tex] measurements [tex]\mathbf{\phi}[/tex]
and I estimate a set of 3 variables [tex]\mathbf{x}[/tex]

The estimated value for [tex]\mathbf{\phi}[/tex] depends linearly on [tex]\mathbf{x}[/tex] : [tex]H\mathbf{x}=\tilde{\mathbf{\phi}}[/tex]
The solution through weighted linear least squares is:
[tex]\mathbf{x} = (H^TWH)^{-1}H^TW\mathbf{\phi}[/tex]
where W is a diagonal weight matrix.
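As a minimal numerical sketch of that solution formula (the matrices H, W, and the measurements below are made-up placeholders, not from this thread):

```python
import numpy as np

# Sketch of the WLS estimate x = (H^T W H)^{-1} H^T W phi
# on a small illustrative problem with noise-free measurements.
rng = np.random.default_rng(0)
m = 8                                   # number of measurements
H = rng.standard_normal((m, 3))         # design matrix mapping x to predicted phi
W = np.diag(rng.uniform(0.5, 2.0, m))   # diagonal weight matrix

x_true = np.array([1.0, -2.0, 0.5])
phi = H @ x_true                        # perfect measurements: phi = H x_true

# Solve the weighted normal equations rather than forming an explicit inverse
x_hat = np.linalg.solve(H.T @ W @ H, H.T @ W @ phi)
print(np.allclose(x_hat, x_true))       # with noise-free phi the estimate is exact
```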

Suppose there is absolutely no deviation between the estimated values of [tex]\mathbf{\phi}[/tex] and the measured values (perfect measurements).
Now suppose that one measurement [tex]\phi_i[/tex] deviates strongly from its nominal value. In that case the estimate of [tex]\mathbf{x}[/tex] is going to change, and so is the estimated value [tex]\tilde{\mathbf{\phi}}[/tex]. If the actual measurements deviate by [tex]\mathbf{\delta\phi}[/tex], the estimated values are going to deviate by:
[tex]\Delta\tilde{\mathbf{\phi}} = H(H^TWH)^{-1}H^TW\,\mathbf{\delta\phi}[/tex]

I later use the difference [tex]\tilde{\phi}-\phi[/tex] to reject stray measurements.
So a situation where one stray measurement [tex]\delta\phi_i[/tex] causes a large deviation [tex]\Delta\tilde{\phi_j}[/tex] in some other component is undesirable.
I reached the conclusion that the off-diagonal elements of
[tex]H (H^TWH)^{-1}H^TW[/tex] must be smaller than the diagonal elements in order for the actual stray measurement to be the one recognized as stray.
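To make that concrete, here is a small sketch (with an illustrative H and W, not data from the thread) that builds the influence matrix [tex]S = H(H^TWH)^{-1}H^TW[/tex] and shows how a single stray measurement propagates into all fitted values:

```python
import numpy as np

# Build the m x m influence matrix S = H (H^T W H)^{-1} H^T W
# and inspect how a perturbation in one measurement spreads.
rng = np.random.default_rng(1)
m = 8
H = rng.standard_normal((m, 3))
W = np.diag(rng.uniform(0.5, 2.0, m))

S = H @ np.linalg.solve(H.T @ W @ H, H.T @ W)   # influence matrix

# A unit perturbation in measurement i shifts the fitted values
# by exactly column i of S: Delta phi_tilde = S @ delta_phi
i = 3
delta_phi = np.zeros(m)
delta_phi[i] = 1.0
delta_phi_tilde = S @ delta_phi
print(np.allclose(delta_phi_tilde, S[:, i]))    # True

# Row-by-row check of the diagonal-dominance condition on S
off = S - np.diag(np.diag(S))
dominant = np.abs(np.diag(S)) > np.abs(off).max(axis=1)
print(dominant)
```

Column i of S is exactly the deviation pattern caused by a stray i-th measurement, so the dominance check above asks whether each measurement influences its own fitted value more than anyone else's.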

My hope is that someone on this forum can help me find a nicer, more analytical way to express that condition on [tex]H[/tex] itself rather than on this monstrous expression, or at least point me in the right direction.

Thank you.
 
A small correction:

For the matrix
[tex]H (H^TWH)^{-1}H^TW - I_{m \times m}[/tex]
I want the absolute value of the diagonal elements to be larger than the absolute values of the off-diagonal elements.

That's because I want:
[tex]\forall (j\neq i):\; |\delta\phi_i| > |\delta\phi_j| \Rightarrow |r_i| > |r_j|[/tex]
where [tex]r_k = \tilde{\phi}_k - \phi_k[/tex] are the residuals.
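Since the residuals are [tex]r = (S - I)\,\delta\phi[/tex] with [tex]S = H(H^TWH)^{-1}H^TW[/tex], the corrected condition can be checked numerically on the matrix [tex]R = S - I[/tex]. A sketch with an illustrative H and W (placeholders, not data from the thread):

```python
import numpy as np

# Residual map: r = (S - I) @ delta_phi. For the largest deviation
# delta_phi_i to produce the largest residual r_i, each row of
# R = S - I should have its diagonal entry dominate in absolute value.
rng = np.random.default_rng(2)
m = 8
H = rng.standard_normal((m, 3))
W = np.diag(rng.uniform(0.5, 2.0, m))

S = H @ np.linalg.solve(H.T @ W @ H, H.T @ W)
R = S - np.eye(m)                              # residual map

diag = np.abs(np.diag(R))
off_max = np.abs(R - np.diag(np.diag(R))).max(axis=1)
print(bool(np.all(diag > off_max)))            # whether dominance holds for this H, W
```

Whether the final check prints True depends on the particular H and W; the point of the thread is to characterize which H guarantee it.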
 
