Optimizing H for Accurate Outlier Detection in Weighted Linear Least Squares

In summary, the conversation discusses a problem in which the estimated values for a set of measurements, denoted $\mathbf{\phi}$, depend linearly on another set of variables, denoted $\mathbf{x}$. The weighted linear least squares solution is $\mathbf{x} = (H^TWH)^{-1}H^TW\mathbf{\phi}$, where $W$ is a diagonal matrix. With perfect measurements there is no deviation; however, if one measurement $\phi_i$ deviates strongly, it shifts the estimates of both $\mathbf{x}$ and $\tilde{\mathbf{\phi}}$. To reject stray measurements, it is desirable for the diagonal elements of $H(H^TWH)^{-1}H^TW - I$ to dominate its off-diagonal elements in absolute value, and the thread asks whether this condition can be expressed more directly on $H$ itself.
  • #1
ilia1987
I have the following problem:
I have a set of [tex]m[/tex] measurements [tex]\mathbf{\phi}[/tex]
and I estimate a set of 3 variables [tex]\mathbf{x}[/tex]

The estimated value of [tex]\mathbf{\phi}[/tex] depends linearly on [tex]\mathbf{x}[/tex]: [tex]H\mathbf{x}=\tilde{\mathbf{\phi}}[/tex]
The solution through weighted linear least squares is:
[tex]\mathbf{x} = (H^TWH)^{-1}H^TW\mathbf{\phi}[/tex]
where [tex]W[/tex] is a diagonal matrix.

Suppose first that there is absolutely no deviation between the estimated values of [tex]\mathbf{\phi}[/tex] and the measured values (perfect measurements).
Now, suppose that one measurement [tex]\phi_i[/tex] deviates strongly from its nominal value. In that case the estimate of [tex]\mathbf{x}[/tex] changes, and so does the estimate [tex]\tilde{\mathbf{\phi}}[/tex]. If the actual measurements deviate by [tex]\delta\mathbf{\phi}[/tex], the estimated values deviate by:
[tex]\Delta\tilde{\mathbf{\phi}} = H(H^TWH)^{-1}H^TW\,\delta\mathbf{\phi}[/tex]
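For concreteness, here is a small NumPy sketch (not from the original posts; the matrices H and W and the perturbed index are made up for illustration) that builds the sensitivity matrix [tex]S = H(H^TWH)^{-1}H^TW[/tex] and shows how a single stray error spreads into all estimated values:

[code]
import numpy as np

# Toy problem: m = 6 measurements, 3 unknowns.  H and W are made up for illustration.
rng = np.random.default_rng(0)
m, n = 6, 3
H = rng.normal(size=(m, n))
W = np.diag(rng.uniform(0.5, 2.0, size=m))   # diagonal weight matrix

# Sensitivity matrix S = H (H^T W H)^{-1} H^T W, so that phi_tilde = S @ phi.
S = H @ np.linalg.solve(H.T @ W @ H, H.T @ W)

# A stray error in a single measurement (index 2 chosen arbitrarily)...
dphi = np.zeros(m)
dphi[2] = 1.0
# ...deviates every estimated value by column 2 of S:
print(S @ dphi)
[/code]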

I later use the difference [tex]\tilde{\mathbf{\phi}}-\mathbf{\phi}[/tex] to reject stray measurements.
So a situation where one stray measurement [tex]\delta\phi_i[/tex] causes a large deviation [tex]\Delta\tilde{\phi}_j[/tex] in some other component is undesirable.
I reached the conclusion that the off-diagonal elements of
[tex]H(H^TWH)^{-1}H^TW[/tex] must be smaller than the diagonal elements in order for the stray measurement to be correctly identified.

My hope is that someone on this forum can help me find a nicer, more analytical way to express that condition on [tex]H[/tex] itself rather than on this monstrous expression, or at least point me in the right direction.

Thank you.
 
  • #2
A small correction:

For the matrix
[tex]H (H^TWH)^{-1}H^TW - I_{m \times m}[/tex]
I want the absolute value of the diagonal elements to be larger than the absolute values of the off-diagonal elements.

That's because I want:
[tex]\forall j\neq i:\ |\delta\phi_i| > |\delta\phi_j| \Rightarrow |r_i| > |r_j|[/tex]
where [tex]r_k = \tilde{\phi}_k - \phi_k[/tex] are the residuals.
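To make this condition concrete: if only measurement [tex]i[/tex] is in error, the residual vector is [tex]\delta\phi_i[/tex] times column [tex]i[/tex] of [tex]M = H(H^TWH)^{-1}H^TW - I[/tex], so the requirement is that the diagonal entry dominates each column of [tex]M[/tex] in absolute value. A numerical check might look like the sketch below (my own illustration, assuming that single-error model; the helper name residual_dominance_ok is made up):

[code]
import numpy as np

def residual_dominance_ok(H, W):
    """Return True if, in every column of M = H (H^T W H)^{-1} H^T W - I, the diagonal
    entry is larger in absolute value than all off-diagonal entries.  Under a single
    stray error in measurement i, column i of M is the resulting residual pattern, so
    this is exactly the condition that the largest residual lands on the stray index."""
    m = H.shape[0]
    S = H @ np.linalg.solve(H.T @ W @ H, H.T @ W)
    M = S - np.eye(m)
    diag = np.abs(np.diag(M))
    off = np.abs(M - np.diag(np.diag(M)))   # off-diagonal magnitudes, diagonal zeroed
    return bool(np.all(diag > off.max(axis=0)))
[/code]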
 

1. What are least squares outliers?

Least squares outliers are data points that do not fit the trend or pattern of a data set when using the least squares regression method. They are values that deviate significantly from the expected values and can have a large impact on the overall regression line.

2. How are least squares outliers identified?

Least squares outliers are typically identified by calculating the residuals, which are the differences between the actual values and the predicted values from the regression line. Outliers can be identified by looking at the magnitude of the residuals and comparing them to a threshold value.
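For illustration, here is a minimal NumPy sketch of such a residual threshold; the function name and the MAD-based cutoff are example choices, not the only possibility:

[code]
import numpy as np

def flag_outliers(phi, phi_tilde, k=3.0):
    """Flag measurements whose residual exceeds k times a robust spread estimate."""
    r = phi_tilde - phi                                   # residuals
    sigma = 1.4826 * np.median(np.abs(r - np.median(r)))  # MAD-based scale estimate
    return np.abs(r) > k * sigma
[/code]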

3. What causes least squares outliers?

There are several factors that can cause least squares outliers, including measurement errors, data entry errors, incorrect assumptions about the underlying relationship between variables, or the presence of influential data points that have a disproportionate impact on the regression line.

4. How do least squares outliers affect the regression line?

Least squares outliers can have a significant impact on the regression line: they pull the fitted line toward themselves, resulting in a less accurate fit. This can also lead to an underestimation or overestimation of the strength of the relationship between the variables.

5. How can least squares outliers be dealt with?

There are several ways to deal with least squares outliers, such as removing the outliers from the data set, transforming the data to reduce the impact of the outliers, or using robust regression methods that are less influenced by outliers. It is important to carefully consider the cause of the outliers and the impact of removing them before making any adjustments to the data.
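As a sketch of the last option, here is a minimal iteratively reweighted least squares loop with Huber weights (an illustration only, not code from the thread; c = 1.345 is the usual Huber tuning constant):

[code]
import numpy as np

def irls_huber(H, phi, c=1.345, n_iter=20):
    """Robust fit of phi ~ H x by iteratively reweighted least squares with Huber
    weights: measurements with large residuals are progressively down-weighted, so
    a stray measurement pulls far less on the fitted x than in ordinary least squares."""
    m = H.shape[0]
    w = np.ones(m)
    for _ in range(n_iter):
        W = np.diag(w)
        x = np.linalg.solve(H.T @ W @ H, H.T @ W @ phi)   # weighted LS step
        r = phi - H @ x
        s = 1.4826 * np.median(np.abs(r)) + 1e-12         # robust residual scale
        u = np.abs(r) / s
        w = np.where(u <= c, 1.0, c / u)                  # Huber weight function
    return x
[/code]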
