How to propagate errors through a regression & non-linear model?

  • #1
Master1022
TL;DR Summary
How to calculate uncertainty bounds for the output of a linear regression model
Hi,

I was working on a predictive linear regression model and was hoping to obtain some bounds to represent the uncertainty present in the model.

Question:
I suppose this boils down into two separate components:
1. What is a good measure of uncertainty from a linear regression model? MSE, or perhaps another metric?
2. How can I propagate that metric through a non-linear function?

Context: I have used a certain dataset in 3 different linear regression models to predict variables ## x_1 ##, ## x_2 ##, and ## x_3 ##. I know the mean squared errors for those predictions - each of the predictions uses the same input variables, but the regression weights are different. Then, I am calculating ## y = f(x_1, x_2, x_3) ## where ## f ## is a non-linear function (not complicated, but there are products of ## x_1 \cdot x_2 ## and ## x_1 \cdot x_3 ##). How can I calculate a metric that measures 'uncertainty' such that I can give the output of the model as ## y \pm \Delta y ##?

- ## y ## is being forecast into the future and so I do not have access to data to compare it against and calculate a MSE/metric

Thanks in advance for any help.
 
  • #2
Master1022 said:
How can I calculate a metric that measures 'uncertainty' such that I can give the output of the model as ## y \pm \Delta y ##?

Suppose we take the simple interpretation that the "uncertainty" of ##y## will be ##\sigma_y## , the standard deviation of ##y## when ##y## is considered as a random variable.

If you have a specific probability model, where each random variable involved has (or is assumed to have) a distribution with known parameters then we can discuss calculating the standard deviation of ##y##.

However, if we only assume the random variables involved are from a general family of distributions ( for example, if some have unknown means and variances) then saying we will calculate the standard deviation of ##y## is misleading. A better choice of words is to say that we will estimate the standard deviation of ##y##. The thing we can calculate is an estimator of the standard deviation of ##y##.

What "uncertainty" means in the latter situation is somewhat complicated because we can make an estimate of ##\sigma_y## as a function ##\hat{\sigma}_y## of the data, but in common language terms, there is some uncertainty in our estimate.

Which of the two situations applies to your problem?
 
  • #3
Master1022 said:
I have used a certain dataset in 3 different linear regression models to predict variables x1, x2, and x3. I know the mean squared errors for those predictions - each of the predictions uses the same input variables, but the regression weights are different.
It would seem unlikely to me that these errors are uncorrelated. Can you explain what you did ?
 
  • #4
Stephen Tashi said:
Suppose we take the simple interpretation that the "uncertainty" of ##y## will be ##\sigma_y## , the standard deviation of ##y## when ##y## is considered as a random variable.

If you have a specific probability model, where each random variable involved has (or is assumed to have) a distribution with known parameters then we can discuss calculating the standard deviation of ##y##.

However, if we only assume the random variables involved are from a general family of distributions ( for example, if some have unknown means and variances) then saying we will calculate the standard deviation of ##y## is misleading. A better choice of words is to say that we will estimate the standard deviation of ##y##. The thing we can calculate is an estimator of the standard deviation of ##y##.

What "uncertainty" means in the latter situation is somewhat complicated because we can make an estimate of ##\sigma_y## as a function ##\hat{\sigma}_y## of the data, but in common language terms, there is some uncertainty in our estimate.

Which of the two situations applies to your problem?

Thanks for your response. So my situation is the latter, so it looks like an estimate is what we are aiming for to measure the uncertainty. How would I go about propagating that through a non-linear function?
 
  • #5
BvU said:
It would seem unlikely to me that these errors are uncorrelated. Can you explain what you did ?
Thanks for your response @BvU ! So I basically used all three variables ## x_1 ##, ## x_2 ##, ## x_3 ## to calculate three predicted variables ## a ##, ## b ##, ## c ##. Then the output ## y ## had a form that can be condensed to:
[tex] y = c \cdot (b - a) [/tex]
You are right that the errors for ## a ##, ##b##, and ##c## are likely not independent as they all were predictions using the same three variables (## x_1 ##, ## x_2 ##, ## x_3 ##). What is the best way to deal with such a situation in order to get an error estimate for ##y##?
 
  • #6
Master1022 said:
I basically used all three variables ## x_1 ##, ## x_2 ##, ## x_3 ## to calculate three predicted variables ## a ##, ## b ##, ## c ##. Then the output ## y ## had a form that can be condensed to:
[tex] y = c \cdot (b - a) [/tex]
You are right that the errors for ## a ##, ##b##, and ##c## are likely not independent as they all were predictions using the same three variables (## x_1 ##, ## x_2 ##, ## x_3 ##). What is the best way to deal with such a situation in order to get an error estimate for ##y##?
From what you 'did basically' I can't follow what you did, so all I can give is general advice: find out the correlation matrix and use it to propagate the errors.
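To make the "correlation matrix plus propagation" advice concrete, here is a minimal Python sketch. All the numbers are placeholders, not values from the thread: it assumes you have kept the residual vectors (prediction minus actual) of the three regressions on a common training set, estimates their covariance matrix, and linearizes ## y = c(b-a) ## around the point predictions.

```python
import numpy as np

# Hypothetical residuals of the three regressions (prediction - actual),
# all evaluated on the same training set so their correlations are estimable.
rng = np.random.default_rng(0)
res = rng.normal(size=(100, 3))      # columns: residuals of a, b, c (placeholder data)
cov = np.cov(res, rowvar=False)      # 3x3 sample covariance matrix of the errors

a, b, c = 2.0, 5.0, 1.5              # point predictions (placeholders)

# First-order (linearized) propagation for y = c * (b - a):
# gradient of f(a, b, c) = c*(b - a) with respect to (a, b, c)
g = np.array([-c, c, b - a])
var_y = g @ cov @ g                  # g^T Sigma g, the linearized variance of y
dy = np.sqrt(var_y)
print(f"y = {c * (b - a):.3f} +/- {dy:.3f}")
```

The off-diagonal entries of `cov` are exactly where the correlation between the errors of ##a##, ##b##, ##c## enters; assuming independence would amount to zeroing them out.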
 
  • #7
BvU said:
From what you 'did basically' I can't follow what you did, so all I can give is general advice: find out the correlation matrix and use it to propagate the errors.
Apologies, which part wasn't clear? I can try to explain further.

The same three time series ## x_1 ##, ## x_2 ##, and ## x_3 ## were used as inputs to three different linear regression models. The outputs of these models were ## a ##, ## b ##, and ## c ##. Then these constants were combined in a formula which was of the form ## c \cdot (b - a) ##. Which part was unclear?
 
  • #8
Master1022 said:
How would I go about propagating that through a non-linear function?

"Propagating error through a function ##f(x_1,x_2,x_3)##" is usually defined to mean estimating the standard deviation of the random variable ##y = f(x_1,x_2,x_3)##. Although estimating a standard deviation is a different concept than computing the standard deviation from known parameters of distributions, what is usually done is to make a lot of assumptions that justify using the sample values of parameters (such as mean, variance, covariance, etc.) as if they were the true values of the parameters. So, we end up in case 1 of post #2, even if we are really in case 2!

Proceeding as if we know the true values of all parameters involved, write a Taylor series (multinomial) approximation of ##f()## expanded about the point ##(\overline{x_1}, \overline{x_2}, \overline{x_3})## in powers of ##(x_k - \overline{x_k})##, where ##\overline{ x_k} ## is the mean of ##x_k##. (Use the values of the sample means as the values of the actual means.)

Truncate the expansion. Then compute the standard deviation of ##y## by doing the appropriate integration of the multinomial approximation.

Of course, doing the integration can be complicated, but it's possible in principle since the calculations only involve computing "moments" of multinomial functions. For example, to compute ##\overline{y}## we might have to find the expected value of a term like ##\frac{\partial^2 f}{ \partial {x_1}^2} \frac{\partial f}{\partial x_2} (x_1 - \overline{x_1})^2( x_2 - \overline{x_2}) ##. The values of the partial derivatives are known because we are evaluating them at ##(\overline{x_1}, \overline{x_2},\overline{x_3})##. For the expected value of ##( x_1- \overline{x_1})^2 (x_2 - \overline{x_2} )##, we use the sample mean of the quantity ##(x_1- \overline{x_1})^2 (x_2 - \overline{x_2})##.

That's an outline of common practice. The mathematics of how well this way of estimating ##\sigma_y## works is a different matter.

For typical distributions, using sample values to estimate "higher moments" like ##\overline{ {x_1}^2 x_2 {x_3}^2} ## performs worse (in an appropriate technical sense) than using sample values to estimate lower moments like ##\overline{x_1}## or ## \overline{x_1 x_2} ##. So including a lot of terms in the Taylor series does not necessarily make the estimate of ##\sigma_y## more reliable. The more terms you include, the more higher moments are involved, so the assumption that we can use sample moments as the actual higher moments becomes questionable.
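One way to sanity-check a truncated Taylor estimate is a Monte Carlo simulation: draw correlated samples from an assumed error distribution, push them through ##f##, and compare the sample standard deviation with the first-order (delta-method) answer. A sketch, with an assumed multivariate normal error model and placeholder means and covariances:

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed means and covariance of (a, b, c) -- placeholders, not fitted values.
mu = np.array([2.0, 5.0, 1.5])
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.01]])

# Draw correlated samples and push them through f(a, b, c) = c * (b - a).
samples = rng.multivariate_normal(mu, cov, size=100_000)
y = samples[:, 2] * (samples[:, 1] - samples[:, 0])
print(f"Monte Carlo: mean(y) = {y.mean():.3f}, sigma_y ~= {y.std(ddof=1):.3f}")

# First-order delta-method estimate for comparison:
a, b, c = mu
g = np.array([-c, c, b - a])           # gradient of f at the means
print(f"delta method: sigma_y ~= {np.sqrt(g @ cov @ g):.3f}")
```

When the two disagree badly, the nonlinearity matters at the scale of the errors and a first-order truncation is too crude.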
 
  • #9
The "goodness of fit" of a regression is usually measured by the coefficient of determination. This coefficient is defined as [itex] r^{2}=\frac{\sum (Y_{est}-\bar{Y})^{2}}{\sum(Y-\bar{Y})^{2}}[/itex] where Y denotes the observed values, Yest are the values you get from your regression model and [itex] \bar{Y}[/itex] is the mean value of the Ys. r2 varies between 0 and 1, where 1 denotes a perfect fit and 0 denotes no explanatory power.
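That formula translates directly into code. A minimal sketch with made-up observed values and predictions (not data from the thread):

```python
import numpy as np

# Toy data: assumed observed values and regression predictions.
Y = np.array([1.0, 2.1, 2.9, 4.2, 5.1])        # observed
Y_est = np.array([1.1, 2.0, 3.0, 4.0, 5.0])    # model predictions

# r^2 = sum((Y_est - Ybar)^2) / sum((Y - Ybar)^2)
r2 = np.sum((Y_est - Y.mean()) ** 2) / np.sum((Y - Y.mean()) ** 2)
print(f"r^2 = {r2:.3f}")
```

Note that this ratio is guaranteed to lie in [0, 1] only for a least-squares fit with an intercept; for arbitrary predictions the equivalent form ##1 - SSE/SS_{tot}## is the safer definition.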
 

1. How do I calculate the standard error of regression coefficients in a linear model?

In a linear model, the standard error of a regression coefficient can be calculated as the square root of the mean squared error (MSE) divided by the sum of squared deviations of the corresponding predictor from its mean.
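For simple regression this works out as ##SE(\hat{\beta}_1) = \sqrt{MSE / \sum (x_i - \bar{x})^2}## with ##MSE = SSE/(n-2)##. A short sketch on toy data (the numbers are placeholders):

```python
import numpy as np

# Toy simple-regression data (placeholders).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares fit y = b0 + b1*x
sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx
b0 = y.mean() - b1 * x.mean()

resid = y - (b0 + b1 * x)
mse = np.sum(resid ** 2) / (len(x) - 2)   # n - 2 residual degrees of freedom
se_b1 = np.sqrt(mse / sxx)                # standard error of the slope
print(f"b1 = {b1:.3f} +/- {se_b1:.3f}")
```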

2. What is the difference between standard error and standard deviation in a regression model?

The standard error in a regression model measures the accuracy of the regression coefficients, while the standard deviation measures the variability of the data points around the mean. Standard error is used to assess the precision of the regression model, while standard deviation is used to assess the spread of the data.

3. How do I propagate errors through a non-linear regression model?

In a non-linear regression model, the standard error of the regression coefficients can be calculated using the delta method. This involves taking the partial derivatives of the model with respect to each parameter and then plugging in the estimated values to calculate the standard error.

4. What is the importance of calculating the standard error in a regression model?

The standard error is important in a regression model because it allows us to assess the precision and reliability of the estimated coefficients. A smaller standard error indicates a more precise and accurate model, while a larger standard error indicates more uncertainty in the estimated coefficients.

5. Can the standard error be used to determine the significance of a regression coefficient?

Yes, the standard error can be used to calculate the t-statistic, which is then used to calculate the p-value. The p-value represents the probability of obtaining a coefficient as extreme as the one observed if the null hypothesis (that the coefficient is equal to zero) is true. A lower p-value indicates a more significant coefficient.
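As a sketch of that calculation, assuming a coefficient estimate and standard error like those above (placeholder numbers) and a t distribution with ##n-2## degrees of freedom:

```python
from scipy import stats

# Hypothetical slope estimate, its standard error, and the sample size.
b1, se_b1, n = 1.99, 0.06, 5

t_stat = b1 / se_b1                        # tests H0: coefficient = 0
df = n - 2                                 # residual degrees of freedom
p_value = 2 * stats.t.sf(abs(t_stat), df)  # two-sided p-value
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```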
