# Multiple variable prediction interval

1. Jun 24, 2014

### Uniquebum

Hey!

I'm working on some regression-related material at the moment and I need some help with the prediction interval for multiple variables. The prediction interval for a single variable can be calculated using

$$PI = \hat{\beta_0}+\hat{\beta_1}x_i \pm t^* s_e \sqrt{1+\frac{1}{n} + \frac{(x_i-mean(x))^2}{S_{xx}}}$$

where x can be thought of as a one-dimensional vector (or matrix/set) which holds the values x_0, x_1, x_2 and so on. Also, $\hat{\beta_0}+\hat{\beta_1}x_i$ is the linear regression line $\hat{y}$. Finally, $t^*$ is the t-percentile, $s_e$ is the standard deviation of the residuals, $n$ is the number of points in the sample and $S_{xx} = \sum{(x_i-mean(x))^2}$, summed over $i$ from 1 to $n$.
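For reference, the single-variable formula above can be sketched in Python. This is only a minimal sketch: the function name `simple_prediction_interval`, the sample data, and the choice to pass the t-percentile in as `t_star` (looked up from a table rather than computed) are assumptions made for illustration.

```python
import math


def simple_prediction_interval(x, y, x_new, t_star):
    """Prediction interval for a new response at x_new (simple linear regression).

    t_star is the t-percentile for n - 2 degrees of freedom, supplied by the
    caller (hypothetical parameter, e.g. read from a t-table).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    # Least-squares estimates of the slope and intercept
    beta1 = s_xy / s_xx
    beta0 = y_bar - beta1 * x_bar

    # s_e: standard deviation of the residuals, with n - 2 degrees of freedom
    sse = sum((yi - (beta0 + beta1 * xi)) ** 2 for xi, yi in zip(x, y))
    s_e = math.sqrt(sse / (n - 2))

    y_hat = beta0 + beta1 * x_new
    half_width = t_star * s_e * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / s_xx)
    return y_hat - half_width, y_hat + half_width


# Made-up sample data, purely for illustration
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
# 3.182 is approximately the 97.5th t-percentile for 3 degrees of freedom
lower, upper = simple_prediction_interval(x, y, x_new=3.0, t_star=3.182)
print(lower, upper)
```

The interval is narrowest at `x_new = mean(x)`, where the last term under the square root vanishes, and widens as `x_new` moves away from the sample mean.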

Now what does the equation look like for multiple variable regression?

I'd suppose $\hat{\beta_0}+\hat{\beta_1}x_i$ is easily changed to
$$\hat{\beta_0}+\hat{\beta_1}x_{0i}+\hat{\beta_2}x_{1i}+\hat{\beta_3}x_{2i}+...$$
but what do I do with
$$\frac{(x_i-mean(x))^2}{S_{xx}}$$
?

2. Jul 1, 2014

### Greg Bernhardt

I'm sorry you are not generating any responses at the moment. Is there any additional information you can share with us? Any new findings?

3. Jul 5, 2014

### FactChecker

Off the top of my head, I would say that $s_e$ would be replaced by a cross-covariance matrix of the $x_{j}$s and that the square root would be replaced by a vector where each element is calculated with the square root equation.

PS. Your equations should drop the i subscript where x is now an arbitrary input rather than the sample data point i.

PPS. I don't know which sign of the square root to pick. I think that an authoritative answer to your OP will take more expertise than I have.

Last edited: Jul 5, 2014

4. Jul 6, 2014

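You'll find the formulae if you look in a book on multiple regression, linear models, or basic multivariate analysis. Essentially, you replace the quantity you ask about with its matrix equivalent. If $\widehat y$ is the fitted value from the equation, $X$ is the design matrix (with a leading column of ones for the intercept), $\hat{\sigma}^2$ is the estimated residual variance, $t$ is the appropriate t-percentile, and $\mathbf{x}_0$ is the vector of specified predictor values (with a leading 1), the interval estimate for the mean value of the response is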

$$\widehat y \pm t \sqrt{\, \hat{\sigma}^2 \mathbf{x}'_0 \left(X' X\right)^{-1} \mathbf{x}_0 }$$

If you want the prediction interval for a particular new observation, it is

$$\widehat y \pm t \sqrt{\, \hat{\sigma}^2 \left(1 + \mathbf{x}'_0 \left(X' X\right)^{-1} \mathbf{x}_0 \right) }$$
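The two matrix formulas above can be sketched with NumPy as follows. This is a rough illustration under stated assumptions, not a definitive implementation: the function name `prediction_interval`, the made-up data, and the `t_star` argument (the t-percentile for n - p degrees of freedom, where p counts the coefficients including the intercept, passed in by the caller) are all hypothetical.

```python
import numpy as np


def prediction_interval(X, y, x0, t_star):
    """Return (y_hat, mean_interval, prediction_interval) at predictor values x0.

    X      : (n, k) predictor values, WITHOUT the intercept column
    x0     : length-k predictor values for the new point
    t_star : t-percentile for n - k - 1 degrees of freedom (supplied by caller)
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]

    # Prepend ones so the first coefficient is the intercept
    A = np.column_stack([np.ones(n), X])
    a0 = np.concatenate([[1.0], np.atleast_1d(np.asarray(x0, dtype=float))])
    p = A.shape[1]

    # Least-squares coefficients and residual variance estimate
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    residuals = y - A @ beta
    sigma2_hat = residuals @ residuals / (n - p)

    # x0' (X'X)^{-1} x0, via a linear solve instead of an explicit inverse
    leverage = a0 @ np.linalg.solve(A.T @ A, a0)

    y_hat = a0 @ beta
    half_mean = t_star * np.sqrt(sigma2_hat * leverage)
    half_pred = t_star * np.sqrt(sigma2_hat * (1.0 + leverage))
    return (
        y_hat,
        (y_hat - half_mean, y_hat + half_mean),
        (y_hat - half_pred, y_hat + half_pred),
    )


# Made-up data with two predictors, purely for illustration
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 6], [6, 5]]
y = [3.1, 3.9, 7.2, 6.8, 11.1, 10.9]
# n = 6, p = 3, so 3 degrees of freedom; 3.182 is roughly the 97.5th percentile
y_hat, mean_iv, pred_iv = prediction_interval(X, y, x0=[3.5, 3.5], t_star=3.182)
print(y_hat, mean_iv, pred_iv)
```

With a single predictor and $\mathbf{x}_0 = (1, x)'$, the quadratic form reduces to

$$\mathbf{x}'_0 \left(X' X\right)^{-1} \mathbf{x}_0 = \frac{1}{n} + \frac{(x-mean(x))^2}{S_{xx}}$$

so the second interval agrees with the single-variable formula when $\hat{\sigma} = s_e$.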

5. Jul 7, 2014

### Uniquebum

Thanks a lot for the replies. I looked through a couple of books, but they only covered multiple variable regression in too vague a manner. This'll help me move forward. Thanks again.