Prediction Intervals (Critical Levels)

In summary: The confidence interval (CI) and prediction interval (PI) formulas differ only in their final factor. The CI uses sqrt(1/n), which reflects the standard error of the sample mean, while the PI uses sqrt(1+1/n), which adds the variance of a single new observation to the variance of the sample mean. The PI is therefore always wider than the CI computed from the same sample. For two samples of equal size at the same significance level, both widths are proportional to the sample standard deviation, so the sample with the narrower CI should also have the narrower PI; the Sample A versus Sample B discrepancy raised in the thread is not explained by the formulas themselves.
  • #1
kimberley
Hello Enuma et al.

I have a few related questions regarding Prediction Intervals for outliers (individual data points). The basic formula that I use is: Mean +/- T(-1df)*sd*sqrt(1+1/n), where Mean is the sample mean, T is the critical value from the Student's t-distribution table with n-1 degrees of freedom, and sd is the sample standard deviation. This formula can be found in many publications, and at Wikipedia's entry on "Prediction Intervals". The formula for Confidence Intervals is the same except for the final part of the formula, which is sqrt(1/n) as opposed to sqrt(1+1/n).

QUESTIONS:

1. What is the purpose of the last part of these formulas (i.e., sqrt(1/n) for CI and sqrt(1+1/n) for PI)? In other words, what do these things seek to adjust? For the CI, it would seem that it's bringing the Standard Error of the Mean into the formula (i.e., sd*sqrt(1/n)). But what about the PI--why is it 1+1/n? Is this bringing the Standard Error of the Estimate into the formula? I don't get it. Is it adjusting the sd forward or something?

2. If you wanted to use the Prediction Interval and Confidence Interval formulas and project out five days, would you change the degrees of freedom for the Z-score to -5? And what about the sqrt(1/n) and sqrt(1+1/n)? Would you change them to 5/n and 5+5/n? As I state above, I just don't get it.

3. If I have a sample where n=207, and it follows a normal distribution, how many data points should be more than 1.97 sd (.05) below the Mean in a two-tailed test? How many should be more than 2.6 sd (.01) below the Mean?

4. Finally, what is it about the formulas for Confidence Intervals and Prediction Intervals which would cause Sample A to have a narrower Confidence Interval at .05 (in percentage terms) than Sample B, but a wider Prediction Interval at .05 (in percentage terms) than Sample B? The samples are of equal size and both satisfy tests for normality. I don't understand what would cause Sample A to have a narrower confidence interval, but a wider prediction interval. The only thing which is different in the PI formula, as opposed to the CI formula, is the sqrt(1+1/n) instead of sqrt(1/n).

Thank you.
 
  • #2
As per PF posting guidelines, any and all questions at the undergrad level must be posted under the relevant homework section. All the more so if they are related to undergrad coursework.

If this is the case for you, please request this thread to be moved to the relevant homework section. If this is not the case, please explain so I or others may proceed to answer.
__________________________
EDIT: I had misinterpreted the guidelines. All HW-style questions should be posted in the HW sections. I think your questions are fine here.
 
  • #3
Hello,

you want to predict [tex]X_{n+1}[/tex] from [tex]X_{1}, \ldots, X_{n}[/tex] without knowing [tex]\mu[/tex] and [tex]\sigma[/tex].
So, as you said,
1. the estimated variance of [tex]\bar{X}_{n}[/tex] is [tex]S_{n}^{2}/n[/tex]
2. the estimated variance of [tex]X_{i}[/tex] is [tex]S_{n}^{2}[/tex]
3. as a result, because [tex]X_{n+1}[/tex] is independent of [tex]\bar{X}_{n}[/tex], the two variances add, and the estimated variance of [tex]X_{n + 1} - \bar{X}_{n}[/tex] is [tex]S_{n}^2 (1 + 1/n)[/tex]
So you understand where your factor [tex]\sqrt{1 + 1/n}[/tex] comes from. The 1 is the variance of the new observation itself, and the 1/n is there because you do not know the mean [tex]\mu[/tex] of your distribution and have to estimate it with [tex]\bar{X}_{n}[/tex].
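
The variance claim in step 3 can be checked numerically. The following is only a minimal sketch, assuming normally distributed data with a known mean and standard deviation chosen for the simulation; the function name and the parameters `trials` and `seed` are made up for illustration and are not part of the thread.

```python
import random
import statistics

def simulate_prediction_error_variance(n, mu=0.0, sigma=1.0, trials=20000, seed=0):
    """Empirical variance of X_{n+1} - mean(X_1..X_n) over many simulated samples."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    errors = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        new_point = rng.gauss(mu, sigma)  # the observation being predicted
        errors.append(new_point - statistics.fmean(sample))
    return statistics.variance(errors)

n = 10
sigma = 1.0
empirical = simulate_prediction_error_variance(n, sigma=sigma)
theoretical = sigma ** 2 * (1 + 1 / n)  # sigma^2 * (1 + 1/n)
print(f"empirical: {empirical:.3f}, theoretical: {theoretical:.3f}")
```

The two numbers should agree closely; the simulated variance of the sample mean alone would instead be close to sigma^2/n.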
 
  • #4
Huh?

EnumaElish said:
As per PF posting guidelines, any and all questions at the undergrad level must be posted under the relevant homework section. All the more so if they are related to undergrad coursework.

If this is the case for you, please request this thread to be moved to the relevant homework section. If this is not the case, please explain so I or others may proceed to answer.

Huh? I haven't been in school in years. I have a B.A. in Political Science. I also have a J.D. My math background is limited to high school and one college course. Thus, what's homework got to do with it? Do you have to be a math or science major to post questions on this forum, or can you be someone who is attempting to learn matters outside of their discipline? I didn't see anything in the Rules pertaining to this issue, even after I was forced to read it 8 times, although it said it would only be "once." Perhaps a Thread entitled "Beginners" or something is in order, no? I couldn't imagine a thread on Constitutional Theory being this pretentious.
 
  • #5
Thank you

Barmecides said:
Hello,

you want to predict [tex]X_{n+1}[/tex] from [tex]X_{1}...X_{n}[/tex] without knowing [tex]\mu[/tex] and [tex]\sigma[/tex].
So, as you said,
1. the estimated variance of [tex]\bar{X}_{n}[/tex] is [tex]S_{n}^{2}/n[/tex]
2. the estimated variance of [tex]X_{i}[/tex] is [tex]S_{n}^{2}[/tex]
3. as a result the estimated variance of [tex]X_{n + 1} - \bar{X}_{n}[/tex] is [tex]S_{n}^2 (1 + 1/n)[/tex]
So you understand where your factor [tex]\sqrt{1 + 1/n}[/tex] comes from. This is related to the fact you do not know the mean [tex]\mu[/tex] of your distribution.

I appreciate your effort.
 
  • #6
What is the purpose of the last part of these formulas (i.e., sqrt(1/n) for CI and sqrt(1+1/n) for PI)? In other words, what do these things seek to adjust? For the CI, it would seem that it's bringing the Standard Error of the Mean into the formula (i.e., sd*sqrt(1/n)). But what about the PI--why is it 1+1/n? Is this bringing the Standard Error of the Estimate into the formula? I don't get it. Is it adjusting the sd forward or something?
From Wikipedia, [itex]CI = \bar X \pm t \times \sigma/\sqrt n[/itex] and [itex]PI = \bar X \pm t \times \sigma\sqrt {1+1/n}[/itex]. The usual notation is to associate the t with a degrees of freedom (df) and a significance level (probability) [itex]\alpha\text{, e.g. } t_\alpha(df)[/itex].

I will use the formulas I cited above.

1. If each random variable [itex]X_i[/itex] is distributed normal with [itex]\text{mean }\mu\text{ and variance }\sigma^2[/itex] then the sample average [itex]\bar X=\sum_{i=1}^n X_i/n[/itex] is distributed normal with [itex]\text{mean }\mu\text{ and variance }\sigma_{\bar X}^2 = \sigma^2/n[/itex].

2. If random variable Y is distributed normal with [itex]\text{mean }\mu\text{ and variance }\sigma_Y^2[/itex] then the "standard" random variable [itex]Z = (Y - \mu)/\sigma_Y[/itex] is distributed normal with mean 0 and variance 1.

It follows that [itex](\bar X-\mu)/\sigma_{\bar X}[/itex] is distributed normal with mean 0 and variance 1. If you start with a CI for the standardized variable [itex]-z_\alpha < (\bar X-\mu)/\sigma_{\bar X} < z_\alpha[/itex] then you can re-write it as [itex]\mu-z_\alpha \sigma_{\bar X}< \bar X < \mu+z_\alpha \sigma_{\bar X}[/itex] by first multiplying through with [itex]\sigma_{\bar X}[/itex] (step 1) then adding through [itex]\mu[/itex] (step 2). Now make the substitution [itex]\sigma_{\bar X} = \sigma/\sqrt n[/itex] and you have the CI stated as [itex]\mu-z_\alpha \sigma/\sqrt n< \bar X < \mu+z_\alpha \sigma/\sqrt n[/itex]. This is where the [itex]\sqrt n[/itex] comes from. (I.e. you are right as to where it comes from.)

When the variance parameter [itex]\sigma^2[/itex] is not known, it has to be estimated from data. When working with the estimated variance [itex]s^2[/itex], the normal distribution is replaced with the t distribution, so that the standard variable [itex](\bar X-\mu)/s_{\bar X}[/itex] is distributed t with the corresponding degrees of freedom (n-1 for a single sample). But the mechanics of the CI remain the same: you start with [itex]-t_\alpha(df) < (\bar X-\mu)/s_{\bar X} < t_\alpha(df)[/itex] and end with [itex]\mu-t_\alpha(df)\times s/\sqrt n< \bar X < \mu+t_\alpha(df)\times s/\sqrt n[/itex]. See http://en.wikipedia.org/wiki/Confidence_interval#Theoretical_example

For a PI, the [itex]\mu[/itex] term (which is a "nature given" constant) in all of the above is replaced by [itex]X_p[/itex], the data point being predicted. Unlike constant [itex]\mu[/itex], [itex]X_p[/itex] has its own variance, which gets added to that of [itex]\bar X[/itex] in the above formulas. That is, the variance of [itex]X_p - \bar X[/itex] is no longer [itex]s^2_{\bar X}[/itex], it is [itex]s^2+s^2_{\bar {X}} = s^2+s^2/n = s^2(1+1/n)[/itex]. The square root of this is the standard deviation, which is [itex]s\sqrt{1+1/n}[/itex].

So the PI, written for [itex]X_p - \bar X[/itex], is [itex]-t_\alpha(df)\times s\sqrt{1+1/n}< X_p - \bar X < t_\alpha(df)\times s\sqrt{1+1/n}[/itex]. Adding [itex]\bar X[/itex] through gives [itex]\bar X-t_\alpha(df)\times s\sqrt{1+1/n}< X_p < \bar X+t_\alpha(df)\times s\sqrt{1+1/n}[/itex], which is the formula you started with.
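
To make the two formulas concrete, here is a small sketch that computes both intervals from one sample. The data values are invented for illustration, the function name `mean_ci_and_pi` is hypothetical, and the critical value 2.262 (two-sided 95%, 9 degrees of freedom) is taken from a standard t table and hard-coded rather than computed; a library routine such as `scipy.stats.t.ppf` could supply it instead.

```python
import math
import statistics

def mean_ci_and_pi(sample, t_crit):
    """Return (CI, PI) as (low, high) pairs for one sample and a given t critical value."""
    n = len(sample)
    mean = statistics.fmean(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n-1 denominator)
    ci_half = t_crit * s * math.sqrt(1 / n)      # half-width for the mean
    pi_half = t_crit * s * math.sqrt(1 + 1 / n)  # half-width for one new observation
    return (mean - ci_half, mean + ci_half), (mean - pi_half, mean + pi_half)

# Made-up measurements, n = 10
data = [9.8, 10.4, 10.1, 9.6, 10.3, 10.0, 9.9, 10.2, 9.7, 10.0]
T_CRIT_95_DF9 = 2.262  # assumed table value: two-sided 95%, df = n - 1 = 9

ci, pi = mean_ci_and_pi(data, T_CRIT_95_DF9)
print("CI:", ci)
print("PI:", pi)
print("PI width / CI width:", (pi[1] - pi[0]) / (ci[1] - ci[0]))
```

Both intervals are centered on the same sample mean; only the half-width changes, and the PI is wider by a factor that depends on n alone.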
 
  • #7
If you wanted to use the Prediction Interval and Confidence Interval formulas and project out five days, would you change the degrees of freedom for the Z-score to -5? And what about the sqrt(1/n) and sqrt(1+1/n)? Would you change them to 5/n and 5+5/n?
No, you should not replace anything with "5". Suppose you estimate the equation Y(X) = 200 - 3X, where X is the number of days since the accident and Y is the dollar damage amount for that day (e.g. lost wages for Thursday, August 2nd). Suppose your data span days 1 through 10 after the accident, but you'd like to know the damages on day 15. So you forecast: Yp(15) = 200 - 3*15 = 200 - 45 = 155. The forecast error is FE = Yp(15) - Y(15), where Y(15) is the unknown true value of the damages on day 15. The variance of the FE is [itex]\sigma^2_F=\sigma^2\left(1+1/n+(15-5.5)^2/\sum_{i=1}^{10}(X_i - 5.5)^2\right)[/itex], where n = 10, 5.5 is the average of days 1-10, and X1 = 1, X2 = 2, ..., X10 = 10. As you can see, [itex]\sigma^2_F[/itex], and hence [itex]\sigma_F[/itex], would be larger for Xp = 25 than for Xp = 15.

I hope this is a relevant example for your area of law.
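
The forecast-variance formula in this example can be evaluated directly. This is a sketch only: the function name is hypothetical, and it returns just the bracketed multiplier of [itex]\sigma^2[/itex], since the thread gives no value for [itex]\sigma^2[/itex] itself.

```python
def forecast_variance_factor(xs, x_new):
    """Multiplier of sigma^2 in the forecast-error variance of a simple linear regression."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)  # sum of squared deviations of the observed X
    return 1 + 1 / n + (x_new - x_bar) ** 2 / sxx

days = list(range(1, 11))      # observed days 1..10, average 5.5
forecast_15 = 200 - 3 * 15     # point forecast from the fitted line in the example

factor_15 = forecast_variance_factor(days, 15)
factor_25 = forecast_variance_factor(days, 25)
print("forecast for day 15:", forecast_15)
print("variance factor at day 15:", round(factor_15, 3))
print("variance factor at day 25:", round(factor_25, 3))
```

The third term grows with the squared distance between the forecast day and the average observed day, which is why projecting further out widens the interval without changing the degrees of freedom.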
 
  • #8
If I have a sample where n=207, and it follows a normal distribution, how many data points should be more than 1.97 sd (.05) below the Mean in a two-tailed test?
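Approximately 95% should be within 2 sd. So, approximately 95% of 207 = 197. (The 95% is calculated as 1 - .05 = 0.95.) That leaves about 10 points more than 2 sd from the Mean in total; by symmetry, about 5 of them should be below the Mean.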
How many should be more than 2.6 sd (.01) below the Mean?
Left as an exercise.
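
For anyone who wants to check such counts numerically, here is a rough sketch using the standard normal distribution from Python's standard library. It assumes the data are exactly normal and treats the sample mean and sd as if they were the true values, which is only an approximation; the function name is hypothetical.

```python
from statistics import NormalDist

def expected_count_below(n, k_sd):
    """Expected number of points, out of n, lying more than k_sd standard deviations below the mean."""
    lower_tail = NormalDist().cdf(-k_sd)  # P(Z < -k_sd) for a standard normal Z
    return n * lower_tail

n = 207
below_197 = expected_count_below(n, 1.97)
below_26 = expected_count_below(n, 2.6)
print(f"expected below -1.97 sd: {below_197:.2f}")
print(f"expected below -2.6 sd:  {below_26:.2f}")
```

These are expected values, so an actual sample of 207 points can easily show a count one or two higher or lower.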
 
  • #9
kimberley said:
Finally, what is it about the formulas for Confidence Intervals and Prediction Intervals which would cause Sample A to have a narrower Confidence Interval at .05 (in percentage terms) than Sample B, but a wider Prediction Interval at .05 (in percentage terms) than Sample B? The samples are of equal size and both satisfy tests for normality. I don't understand what would cause Sample A to have a narrower confidence interval, but a wider prediction interval. The only thing which is different in the PI formula, as opposed to the CI formula, is the sqrt(1+1/n) instead of sqrt(1/n).
The typical CI is for [itex]\bar X[/itex] (or strictly speaking [itex]\bar X-\mu[/itex], if you stop after step 1 in my response to your question #1) but the typical PI is for [itex]X_p-\bar X[/itex]. [itex]X_p\text{ and }\bar X[/itex] are different across samples. So are the estimated standard deviations.

Without knowing any specifics, if sample 1 has CI "-b1 < . < b1" and PI "-B1 < . < B1," and sample 2 has CI "-b2 < . < b2" and PI "-B2 < . < B2" such that b1 < b2 then it should be the case that B1 < B2 also. You are right to be puzzled. Specific examples would help for a more precise response.
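
The reason can be written out in one line. This is a sketch under the assumption that, within each sample, the CI and the PI use the same n, the same critical value t, and the same estimated standard deviation s; b and B are the half-widths labeled above.

[tex]\frac{B}{b} = \frac{t\, s\sqrt{1+1/n}}{t\, s\sqrt{1/n}} = \sqrt{n+1}[/tex]

The ratio does not depend on s, so with equal n the two samples satisfy [itex]B_1 = \sqrt{n+1}\, b_1[/itex] and [itex]B_2 = \sqrt{n+1}\, b_2[/itex], and [itex]b_1 < b_2[/itex] forces [itex]B_1 < B_2[/itex]. Expressing each width as a percentage of its own sample's (positive) mean divides b and B for that sample by the same number, so the ordering is unchanged. If the reported intervals disagree with this, one of the assumptions above (equal n, same t, same s in both formulas) is probably not holding in the calculation.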
 
  • #10
I couldn't imagine a thread on Constitutional Theory being this pretentious.
I apologize for sounding pretentious. I freaked out when I was force fed 8 guidelines in unrelenting sequence just as I was composing a response to your post and immediately thought "Hm, the admin must be trying to tell me something in relation to this thread." It was at best a semi-rational (CYA) kind of a reaction and I hope that you will take it into account when you evaluate my response.
 

What is a prediction interval?

A prediction interval is a range of values that is likely to contain the true value of a future observation or outcome, given a set of independent variables and a specified level of confidence.

How is a prediction interval different from a confidence interval?

A confidence interval estimates a population parameter, such as the mean, while a prediction interval estimates where a single future observation will fall. A confidence interval reflects only the uncertainty in the estimate of the parameter, which shrinks as the sample size grows. A prediction interval adds the variability of an individual observation to that uncertainty, so it is always wider and does not shrink to zero as the sample size grows.

What is the purpose of a prediction interval?

The purpose of a prediction interval is to provide a range of values that can be used to make predictions about future outcomes or observations with a specified level of confidence. It allows for an understanding of the potential variability in future data points and can help with decision-making and planning.

How is a prediction interval calculated?

A prediction interval is calculated from the point estimate (the sample mean, or the fitted value in a regression), the estimated standard deviation (the sample standard deviation, or the standard error of the regression), the sample size, and the critical value from the t-distribution corresponding to the specified level of confidence. For a single sample, the interval is the mean plus or minus t*sd*sqrt(1+1/n).

What are the factors that can affect the width of a prediction interval?

The width of a prediction interval can be affected by several factors, including the level of confidence chosen, the variability of the data, the sample size, and, in a regression, the correlation between the independent variables. Generally, a lower level of confidence, lower variability, and a larger sample size will result in a narrower prediction interval. In a regression, the interval also widens as the point being predicted moves farther from the average of the observed independent variable.
