Prediction Intervals (Critical Levels)

kimberley · Aug 1, 2007

Hello Enuma et al.,

I have a few related questions regarding Prediction Intervals for outliers (individual data points). The basic formula that I use is: Mean +/- T(n-1 df)*sd*sqrt(1+1/n), where Mean is the sample mean, T is the critical value (the t-distribution counterpart of a Z-score) from the Student's t distribution table with n-1 degrees of freedom, and sd is the sample standard deviation. This formula can be found in many publications, and in Wikipedia's entry on "Prediction Intervals". The formula for Confidence Intervals is the same except for the final part, which is sqrt(1/n) as opposed to sqrt(1+1/n).

QUESTIONS:

1. What is the purpose of the last part of these formulas (i.e., sqrt(1/n) for CI and sqrt(1+1/n) for PI)? In other words, what do these things seek to adjust? For the CI, it would seem that it's bringing the Standard Error of the Mean into the formula (i.e., sd*sqrt(1/n)). But what about the PI--why is it 1+1/n? Is this bringing the Standard Error of the Estimate into the formula? I don't get it. Is it adjusting the sd forward or something?

2. If you wanted to use the Prediction Interval and Confidence Interval formulas and project out five days, would you change the degrees of freedom for the Z-score to -5, and what about the sqrt(1/n) and sqrt(1+1/n). Would you change them to 5/n and 5+5/n? As I state above, I just don't get it.

3. If I have a sample where n=207, and it follows a normal distribution, how many data points should be more than 1.97 sd (.05) below the Mean in a two-tailed test? How many should be more than 2.6 sd (.01) below the Mean?

4. Finally, what is it about the formulas for Confidence Intervals and Prediction Intervals which would cause Sample A to have a narrower Confidence Interval at .05 (in percentage terms) than Sample B, but a wider Prediction Interval at .05 (in percentage terms) than Sample B? The samples are of equal size and both satisfy tests for normality. I don't understand what would cause Sample A to have a narrower confidence interval, but a wider prediction interval. The only thing which is different in the PI formula, as opposed to the CI formula, is the sqrt(1+1/n) instead of sqrt(1/n).

Thank you.

EnumaElish · Aug 1, 2007

As per PF posting guidelines, any and all questions at the undergrad level must be posted under the relevant homework section. All the more so if they are related to undergrad coursework.

If this is the case for you, please request this thread to be moved to the relevant homework section. If this is not the case, please explain so I or others may proceed to answer.
__________________________
EDIT: I had misinterpreted the guidelines. All HW-style questions should be posted in the HW sections. I think your questions are fine here.

Barmecides · Aug 2, 2007

Hello,

you want to predict [tex]X_{n+1}[/tex] from [tex]X_{1}...X_{n}[/tex] without knowing [tex]\mu[/tex] and [tex]\sigma[/tex].
So, as you said,
1. the estimated variance of [tex]\bar{X}_{n}[/tex] is [tex]S_{n}^{2}/n[/tex]
2. the estimated variance of [tex]X_{i}[/tex] is [tex]S_{n}^{2}[/tex]
3. as a result the estimated variance of [tex]X_{n + 1} - \bar{X}_{n}[/tex] is [tex]S_{n}^2 (1 + 1/n)[/tex]
So you understand where your factor [tex]\sqrt{1 + 1/n}[/tex] comes from. This is related to the fact you do not know the mean [tex]\mu[/tex] of your distribution.
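
A quick simulation can make the three steps above concrete. This is only a sketch under stated assumptions: the data are taken to be independent normal draws, the values mu = 50, sigma = 2 and n = 10 are made up for illustration, and the function name `simulate_prediction_error_variance` is hypothetical. It checks that the variance of X_{n+1} minus the sample mean comes out near sigma^2 (1 + 1/n) rather than sigma^2.

```python
import random
import statistics

def simulate_prediction_error_variance(mu, sigma, n, trials, seed=0):
    """Estimate Var(X_{n+1} - mean(X_1..X_n)) by repeated sampling."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]  # X_1 .. X_n
        new_point = rng.gauss(mu, sigma)                   # X_{n+1}
        errors.append(new_point - statistics.fmean(sample))
    return statistics.variance(errors)

# Illustrative values only (assumptions, not taken from the thread).
mu, sigma, n = 50.0, 2.0, 10
simulated = simulate_prediction_error_variance(mu, sigma, n, trials=20000)
theoretical = sigma ** 2 * (1 + 1 / n)  # sigma^2 * (1 + 1/n)
print(f"simulated: {simulated:.3f}, theoretical: {theoretical:.3f}")
```

The two numbers should agree to within simulation noise; the extra 1/n is the part contributed by not knowing the mean.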

kimberley · Aug 2, 2007

Huh?

EnumaElish said:

As per PF posting guidelines, any and all questions at the undergrad level must be posted under the relevant homework section. All the more so if they are related to undergrad coursework.

If this is the case for you, please request this thread to be moved to the relevant homework section. If this is not the case, please explain so I or others may proceed to answer.

Huh? I haven't been in school in years. I have a B.A. in Political Science. I also have a J.D. My math background is limited to high school and one college course. Thus, what's homework got to do with it? Do you have to be a math or science major to post questions on this forum, or can you be someone who is attempting to learn matters outside of their discipline? I didn't see anything in the Rules pertaining to this issue, even after I was forced to read it 8 times, although it said it would only be "once." Perhaps a Thread entitled "Beginners" or something is in order, no? I couldn't imagine a thread on Constitutional Theory being this pretentious.

kimberley · Aug 2, 2007

Thank you

Barmecides said:

Hello,

you want to predict [tex]X_{n+1}[/tex] from [tex]X_{1}...X_{n}[/tex] without knowing [tex]\mu[/tex] and [tex]\sigma[/tex].
So, as you said,
1. the estimated variance of [tex]\bar{X}_{n}[/tex] is [tex]S_{n}^{2}/n[/tex]
2. the estimated variance of [tex]X_{i}[/tex] is [tex]S_{n}^{2}[/tex]
3. as a result the estimated variance of [tex]X_{n + 1} - \bar{X}_{n}[/tex] is [tex]S_{n}^2 (1 + 1/n)[/tex]
So you understand where your factor [tex]\sqrt{1 + 1/n}[/tex] comes from. This is related to the fact you do not know the mean [tex]\mu[/tex] of your distribution.

I appreciate your effort.

EnumaElish · Aug 2, 2007

What is the purpose of the last part of these formulas (i.e., sqrt(1/n) for CI and sqrt(1+1/n) for PI)? In other words, what do these things seek to adjust? For the CI, it would seem that it's bringing the Standard Error of the Mean into the formula (i.e., sd*sqrt(1/n)). But what about the PI--why is it 1+1/n? Is this bringing the Standard Error of the Estimate into the formula? I don't get it. Is it adjusting the sd forward or something?

From Wikipedia, [itex]CI = \bar X \pm t \times \sigma/\sqrt n[/itex] and [itex]PI = \bar X \pm t \times \sigma\sqrt {1+1/n}[/itex]. The usual notation is to associate the t with a degrees of freedom (df) and a significance level (probability) [itex]\alpha\text{, e.g. } t_\alpha(df)[/itex].

I will use the formulas I cited above.

1. If the random variables X_i are independent and each is distributed normal with [itex]\text{mean }\mu\text{ and variance }\sigma^2[/itex] then the sample average [itex]\bar X=\sum_{i=1}^n X_i/n[/itex] is distributed normal with [itex]\text{mean }\mu\text{ and variance }\sigma_{\bar X}^2 = \sigma^2/n[/itex].

2. If random variable Y is distributed normal with [itex]\text{mean }\mu\text{ and variance }\sigma_Y^2[/itex] then the "standard" random variable [itex]Z = (Y - \mu)/\sigma_Y[/itex] is distributed normal with mean 0 and variance 1.

It follows that [itex](\bar X-\mu)/\sigma_{\bar X}[/itex] is distributed normal with mean 0 and variance 1. If you start with a CI for the standardized variable [itex]-z_\alpha < (\bar X-\mu)/\sigma_{\bar X} < z_\alpha[/itex] then you can re-write it as [itex]\mu-z_\alpha \sigma_{\bar X}< \bar X < \mu+z_\alpha \sigma_{\bar X}[/itex] by first multiplying through with [itex]\sigma_{\bar X}[/itex] (step 1) then adding through [itex]\mu[/itex] (step 2). Now make the substitution [itex]\sigma_{\bar X} = \sigma/\sqrt n[/itex] and you have the CI stated as [itex]\mu-z_\alpha \sigma/\sqrt n< \bar X < \mu+z_\alpha \sigma/\sqrt n[/itex]. This is where the [itex]\sqrt n[/itex] comes from. (I.e. you are right as to where it comes from.)

When the variance parameter [itex]\sigma^2[/itex] is not known, it has to be estimated from data. When working with the estimated variance s², normal distribution is replaced with the t distribution, so that the standard variable [itex](\bar X-\mu)/s_{\bar X}[/itex] is distributed t with the corresponding degrees of freedom. But the mechanics of the CI remains the same: you start with [itex]-t_\alpha(df) < (\bar X-\mu)/s_{\bar X} < t_\alpha(df)[/itex] and end with [itex]\mu-t_\alpha(df)\times s/\sqrt n< \bar X < \mu+t_\alpha(df)\times s/\sqrt n[/itex]. See http://en.wikipedia.org/wiki/Confidence_interval#Theoretical_example

For a PI, the [itex]\mu[/itex] term (which is a "nature given" constant) in all of the above is replaced by X_p, the data point being predicted. Unlike the constant [itex]\mu[/itex], X_p has its own variance, which gets added to that of [itex]\bar X[/itex] in the above formulas. That is, the variance of [itex]X_p - \bar X[/itex] is no longer [itex]s^2_{\bar X}[/itex]; it is [itex]s^2+s^2_{\bar X} = s^2+s^2/n = s^2(1+1/n)[/itex]. The square root of this is the standard deviation, which is [itex]s\sqrt{1+1/n}[/itex].

So the PI, stated as an interval for [itex]X_p - \bar X[/itex], is [itex]-t_\alpha(df)\times s\sqrt{1+1/n}< X_p - \bar X < t_\alpha(df)\times s\sqrt{1+1/n}[/itex].
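
To see the two half-widths side by side, here is a minimal sketch. The data are made up, the helper name `ci_and_pi` is hypothetical, and the t critical value is typed in from a standard table (two-sided 95%, 9 degrees of freedom) rather than computed.

```python
import math
import statistics

def ci_and_pi(data, t_crit):
    """Return (mean, CI half-width, PI half-width) for one sample."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)                   # sample sd (n - 1 divisor)
    ci_half = t_crit * s / math.sqrt(n)          # t * s * sqrt(1/n)
    pi_half = t_crit * s * math.sqrt(1 + 1 / n)  # t * s * sqrt(1 + 1/n)
    return mean, ci_half, pi_half

# Made-up sample of n = 10 points (an assumption for illustration).
data = [9.8, 10.4, 10.1, 9.6, 10.9, 10.2, 9.9, 10.5, 10.0, 10.6]
T_CRIT_95_DF9 = 2.262  # table value: two-sided 95%, df = n - 1 = 9

mean, ci_half, pi_half = ci_and_pi(data, T_CRIT_95_DF9)
print(f"CI: {mean:.2f} +/- {ci_half:.3f}")
print(f"PI: {mean:.2f} +/- {pi_half:.3f}")
```

For any sample the PI half-width is sqrt(n + 1) times the CI half-width, since sqrt(1 + 1/n) divided by sqrt(1/n) equals sqrt(n + 1).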

EnumaElish · Aug 2, 2007

If you wanted to use the Prediction Interval and Confidence Interval formulas and project out five days, would you change the degrees of freedom for the Z-score to -5, and what about the sqrt(1/n) and sqrt(1+1/n). Would you change them to 5/n and 5+5/n?

No, you should not replace anything with "5". Suppose you estimate the equation Y(X) = 200 - 3X, where X is the number of days since the accident and Y is the dollar damage amount for the last day (e.g. lost wages for Thursday, August 2nd). Suppose your data span days 1 through 10 after the accident, but you'd like to know the damages on day 15. So you forecast: Y_p(15) = 200 - 3*15 = 200 - 45 = 155. The forecast error is FE = Y_p(15) - Y(15) where Y(15) is the unknown true value of the damages on day 15. The variance of the FE is [itex]\sigma^2_F=\sigma^2\left(1+1/n+(15-5.5)^2/\sum_{i=1}^{10}(X_i - 5.5)^2\right)[/itex] where 5.5 is the average of days 1-10 and X₁ = 1, X₂ = 2, ..., X₁₀ = 10. As you can see, the forecast variance, and hence the forecast standard deviation, would be larger for X_p = 25 than for X_p = 15, because the forecast point is farther from the average of the observed days.
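
The multiplier on the variance in that formula can be worked out directly. A small sketch, assuming the same days 1 through 10 as in the example; the function name `forecast_variance_factor` is hypothetical.

```python
def forecast_variance_factor(x_values, x_new):
    """Return 1 + 1/n + (x_new - x_bar)^2 / sum((x_i - x_bar)^2)."""
    n = len(x_values)
    x_bar = sum(x_values) / n
    sxx = sum((x - x_bar) ** 2 for x in x_values)
    return 1 + 1 / n + (x_new - x_bar) ** 2 / sxx

days = list(range(1, 11))  # observed days 1..10, average 5.5
for day in (5.5, 15, 25):
    factor = forecast_variance_factor(days, day)
    print(f"day {day}: forecast variance = {factor:.3f} * sigma^2")
```

At the average day (5.5) the factor reduces to 1 + 1/n, the same term as in the simple PI formula; it grows as the forecast day moves away from the observed range.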

I hope this is a relevant example for your area of law.

EnumaElish · Aug 2, 2007

If I have a sample where n=207, and it follows a normal distribution, how many data points should be more than 1.97 sd (.05) below the Mean in a two-tailed test?

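Approximately 95% should be within 2 sd of the mean, so approximately 95% of 207 ≈ 197 points. (The 95% is calculated as 1 - .05 = 0.95.) The remaining 5%, about 10 points, should fall outside that range, split evenly between the two tails, so about 2.5% of 207 ≈ 5 points should be more than 1.97 sd below the mean.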
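
The expected counts can be checked with a few lines. This is a sketch that assumes an exactly normal population and uses the standard normal CDF (not the t distribution); the function name `expected_tail_counts` is hypothetical.

```python
from statistics import NormalDist

def expected_tail_counts(n, k):
    """Expected number of points below, above and within k sd of the mean."""
    lower_tail = NormalDist().cdf(-k)  # P(Z < -k) for a standard normal
    return {
        "below": n * lower_tail,
        "above": n * lower_tail,       # by symmetry
        "within": n * (1 - 2 * lower_tail),
    }

counts = expected_tail_counts(207, 1.97)
for region, expected in counts.items():
    print(f"{region}: {expected:.1f}")
```

These are expected values only; an actual sample of 207 points will scatter around them.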

How many should be more than 2.6 sd (.01) below the Mean?

Left as an exercise.

EnumaElish · Aug 2, 2007

kimberley said:

Finally, what is it about the formulas for Confidence Intervals and Prediction Intervals which would cause Sample A to have a narrower Confidence Interval at .05 (in percentage terms) than Sample B, but a wider Prediction Interval at .05 (in percentage terms) than Sample B? The samples are of equal size and both satisfy tests for normality. I don't understand what would cause Sample A to have a narrower confidence interval, but a wider prediction interval. The only thing which is different in the PI formula, as opposed to the CI formula, is the sqrt(1+1/n) instead of sqrt(1/n).

The typical CI is for [itex]\bar X[/itex] (or strictly speaking [itex]\bar X-\mu[/itex], if you stop after step 1 in my response to your question #1) but the typical PI is for [itex]X_p-\bar X[/itex]. [itex]X_p\text{ and }\bar X[/itex] are different across samples. So are the estimated standard deviations.

Without knowing any specifics, if sample 1 has CI "-b1 < . < b1" and PI "-B1 < . < B1," and sample 2 has CI "-b2 < . < b2" and PI "-B2 < . < B2" such that b1 < b2 then it should be the case that B1 < B2 also. You are right to be puzzled. Specific examples would help for a more precise response.
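
A minimal sketch of why the ordering should agree when both intervals are built from the textbook formulas. The two samples are made up, equal in size, and the names `interval_widths`, `sample_a` and `sample_b` are hypothetical; the t critical value is a table value for 9 degrees of freedom.

```python
import math
import statistics

def interval_widths(data, t_crit):
    """CI and PI half-widths, in absolute terms and as a percent of the mean."""
    n = len(data)
    mean = statistics.fmean(data)
    s = statistics.stdev(data)
    ci = t_crit * s / math.sqrt(n)
    pi = t_crit * s * math.sqrt(1 + 1 / n)
    return {"ci": ci, "pi": pi,
            "ci_pct": 100 * ci / mean, "pi_pct": 100 * pi / mean}

# Made-up samples of equal size (assumptions for illustration only).
sample_a = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]
sample_b = [50, 60, 40, 55, 45, 52, 48, 58, 42, 50]
T_CRIT_95_DF9 = 2.262  # table value: two-sided 95%, df = 9

a = interval_widths(sample_a, T_CRIT_95_DF9)
b = interval_widths(sample_b, T_CRIT_95_DF9)
print("A:", {k: round(v, 2) for k, v in a.items()})
print("B:", {k: round(v, 2) for k, v in b.items()})
```

Within each sample the PI is the CI multiplied by the same factor sqrt(n + 1), so with equal n the sample with the narrower CI also has the narrower PI, in absolute or percentage terms. A reversal would suggest the two intervals were computed from different inputs or different formulas.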

EnumaElish · Aug 3, 2007

I couldn't imagine a thread on Constitutional Theory being this pretentious.

I apologize for sounding pretentious. I freaked out when I was force fed 8 guidelines in unrelenting sequence just as I was composing a response to your post and immediately thought "Hm, the admin must be trying to tell me something in relation to this thread." It was at best a semi-rational (CYA) kind of a reaction and I hope that you will take it into account when you evaluate my response.
