Is this a good forecasting model?

musicgold · Dec 30, 2013

Hi,

Please see the attached Excel file.

I have a sample of 70 data pairs. The correlation between X and Y is -0.68. The OLS regression coefficient is statistically significant, as shown in the file. However, with an R^2 of 0.40, I am not sure whether my model is good enough to forecast Y.
Can you please take a look?

Thanks.

Office_Shredder · Dec 30, 2013

The phrase "good enough to forecast Y" is a bit loaded. How well do you want to forecast Y? If you simply want to know whether this is sufficient to imply that X and Y are not uncorrelated then you should do a t-test:

http://en.wikipedia.org/wiki/Student's_t-test#Slope_of_a_regression_line
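
For reference, here is a minimal Python sketch of that test, using the equivalent t-statistic for a sample correlation coefficient, t = r * sqrt(n - 2) / sqrt(1 - r^2). In simple linear regression this equals the t-statistic of the slope. The function name is made up for illustration, and the inputs are the rounded values quoted in this thread (r = -0.68, n = 70), so the result need not match what the Excel file reports.

```python
import math


def correlation_t_statistic(r, n):
    """t-statistic for testing H0: true correlation = 0 (n - 2 degrees of freedom)."""
    if n < 3 or not -1.0 < r < 1.0:
        raise ValueError("need n >= 3 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Rounded values quoted in the thread, not the raw data.
t = correlation_t_statistic(r=-0.68, n=70)
print(f"t = {t:.2f} with {70 - 2} degrees of freedom")
```

With about 68 degrees of freedom, |t| above roughly 2 is significant at the two-sided 5% level. That only speaks to whether the slope differs from zero, not to how tight the predictions are.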

musicgold · Dec 30, 2013

Thanks.

Office_Shredder said:

If you simply want to know whether this is sufficient to imply that X and Y are not uncorrelated then you should do a t-test:

As shown in the Excel file, the regression analysis gets a very low p-value for the coefficient, so I know they are related (or not independent).

Separately, I also calculated the t-statistic of the correlation coefficient, which was 6.7, i.e. there is a very low chance that the sample correlation value occurred randomly. So I am quite confident that there is a statistically significant relationship.

What is making me nervous is the relatively low value of R^2 of the regression line. I am not sure how confident I should be about the predictions by this model.

mfb · Dec 30, 2013

musicgold said:

What is making me nervous is the relatively low value of R^2 of the regression line. I am not sure how confident I should be about the predictions by this model.

As you can see in the graph, knowing the x-value won't give a reliable prediction for y. It is better than not knowing the x-value (that's what the non-zero correlation tells you), but the spread of the datapoints is quite large.
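
To put a rough number on that: for a least-squares line, the residual sum of squares is (1 - R^2) times the total sum of squares of y, so the typical prediction error is about sqrt(1 - R^2) times the standard deviation of y. A quick Python check with the R^2 = 0.40 quoted above (the function name is just illustrative, and small degrees-of-freedom corrections are ignored):

```python
import math


def residual_sd_fraction(r_squared):
    """Residual spread as a fraction of the overall spread of y."""
    if not 0.0 <= r_squared <= 1.0:
        raise ValueError("R^2 must lie between 0 and 1")
    return math.sqrt(1.0 - r_squared)


fraction = residual_sd_fraction(0.40)
print(f"residual SD is about {fraction:.2f} x SD of y")  # about 0.77
```

So knowing x shrinks the typical error by only about 23% compared with always guessing the mean of y.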

Number Nine · Dec 30, 2013

If you want to determine the predictive value of your model, set aside a portion of your data to use for validation (or collect new measurements and use them for validation). I agree with mfb that the model probably won't have a great deal of predictive value.
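
A minimal sketch of that hold-out idea in plain Python, assuming a random split and a simple baseline that ignores x and always predicts the training mean of y. The function names, the seed, and the data are all placeholders: the data here is synthetic, not the data from the Excel file.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def holdout_validation(xs, ys, n_test, seed=0):
    """Fit on a random subset, then score on the n_test points left out."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    test, train = indices[:n_test], indices[n_test:]
    slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
    test_y = [ys[i] for i in test]
    model_pred = [intercept + slope * xs[i] for i in test]
    # Baseline: ignore x and always predict the mean of the training y values.
    train_mean = sum(ys[i] for i in train) / len(train)
    baseline_pred = [train_mean] * len(test)
    return rmse(test_y, model_pred), rmse(test_y, baseline_pred)


# Synthetic placeholder data (NOT the data from the Excel file).
rng = random.Random(1)
xs = [rng.uniform(0, 10) for _ in range(70)]
ys = [5.0 - 1.0 * x + rng.gauss(0, 3) for x in xs]

model_rmse, baseline_rmse = holdout_validation(xs, ys, n_test=14)
print(f"model RMSE: {model_rmse:.2f}, baseline RMSE: {baseline_rmse:.2f}")
```

If the model's error on the held-out points is not clearly smaller than the baseline's, x adds little predictive value. With only 70 points a single split is noisy, so repeating it with different seeds gives a steadier picture.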

musicgold said:

the regression analysis gets a very low p-value for the coefficient, so I know they are related (or not independent).

Separately, I also calculated the t-statistic of the correlation coefficient, which was 6.7, i.e. there is a very low chance that the sample correlation value occurred randomly.

You should be aware that the p- and t-values don't really allow you to say any of those things. Null hypothesis testing (especially the p-value) is very commonly misinterpreted; the Wikipedia article on p-values contains a list of common misconceptions that you may want to read.

musicgold · Dec 30, 2013

mfb said:

the spread of the datapoints is quite large.

What do you mean by this?

mfb · Dec 30, 2013

musicgold said:

What do you mean by this?

See the highlighted areas in the attached image: within each area the x-values are very similar, but y varies widely. Your prediction can be something like "it is probably within that y-range", but not better than that.

[Attached image: plot with the highlighted areas]

musicgold · Dec 30, 2013

Got it. Thanks.

musicgold · Jan 2, 2014

Number Nine said:

If you want to determine the predictive value of your model, set aside a portion of your data to use for validation (or collect new measurements and use them for validation).

So should I go back, take only 60 of the 74 points, rerun the regression analysis, and see how well the new model predicts the Y values for the remaining 14 X-values?

If yes, how should I select the 60 points? Randomly?

Thanks.
