MHB Testing a Linear Stepwise Regression Model - Need Advice!

davemk · Apr 22, 2012

Hi folks.

Just looking for some input please.

I have a dataset containing interval data (one dependent and 6 independent variables) and have taken a random 90% sample (approx. 300 observations). I've performed a stepwise linear regression on that 90% in order to obtain a model that predicts the dependent variable from a number of input variables. I'm confident that I've done this OK.

The issue comes with testing the model. I'm sure that this is probably a simple step but, for some reason, I'm really struggling with it and would be grateful for some advice.

In order to test the model, I'm using the 10% of the dataset that was not used in the linear regression. I've input the predictor variables into the model, which has given me an expected value for each observation. I now want to compare these to the actual values. I was originally going to use chi-square but that seems to be probability based and I'm not sure it's appropriate.

I've been told Spearman's rho would probably be most appropriate, although I'm still not 100% sure that's right. Essentially, I would only be testing whether my predicted values = actual values.

All help appreciated. Thanks in advance.

CaptainBlack · Apr 23, 2012

To some extent this depends on how clever you want to be. What you want to do is test that the residuals (actual minus predicted) for the held-back sample have zero mean and that they are homoscedastic. With about 30 points you may have difficulty doing much more.

For the first of these I would just test for zero mean using the usual methods.
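
As a rough illustration of the zero-mean check, here is a minimal Python sketch of a one-sample t statistic on the held-back residuals. The function name and the example numbers are made up for illustration and are not from the thread; it assumes residuals are defined as actual minus predicted.

```python
import math
import statistics


def zero_mean_t_statistic(actual, predicted):
    """Return (mean residual, t statistic, degrees of freedom) for H0: mean residual = 0."""
    residuals = [a - p for a, p in zip(actual, predicted)]
    n = len(residuals)
    mean_resid = statistics.mean(residuals)
    # Sample standard deviation (n - 1 in the denominator)
    sd_resid = statistics.stdev(residuals)
    t_stat = mean_resid / (sd_resid / math.sqrt(n))
    return mean_resid, t_stat, n - 1


# Hypothetical held-back values, invented purely for illustration
actual = [10.2, 11.5, 9.8, 12.1, 10.9, 11.3]
predicted = [10.0, 11.9, 9.5, 12.4, 10.6, 11.0]

mean_resid, t_stat, df = zero_mean_t_statistic(actual, predicted)
print(f"mean residual = {mean_resid:.3f}, t = {t_stat:.3f}, df = {df}")
```

The t statistic would then be compared with a two-sided critical value from the t distribution with n - 1 degrees of freedom (about 2.05 at the 5% level for 29 degrees of freedom). A large |t| suggests the model is systematically over- or under-predicting on the held-back data.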

For the latter I would plot the residuals against the input variables and eyeball the data (at least to start with), but there are tests, see http://en.wikipedia.org/wiki/Homoscedasticity for a pointer.
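
One of the tests reachable from that page is the Breusch-Pagan test. The sketch below is a simplified version under stated assumptions: it uses the studentized (Koenker) form, LM = n * R^2, with a single predictor, and it is applied to held-back residuals rather than to the residuals of the fitted regression, so treat it as a heuristic. The function names and data are hypothetical.

```python
import statistics


def pearson_r(x, y):
    """Plain Pearson correlation; fails with ZeroDivisionError if either input is constant."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def bp_lm_statistic(residuals, predictor):
    """LM = n * R^2 from regressing squared residuals on one predictor plus intercept.

    With a single regressor, R^2 equals the squared Pearson correlation.
    """
    squared = [e * e for e in residuals]
    r = pearson_r(squared, predictor)
    return len(residuals) * r * r


# Hypothetical residuals and one input variable, invented for illustration
residuals = [0.2, -0.4, 0.3, -0.3, 0.3, 0.3, -0.9, 1.1]
predictor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

lm = bp_lm_statistic(residuals, predictor)
print(f"LM statistic = {lm:.3f}")
```

Under the null hypothesis of constant variance, LM is approximately chi-square with 1 degree of freedom, so values above about 3.84 point to heteroscedasticity at the 5% level. With six input variables the auxiliary regression would include all of them, and the degrees of freedom would rise accordingly.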

You might also want to test the residuals for normality.
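
For the normality check, one option is the Jarque-Bera statistic, which is built from the sample skewness and kurtosis of the residuals. The sketch below is an illustration only: the choice of Jarque-Bera is an assumption rather than something named in the thread, the function name and data are hypothetical, and the chi-square approximation is known to be rough at around 30 points, where a Shapiro-Wilk test is often preferred.

```python
import statistics


def jarque_bera(residuals):
    """Return (skewness, kurtosis, JB statistic) using population (1/n) moments."""
    n = len(residuals)
    mean = statistics.mean(residuals)
    m2 = sum((e - mean) ** 2 for e in residuals) / n
    m3 = sum((e - mean) ** 3 for e in residuals) / n
    m4 = sum((e - mean) ** 4 for e in residuals) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    # A normal distribution has skewness 0 and kurtosis 3
    jb = n / 6 * (skew ** 2 + (kurt - 3) ** 2 / 4)
    return skew, kurt, jb


# Hypothetical held-back residuals, invented for illustration
residuals = [0.2, -0.4, 0.3, -0.3, 0.3, 0.3, -0.9, 1.1, -0.2, 0.1]

skew, kurt, jb = jarque_bera(residuals)
print(f"skewness = {skew:.3f}, kurtosis = {kurt:.3f}, JB = {jb:.3f}")
```

Under normality JB is approximately chi-square with 2 degrees of freedom, so values above about 5.99 would cast doubt on normality at the 5% level.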

CB

davemk · Apr 23, 2012

That's a great help, thank you very much.

I've already plotted the residuals for observed vs expected and histograms for normality, so I'll have a look into the tests within the link you posted (I must admit, I've never heard of those tests so I'll have a read up on them).

Thanks again. I'll update the thread with my progress asap.

davemk · Apr 28, 2012

CaptainBlack said:

To some extent this depends on how clever you want to be.

With about 30 points you may have difficulty doing much more.

Hello again. If I were to get more data (say 70 observations) in order to test the model, is there a specific test that I could use? At the moment, I've performed a residual analysis and I'm now looking at performing a Wilcoxon test or a Spearman test.

Any thoughts on this process, or alternatives? The procedures in the link above don't appear to be available in SPSS.
