Linear regression on data collection error

Travis T · Nov 2, 2016

Hi

I've collected several sets of data and found that the linear regression fit (R^2) is significantly different in 2 particular sets.
Does that indicate that those 2 sets of data are not valid, possibly because of data collection error?

For example, 20 sets of data have R^2 values above 0.900 (0.994, 0.983, 0.932, ...), while the other 2 sets have R^2 values of 0.720 and 0.810, respectively.

mfb · Nov 2, 2016

It depends on the uncertainty on R, which depends on the size of the datasets and the distribution of the data.
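To illustrate how the uncertainty depends on the dataset size, here is a minimal sketch using the Fisher z-transformation to get an approximate 95% interval for a correlation coefficient. It assumes simple linear regression with one predictor (so R^2 is the square of the Pearson r) and roughly bivariate-normal data; the function name and the sample sizes of 10 and 100 are made up for the example, since the thread does not say how many points each set has.

```python
import math

def r_confidence_interval(r, n, z_crit=1.96):
    """Approximate CI for a Pearson correlation r estimated from n points.

    Uses the Fisher z-transformation; z_crit=1.96 gives roughly 95%.
    Assumes approximately bivariate-normal data (an assumption, not
    something stated in the thread).
    """
    if n <= 3:
        raise ValueError("need more than 3 points")
    z = math.atanh(r)                 # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)       # approximate standard error of z
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical sample sizes: the same R^2 = 0.72 at n = 10 and n = 100.
r = math.sqrt(0.72)
for n in (10, 100):
    lo, hi = r_confidence_interval(r, n)
    # Square the bounds to express the interval in terms of R^2.
    print(f"n={n}: R^2 roughly between {lo**2:.3f} and {hi**2:.3f}")
```

With few points the interval is wide, so a value of 0.72 may not be incompatible with the 0.9+ seen in the other sets; with many points it is much narrower.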

FactChecker · Nov 2, 2016

Are the regression equations significantly different or just a smaller R²? If 2 out of 20 are weak results, you should only be suspicious if their estimates are very different. The unusually low R² might mean that those sets have some outliers. You may want to look at the data and see if some points look unreasonable. If there are outliers pulling the regression equation out of line with the others, I would see what happens if the outliers are thrown out.
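A rough sketch of that check, in plain Python: fit the line, drop the point with the largest residual, and refit to see how much the slope, intercept, and R² move. The data below are invented for illustration (y = 2x + 1 with small noise and one corrupted point), and the helper names are hypothetical. Dropping the single largest residual is only a quick diagnostic, not a formal outlier test.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    b = sxy / sxx                      # slope
    a = my - b * mx                    # intercept
    r2 = (sxy * sxy) / (sxx * syy)     # R^2 for a one-predictor fit
    return a, b, r2

def drop_largest_residual(xs, ys):
    """Return copies of xs, ys without the point farthest from the fitted line."""
    a, b, _ = fit_line(xs, ys)
    residuals = [abs(y - (a + b * x)) for x, y in zip(xs, ys)]
    worst = residuals.index(max(residuals))
    return xs[:worst] + xs[worst + 1:], ys[:worst] + ys[worst + 1:], worst

# Made-up data: roughly y = 2x + 1, with one bad reading at index 5.
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1.1, 2.9, 5.2, 6.8, 9.1, 20.0, 13.1, 14.8, 17.2, 18.9]

a_all, b_all, r2_all = fit_line(xs, ys)
xs_clean, ys_clean, dropped = drop_largest_residual(xs, ys)
a_clean, b_clean, r2_clean = fit_line(xs_clean, ys_clean)

print(f"all points:    slope={b_all:.3f} intercept={a_all:.3f} R^2={r2_all:.3f}")
print(f"dropped idx {dropped}: slope={b_clean:.3f} intercept={a_clean:.3f} R^2={r2_clean:.3f}")
```

If the refitted equation lands close to the ones from the other sets, the low R² was probably driven by a few bad points rather than by a different underlying relationship.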

