Effect of (variance of explanatory variables) on (Regression inferences) in SLR

  • Context: Graduate
  • Thread starter: ych22
  • Tags: Variables
SUMMARY

The discussion centers on how the variance of the independent variable affects regression inferences in simple linear regression (SLR). The thread starter argues that the higher-variance design, X = 1, 4, 10, 11, 14, is better for determining whether a regression relation exists, because it leads to a higher expected F* statistic in ANOVA. The thread starter is unsure which set of X's is better for estimating the mean response of Y at X = 8. A reply notes that the variance of the independent variable depends on that variable alone, whereas covariance depends jointly on two variables.

PREREQUISITES
  • Understanding of simple linear regression (SLR)
  • Familiarity with ANOVA and F-statistics
  • Knowledge of variance and covariance concepts
  • Basic statistical analysis skills
NEXT STEPS
  • Study the role of variance in regression analysis
  • Learn about ANOVA and how to calculate F-statistics
  • Explore the implications of covariance in regression models
  • Investigate methods for estimating mean responses in regression
USEFUL FOR

Statisticians, data analysts, and researchers involved in regression modeling and statistical inference will benefit from this discussion.

ych22
Let me assume that we are performing an experiment to build a regression model with independent variable X and dependent variable Y.

Then suppose we have a choice between X = 1, 4, 10, 11, 14 and X = 6, 7, 8, 9, 10. The mean of both sets of X's is 8, but the variance of the first set is much higher than that of the second.

Which set of X's is better for:
A) Determining whether a regression relation exists.
B) Estimating the mean response of Y at X=8.

I think that the former set is better for determining whether a regression relation exists, because the F* statistic in ANOVA is given by MSR/MSE, where [tex]E[MSE] = \sigma^2[/tex] while [tex]E[MSR] = \sigma^2 + \beta_1^2 \sum (X_i - \overline{X})^2[/tex]. When there is no relation between X and Y (that is, [tex]\beta_1 = 0[/tex]), the choice of X's obviously does not matter. However, when a relation exists, E[MSR] is higher with higher variance of the X's, so F* is expected to be higher and we are more likely to conclude that the relation exists.
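To make the comparison concrete, here is a minimal Python sketch that evaluates the sum of squared deviations of X for the two designs and plugs it into the E[MSR] expression above. The function names and the values chosen for sigma^2 and beta_1 are assumptions for illustration only; they are not given in the thread.

```python
# The two candidate designs from the post.
design_wide = [1, 4, 10, 11, 14]
design_narrow = [6, 7, 8, 9, 10]


def sum_sq_dev(xs):
    """Return sum of (X_i - Xbar)^2 for a list of X values."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def expected_msr(sigma2, beta1, sxx):
    """E[MSR] = sigma^2 + beta_1^2 * sum (X_i - Xbar)^2 in SLR."""
    return sigma2 + beta1 ** 2 * sxx


sxx_wide = sum_sq_dev(design_wide)      # 49 + 16 + 4 + 9 + 36 = 114
sxx_narrow = sum_sq_dev(design_narrow)  # 4 + 1 + 0 + 1 + 4 = 10

# Hypothetical parameter values, chosen only to illustrate the formula.
sigma2 = 1.0
beta1 = 0.5

print("Sxx wide:", sxx_wide, " E[MSR]:", expected_msr(sigma2, beta1, sxx_wide))
print("Sxx narrow:", sxx_narrow, " E[MSR]:", expected_msr(sigma2, beta1, sxx_narrow))
```

Since E[MSE] is sigma^2 for both designs, the design with the larger sum of squared deviations has the larger expected numerator of F* whenever beta_1 is nonzero, which is the argument made in the post.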

However, I am not too sure which set of X's is better for estimating the mean response of Y at X=8, although the first set should be better for estimating the variability in the response of Y at X=8...
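One way to explore question B is through the standard SLR result that the variance of the estimated mean response at X_h is sigma^2 [1/n + (X_h - Xbar)^2 / sum (X_i - Xbar)^2]. This formula is not stated in the thread; it is brought in here as an assumption, under the usual constant-variance linear model. The sketch below computes the bracketed factor for both designs, and the helper name is hypothetical.

```python
def mean_response_variance_factor(xs, x_h):
    """Return 1/n + (x_h - Xbar)^2 / Sxx, the multiplier of sigma^2
    in the variance of the estimated mean response at x_h (standard SLR result)."""
    n = len(xs)
    mean = sum(xs) / n
    sxx = sum((x - mean) ** 2 for x in xs)
    return 1 / n + (x_h - mean) ** 2 / sxx


design_wide = [1, 4, 10, 11, 14]
design_narrow = [6, 7, 8, 9, 10]

# At X_h = 8, which equals Xbar for both designs, the second term vanishes.
factor_wide_at_mean = mean_response_variance_factor(design_wide, 8)
factor_narrow_at_mean = mean_response_variance_factor(design_narrow, 8)

# Away from the mean (X_h = 10 is an arbitrary illustrative choice),
# the spread of the X's matters.
factor_wide_off_mean = mean_response_variance_factor(design_wide, 10)
factor_narrow_off_mean = mean_response_variance_factor(design_narrow, 10)

print("At X_h = 8: ", factor_wide_at_mean, factor_narrow_at_mean)
print("At X_h = 10:", factor_wide_off_mean, factor_narrow_off_mean)
```

Under these assumptions the factor is 1/n = 0.2 for both designs at X_h = 8, which suggests neither set has an advantage for the mean response exactly at the mean of the X's. The wider design gives a smaller factor at other values of X_h.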
 
Variance of the independent variable has nothing to do with its covariance with any other variable. Variance depends on one variable and covariance depends jointly on two variables.
 
