Testing regression model with F-test

SUMMARY

The discussion focuses on the application of the F-test in evaluating the significance of a regression model fitted to 77 data points. The explained sum of squares (ESS) and residual sum of squares (RSS) are calculated to derive the F-test value, which is given by F = (ESS / DF_ESS) / (RSS / DF_RSS). For the single-predictor model discussed, the degrees of freedom for ESS is 1, while for RSS it is the number of data points minus the number of estimated model parameters. The original poster seeks clarification on how the number of data points affects the F-test value and on the conceptual foundation of the F-test in assessing model parameters.

PREREQUISITES
  • Understanding of regression analysis and model fitting
  • Familiarity with statistical concepts such as explained sum of squares (ESS) and residual sum of squares (RSS)
  • Knowledge of degrees of freedom in statistical testing
  • Basic proficiency in using statistical software for regression analysis
NEXT STEPS
  • Study the derivation and application of the F-test in regression analysis
  • Learn about the implications of degrees of freedom in statistical models
  • Explore the relationship between sample size and statistical significance in regression
  • Investigate alternative methods for assessing model fit, such as R-squared and adjusted R-squared
USEFUL FOR

Statisticians, data analysts, and researchers involved in regression modeling and hypothesis testing will benefit from this discussion, particularly those seeking to deepen their understanding of the F-test and its application in evaluating model significance.

Phoeniyx
Hey guys. I have some trouble understanding how the F-test is used for testing the viability of a regression model. Before I delve into the background/question, just wanted to post a link that discusses the topic briefly:
http://www.stat.yale.edu/Courses/1997-98/101/anovareg.htm

So, coming back to the question, let's say we have 77 data points (x_i, y_i) and we fit them with a regression model of the form:
\hat{y}_{i} = A + Bx_{i}

In the Yale example, A and B are calculated based on the 77 data points.

To check if the model is significant, we calculate the "explained sum of squares" (ESS), which is the sum of the squared differences between the model estimates and the mean of the observed y values: ESS = \Sigma{(\hat{y}_{i} - \bar{y})^{2}}

Then we calculate the "residual sum of squares" (RSS), which is the sum of the squared differences between the actual data points and the model estimates: RSS = \Sigma{({y}_{i} - \hat{y}_{i})^{2}}

The degrees of freedom for RSS is: # of data points - estimated model parameters from data = 77 - 2 = 75. Perfectly fine with this.

BUT, apparently, the degrees of freedom for ESS is "1"... I do not get this. More on why I am confused later in the questions section.

The F-test value = \frac{ESS / DF_{ESS}}{RSS / DF_{RSS}}, where DF = degrees of freedom

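In the Yale link above, this is reported as 8654.7/84.6 = 102.35 (dividing the rounded figures shown here gives about 102.30, so the reported value was presumably computed before rounding).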
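The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`f_statistic`, `fit_line`, `sums_of_squares`) are made up for this sketch, the toy data are invented and are not the Yale data set, and the RSS for the quoted example is back-computed from the rounded mean square 84.6, so it is approximate.

```python
def f_statistic(ess, rss, df_ess, df_rss):
    """Ratio of the explained mean square to the residual mean square."""
    return (ess / df_ess) / (rss / df_rss)


def fit_line(xs, ys):
    """Ordinary least squares for y = A + B*x; returns (A, B)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    return a, b


def sums_of_squares(xs, ys):
    """Return (ESS, RSS) for the least-squares line through the data."""
    a, b = fit_line(xs, ys)
    y_bar = sum(ys) / len(ys)
    fitted = [a + b * x for x in xs]
    ess = sum((f - y_bar) ** 2 for f in fitted)
    rss = sum((y - f) ** 2 for y, f in zip(ys, fitted))
    return ess, rss


# Figures quoted in the post: ESS = 8654.7, RSS / DF_RSS = 84.6, DF_RSS = 75.
# RSS is back-computed from the rounded mean square, so it is approximate.
quoted_f = f_statistic(ess=8654.7, rss=84.6 * 75, df_ess=1, df_rss=75)
print(round(quoted_f, 2))

# Made-up toy data (NOT the Yale data set), just to exercise the functions.
toy_x = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_y = [2.1, 3.9, 6.2, 7.8, 10.1]
toy_ess, toy_rss = sums_of_squares(toy_x, toy_y)
toy_f = f_statistic(toy_ess, toy_rss, df_ess=1, df_rss=len(toy_x) - 2)
print(toy_ess, toy_rss, toy_f)
```

Note that `sums_of_squares` computes ESS and RSS from the same fitted values at the same observed x values. That is what makes ESS + RSS equal the total sum of squares around the mean of y.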

So, I have two questions:
1) Since the degrees of freedom of ESS is always "1", it seems that if I had 85 data points (instead of 77), the F-test value would be even larger, because ESS is not averaged: it is simply the sum of squares between the model values and the mean. For example, the 8654.7 above could become 9500, while the 84.6 (RSS/DF) above could go higher or lower but would likely still be around 85 (a larger sum of squares divided by a larger DF). Wouldn't this imply that the significance is a function of the # of data points the model is tested on (as opposed to the real-world observed data points)? Related to this question, must the # of real-world observed (x, y) points used for the RSS calculation (e.g. 77) be the same as the # of points used for the ESS calculation? E.g. can ESS be based on 95 model trials while RSS is based on only the 77 real-world values?

2) I still don't understand (conceptually) why the F-test works to check whether the model parameters A and B are zero or not. Conceptually, how does this make sense?
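Question 1 can be explored numerically. The sketch below is a small simulation under assumptions that are not from the thread: x values drawn uniformly from 0 to 10, normally distributed noise with standard deviation 5, an intercept of 1, and the hypothetical helper names `f_for_sample` and `mean_f`. It compares the average F value for 77 and 770 points, once when y does not depend on x at all and once when the true slope is 1.

```python
import random


def f_for_sample(xs, ys):
    """F statistic for the fit y = A + B*x, with DF_ESS = 1 and DF_RSS = n - 2."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    ess = sum((a + b * x - y_bar) ** 2 for x in xs)
    rss = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return (ess / 1) / (rss / (n - 2))


def mean_f(n, true_slope, reps, rng):
    """Average F over `reps` simulated data sets of size n."""
    total = 0.0
    for _ in range(reps):
        # Assumed data-generating process (illustrative, not from the thread).
        xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
        ys = [1.0 + true_slope * x + rng.gauss(0.0, 5.0) for x in xs]
        total += f_for_sample(xs, ys)
    return total / reps


rng = random.Random(0)  # fixed seed so the run is repeatable
null_77 = mean_f(77, true_slope=0.0, reps=300, rng=rng)
null_770 = mean_f(770, true_slope=0.0, reps=300, rng=rng)
real_77 = mean_f(77, true_slope=1.0, reps=300, rng=rng)
real_770 = mean_f(770, true_slope=1.0, reps=300, rng=rng)
print(null_77, null_770, real_77, real_770)
```

When the true slope is zero, the average F stays near 1 for both sample sizes, so adding points does not by itself manufacture significance. When the slope is genuinely nonzero, F grows roughly in proportion to the number of points, which reflects the extra evidence that more observations provide.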

Thanks very much everyone.
 
Using the formulas for A and B, ESS is proportional to the variance of B, i.e. the variance of 1 degree of freedom.
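One way to unpack this reply is to substitute the least-squares formulas into the definition of ESS. The lines below are a sketch that assumes the usual setting of independent errors with a common variance; the symbol \sigma^{2} for that variance is notation introduced here, not something used in the thread.

```latex
\begin{aligned}
A &= \bar{y} - B\bar{x}
  \;\Rightarrow\; \hat{y}_{i} - \bar{y} = B\,(x_{i} - \bar{x}), \\
ESS &= \Sigma(\hat{y}_{i} - \bar{y})^{2} = B^{2}\,\Sigma(x_{i} - \bar{x})^{2}, \\
\mathrm{Var}(B) &= \frac{\sigma^{2}}{\Sigma(x_{i} - \bar{x})^{2}}
  \;\Rightarrow\; \frac{ESS}{\sigma^{2}} = \frac{B^{2}}{\mathrm{Var}(B)}.
\end{aligned}
```

Once the x values are fixed, ESS depends on the data only through the single estimated number B, however many points are summed over, which is why it carries 1 degree of freedom. Dividing by RSS / DF_RSS, which estimates \sigma^{2}, gives F as the squared slope measured against its own sampling variance. F is therefore large when B is far from zero relative to its uncertainty, so the test is about the slope B rather than the intercept A.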
 
Hi DrDu. I am sorry, but I am not understanding your response. Could you please elaborate a bit further? Thank you.
 
