Multiple linear regression: partial F-test

In multiple linear regression, the partial F-test compares a model with three independent variables to one with five, assessing whether the additional variables significantly improve the model. The increase in the coefficient of multiple determination (R^2) occurs regardless of whether the new variables contribute meaningfully, making it essential to test the null hypothesis that the additional coefficients are zero. If the null hypothesis is rejected, it indicates that the increase in R^2 is statistically significant and not due to chance. The relationship between the F-test and R^2 is established through the calculation of the coefficient of partial determination, which measures the reduction in error sum of squares when adding the new variables. Thus, the partial F-test effectively evaluates the significance of the increase in R^2.
kingwinner
"Suppose that in a MULTIPLE linear regression analysis, it is of interest to compare a model with 3 independent variables to a model with the same response varaible and these same 3 independent variables plus 2 additional independent variables.
As more predictors are added to the model, the coefficient of multiple determination (R^2) will increase, so the model with 5 predicator variables will have a higher R^2.
The partial F-test for the coefficients of the 2 additional predictor variables (H_o: β_4=β_5=0) is equivalent to testing that the increase in R^2 is statistically signifcant."


I don't understand the last sentence of the quote. Why are the two tests equivalent?

Thanks for explaining!
 
Mathematically, R^2 will increase whether or not the new variables contribute to the model. Because of this, the question in practice is whether the larger R^2 is due simply to the math (this corresponds to H_0: β_4 = β_5 = 0) or whether the increase reflects a real effect, i.e., at least one of the two coefficients being non-zero (the alternative hypothesis). If H_0 is rejected, we know at least one coefficient is non-zero, and we also know that the increase in R^2 is due to something other than mere chance.
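
For intuition, here is a minimal simulation sketch in Python (numpy only; the data and all variable names are made up for illustration). It shows R^2 creeping upward when two pure-noise predictors are appended, which is exactly the behavior the test has to guard against:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100

# Three "real" predictors; the response depends only on them
X3 = rng.normal(size=(n, 3))
y = 2.0 + X3 @ np.array([1.5, -0.7, 0.3]) + rng.normal(size=n)

# Append two pure-noise predictors (true beta_4 = beta_5 = 0)
X5 = np.hstack([X3, rng.normal(size=(n, 2))])

def r_squared(X, y):
    """R^2 from an ordinary least squares fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

print(r_squared(X3, y))  # R^2 for the 3-variable model
print(r_squared(X5, y))  # never smaller, even though x_4, x_5 are noise
```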

Does this help, or were you looking for a mathematical explanation?
 
Do you have a mathematical explanation for that?

The statement claims that the test of H_0: β_4 = β_5 = 0 is equivalent to testing that the increase in R^2 is statistically significant. What would be the equivalent null and alternative hypotheses in terms of R^2?

Thanks!
 
Suppose you have a total of five variables (since you reference β_4 and β_5).

We want to test

<br /> \begin{align*}<br /> H_0 \colon &amp; \beta_4 = \beta_5 = 0 \\<br /> H_a \colon &amp; \text{At least one of } \beta_4, \beta_5 \ne 0<br /> \end{align*}<br />

The test begins with the fitting of a full and a reduced model:

<br /> Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 \tag{Full}<br />

<br /> Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 \tag{Reduced}<br />

Denote the sum of squares for error in the full model by SSE(F) = SSE(x_1, x_2, x_3, x_4, x_5), and the sum of squares for error in the reduced model by SSE(R) = SSE(x_1, x_2, x_3).

Since we use more variables in the full model than in the reduced model, we will see SSE(F) < SSE(R). The test statistic for the above hypotheses is

<br /> F = \frac{SSE(R) - SSE(F)}{(n-4) - (n-6)} \div \frac{SSE(F)}{n-6}<br />

In the old days (to be read as "when statdad was in school") the numerator of this statistic was written as

<br /> SSE(R) - SSE(F) = SSE(X_1, X_2, X_3) - SSE(X_1, X_2, X_3, X_4, X_5) = SSR(X_4, X_5 \mid X_1, X_2, X_3)<br />

Think of the last notation ("sum of squares reduction") as denoting the reduction in variation from adding x_4, x_5 to a model that already contains the other three variables. The test is done by comparing F to the appropriate tables.
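
Here is a minimal sketch of the whole procedure in Python (numpy, plus scipy's F distribution for the p-value; the simulated data, with three real predictors and two null ones, is an assumption made up for illustration):

```python
import numpy as np
from scipy.stats import f as f_dist

def sse(X, y):
    """Error sum of squares from an OLS fit with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return resid @ resid

rng = np.random.default_rng(0)
n = 100
X5 = rng.normal(size=(n, 5))              # x_1, ..., x_5
y = 2.0 + X5[:, :3] @ np.array([1.5, -0.7, 0.3]) + rng.normal(size=n)

sse_R = sse(X5[:, :3], y)                 # reduced model: x_1, x_2, x_3
sse_F = sse(X5, y)                        # full model: all five predictors

df_num = (n - 4) - (n - 6)                # = 2, the two extra coefficients
df_den = n - 6                            # error df in the full model
F = ((sse_R - sse_F) / df_num) / (sse_F / df_den)
p = f_dist.sf(F, df_num, df_den)          # upper-tail probability
print(F, p)
```

The call to f_dist.sf plays the role of "comparing F to the appropriate tables": it returns the probability that an F(2, n-6) variable exceeds the observed statistic.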

How is this related to R^2? It isn't, directly; it is related to something called a coefficient of partial determination. The first bit of notation is this:

<br /> r^2_{Y45.123}<br />

In the subscript, the symbols to the left of the "." are the dependent variable and the "number labels" of the variables being added to the model, while the numbers to the right of the "." are the "number labels" of the variables originally in the model. The coefficient of partial determination is calculated as

<br /> r^2_{Y45.123} = \frac{SSR(X_4, X_5 \mid X_1, X_2, X_3)}{SSE(X_1, X_2, X_3)}<br />

Technically, this measures the proportional reduction in the error sum of squares that results when we move from the model with 3 variables to the model with all 5 variables.

When the F-test referred to above is significant (H_0 is rejected), this coefficient of partial determination indicates a significant change in R^2.
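
In code, the coefficient of partial determination is one more line on top of the partial F sketch earlier in this post (reusing sse_R and sse_F from there):

```python
# Proportional drop in SSE when x_4, x_5 join the 3-variable model
r2_partial = (sse_R - sse_F) / sse_R   # r^2_{Y45.123}
print(r2_partial)
```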

Hope this helped.
 
Thanks!

R^2 = regression SS/total SS

$$F = \frac{(R^2_{\text{full}} - R^2_{\text{reduced}})/(5-3)}{(1 - R^2_{\text{full}})/(n - 5 - 1)}$$
where R^2_full is the R^2 with 5 independent variables and R^2_reduced is the R^2 with 3 independent variables. (This follows from SSE = (1 - R^2) × SSTO, so differences in SSE are proportional to differences in R^2.)

Based on this form of the F statistic, can we say that the partial F-test for the coefficients of the 2 additional predictor variables (H_0: β_4 = β_5 = 0) is equivalent to testing that the increase in R^2 is statistically significant?
 
Yes - good job.
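
For anyone who wants to see the equivalence numerically, here is a sketch using simulated data (statsmodels' OLS for convenience; its compare_f_test method performs exactly this partial F-test, and the two hand-computed forms below match it):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100
X5 = rng.normal(size=(n, 5))                      # x_1, ..., x_5
y = 2.0 + X5[:, :3] @ np.array([1.5, -0.7, 0.3]) + rng.normal(size=n)

full = sm.OLS(y, sm.add_constant(X5)).fit()
reduced = sm.OLS(y, sm.add_constant(X5[:, :3])).fit()

# SSE form of the partial F statistic
F_sse = ((reduced.ssr - full.ssr) / 2) / (full.ssr / (n - 6))

# R^2 form: the same statistic written in terms of R^2
F_r2 = ((full.rsquared - reduced.rsquared) / 2) / ((1 - full.rsquared) / (n - 5 - 1))

print(F_sse, F_r2)                    # identical values
print(full.compare_f_test(reduced))   # (F, p-value, df) from statsmodels
```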
 