F-test explained in layman's terms

  • Thread starter jprockbelly
In summary, the conversation discussed a multiple linear regression fitted by least squares in Excel for a project, with two independent variables (rainfall and time) used to predict groundwater level. The P-value and F-statistic were generated automatically, and the poster was seeking a simple explanation of the F-test. The F-statistic jointly tests whether all independent-variable coefficients are zero; with a single independent variable it equals the square of the t-statistic. The F-test helps determine whether the mean value of Y is constant or varies according to the model. An analogy using ball bearings was given to explain the concept of the F-distribution and the use of squared residuals in the test.
  • #1
jprockbelly
Hello, I have created a multiple linear regression fit (using least squares) for a project. The regression has two independent variables, rainfall and time, and fits these to groundwater level. The regression was calculated automatically in Excel. I have been asked to report on the P-value and F-statistic for this regression (both generated automatically by the program). I have read some good explanations of the P-value, but cannot find any simple explanation of the F-test or F-statistic.

Can anyone provide, or recommend a simple explanation?
 
  • #2
The F value is the value of the F statistic for a joint test of the null hypothesis that all indep. variable coefficients ("the betas") are simultaneously zero, against the alternative that at least one of them is different from zero. With a single indep. var., the F test reduces to testing whether that one coefficient (the beta) is different from zero. This is the same hypothesis tested by the t Stat for the beta.

Try the following: drop one of your indep. variables; then verify that F = (t Stat)^2 and |t Stat| = SQRT(F).
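The same check can be done outside Excel. Below is a minimal sketch in plain Python, assuming a single predictor and using made-up rainfall and groundwater numbers (they are illustrative only, not the original poster's data). The function name `simple_regression_f_and_t` is hypothetical.

```python
import math

def simple_regression_f_and_t(x, y):
    """Fit y = a + b*x by least squares; return (F, t) for the slope b."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual (error) sum of squares and total sum of squares
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    mse = sse / (n - 2)                    # residual mean square, n - 2 d.f.
    t_stat = slope / math.sqrt(mse / sxx)  # slope divided by its standard error
    f_stat = ((sst - sse) / 1) / mse       # regression mean square (1 d.f.) / MSE
    return f_stat, t_stat

# Made-up illustrative data (assumption, not from the thread)
rainfall = [12.0, 30.0, 45.0, 8.0, 60.0, 25.0, 50.0, 18.0]
level = [3.1, 3.9, 4.8, 2.7, 5.6, 3.4, 5.3, 3.5]

f_stat, t_stat = simple_regression_f_and_t(rainfall, level)
print(f"F = {f_stat:.4f}, t^2 = {t_stat ** 2:.4f}")
```

The two printed numbers should agree up to rounding, which is the F = (t Stat)^2 relationship for a one-variable regression.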
 
  • #3
Remember that in regression you're investigating whether the independent variables provide any useful information for you to use in the prediction of the mean value of Y (this is a simplified comment, but we're talking about linear regression so it works).

The basic hypotheses for the F-test are
[tex] \begin{align*}
H_0 \colon & \beta_1 = \beta_2 = 0 \\
H_a \colon & \text{At least one coefficient is not zero}
\end{align*}
[/tex]

If the null hypothesis is true you're left with the result that the best way to estimate the mean value of Y is with the ordinary sample mean. If the alternative hypothesis is true then you can say your data indicates the mean value of Y is not constant but varies in a way consistent with your model.

In short: the F-test provides a way to decide which of two models (constant mean vs variable mean) better describes the variable Y.
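To make the "constant mean vs variable mean" comparison concrete, here is a rough sketch in plain Python of how the overall F statistic could be computed for a two-variable regression. It compares the squared error around the sample mean (SST) with the squared error around the fitted model (SSE). The data and the function name `overall_f` are made up for illustration, and this is not necessarily how Excel computes it internally.

```python
def overall_f(x1, x2, y):
    """Overall F for y = b0 + b1*x1 + b2*x2 versus the constant-mean model."""
    n = len(y)

    def centered(values):
        mean = sum(values) / n
        return [v - mean for v in values]

    # Centering removes the intercept, leaving a 2x2 system for b1 and b2
    a, b, c = centered(x1), centered(x2), centered(y)
    s11 = sum(p * p for p in a)
    s22 = sum(q * q for q in b)
    s12 = sum(p * q for p, q in zip(a, b))
    s1y = sum(p * r for p, r in zip(a, c))
    s2y = sum(q * r for q, r in zip(b, c))
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det

    sst = sum(r * r for r in c)  # error of the constant-mean model (H0)
    sse = sum((r - b1 * p - b2 * q) ** 2 for p, q, r in zip(a, b, c))
    k = 2                        # number of independent variables
    f_stat = ((sst - sse) / k) / (sse / (n - k - 1))
    return f_stat, sst, sse

# Made-up illustrative data (assumption, not from the thread)
rainfall = [12.0, 30.0, 45.0, 8.0, 60.0, 25.0, 50.0, 18.0]
time = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
level = [3.1, 3.9, 4.8, 2.7, 5.6, 3.4, 5.3, 3.5]

f_stat, sst, sse = overall_f(rainfall, time, level)
print(f"SST = {sst:.4f}, SSE = {sse:.4f}, F = {f_stat:.4f}")
```

A large F means the model removes much more squared error than would be expected by chance if both coefficients were zero. The matching P-value is the upper-tail area of the F distribution with (k, n - k - 1) degrees of freedom; if SciPy happens to be installed, `scipy.stats.f.sf(f_stat, k, n - k - 1)` should give it.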
 
  • #4
Thanks
 
  • #5
I used to teach statistics to chemists (wannabe physicists that couldn't handle the math <G>) and found that a particular thought experiment that explored HOW the F-distribution might be generated was useful. It goes like this:

Suppose you have a very large container of ball bearings. You extract 3 bearings ("randomly") and measure their average weight. Now you extract 5 bearings and measure the average weight of those five. You calculate the ratio. You continue this process of calculating 5 & 3 average ball bearing weight ratios and build a histogram. After you do this an "infinite" number of times you have a facsimile of the F(5,3) distribution (sort of).

Now the key idea: I pull three ball bearings from my pocket and ask you, "What is the probability that those three ball bearings came from the big container?" The way you answer my question is to grab 5 bearings (at "random") from the big container and average their weight. That average weight is compared to the average weight of the three bearings I pulled from my pocket. The position of this "experimental" ratio is located on the histogram you so laboriously constructed. Since the histogram is a picture of a probability distribution, you can determine how likely it is that the three ball bearings were pulled from the "big container". In other words, what are the chances that the "big container" could produce a 5 - 3 ratio like the one you measured using the three from my pocket?

Substitute residuals for ball bearings. The "big container" contains "random" error. Why square the residuals before consulting an F-distribution? Because it gets rid of negative numbers which could produce zero values upon averaging.

Why not use 4th powers of residuals instead of squares? You could but you would have to construct the distribution - the distribution of squared values is already made for you.
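The thought experiment can be tried on a computer. The sketch below makes two assumptions that go beyond the informal description: the "big container" holds normally distributed random errors with mean zero, and the quantity averaged is the squared value, as suggested for residuals. Under those assumptions the ratio of the 5-draw mean square to the 3-draw mean square follows the F(5,3) distribution, whose tabulated upper 5% point is about 9.01. The function name `simulate_f_ratios` is hypothetical.

```python
import random

def simulate_f_ratios(n_num=5, n_den=3, n_trials=100_000, seed=0):
    """Repeatedly draw n_num and n_den errors; return ratios of mean squares."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_trials):
        # Mean of squared errors for the 5-draw and the 3-draw samples
        num = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_num)) / n_num
        den = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n_den)) / n_den
        ratios.append(num / den)
    return ratios

ratios = simulate_f_ratios()

# Share of simulated ratios beyond the tabulated 5% critical value of F(5,3)
tail = sum(r > 9.01 for r in ratios) / len(ratios)
print(f"share of simulated ratios above 9.01: {tail:.3f}")
```

The printed share should come out close to 0.05. That is the sense in which the histogram of ratios lets you say how unusual an observed ratio is.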

I hope that helps a little. It is not rigorous but I do not think you are looking for rigor.
 
  • #6
WACG, thanks for that. It actually helps a lot.
 

1. What is an F-test?

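An F-test is a statistical test whose test statistic follows an F-distribution when the null hypothesis is true. It is commonly used to compare the variances of two groups, to test whether the means of several groups differ (analysis of variance), or to test whether a regression model explains more variation than the sample mean alone.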

2. How does an F-test work?

An F-test works by calculating the ratio of the variances between groups to the variances within groups. This ratio is then compared to a critical value from a statistical table to determine if there is a significant difference between the groups.
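As a small worked example of the between-groups to within-groups ratio, here is a sketch of a one-way F statistic in plain Python. The three groups are made-up numbers chosen so the arithmetic is easy to follow by hand, and the function name `one_way_f` is hypothetical.

```python
def one_way_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

# Made-up illustrative groups (assumption, not from the thread)
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_stat = one_way_f(groups)
print(f"F = {f_stat:.4f}")  # prints "F = 3.0000"
```

Here the between-group sum of squares is 6 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, so F = 3 / 1 = 3.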

3. When should I use an F-test?

An F-test should be used when you want to compare the variances or means of two or more groups. It is commonly used in scientific research to analyze experimental data and determine if there are any significant differences between groups.

4. What is the difference between a one-way and two-way F-test?

"One-way" and "two-way" usually refer to analysis of variance (ANOVA). A one-way F-test compares the means of two or more independent groups defined by a single factor. A two-way F-test, on the other hand, is used when the groups are defined by two factors, and it can test the effect of each factor as well as their interaction.

5. Can I interpret the results of an F-test on my own?

It is recommended to seek the help of a statistician when interpreting the results of an F-test, especially if you are not familiar with statistical analysis. They can help you understand the significance of the results and draw accurate conclusions from the data.
