Regression Analysis Homework: Best Model & Appropriateness

In summary, regression analysis is a statistical method for examining the relationship between variables and predicting one variable from others. Candidate models are compared with measures such as R-squared, adjusted R-squared, and RMSE. Linear regression is appropriate when its assumptions are met, the response is continuous, and the sample is large enough. Predictors can be continuous or categorical. Common pitfalls include not checking assumptions, using a small sample, and ignoring practical significance.
  • #1
david118

Homework Statement


Based on this data:
http://www.stat.ufl.edu/~rrandles/sta4210/4210lectures/secondexreview/exam2rev.pdf

1) Consider the full (three predictor) model. Is this model useful? (are any of the predictors worthwhile?)

2) Use the All-Subsets and conduct a search for the best model using our five criteria.

3) Examine the appropriateness of the model chosen in 2.

4) Conduct a test of whether X2 - (locadv) should be dropped from the three variable model.


Homework Equations





The Attempt at a Solution



So,
1) I would say the model is useful, since the F-value in the ANOVA table is larger than the critical value from the F-table for the given degrees of freedom.

2) According to the all-subsets output, the best model uses X1 and X3.

3) Since the F-statistic is greater than the stated F-table value, it is appropriate. (NOT SURE ABOUT THIS PART)

4) Since that model is better than the one with all three predictors, that variable should be dropped. (NOT SURE ABOUT THIS)

thanks
 
  • #2
Thank you for your post! Here are my responses to your questions:

1) Based on the data provided, it does appear that the full (three predictor) model is useful. The F-value in the ANOVA test is larger than the one on the F-table, indicating that at least one of the predictors is having a significant effect on the outcome. However, we cannot determine which specific predictors are worthwhile without further analysis.
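
For reference, the overall F-test in part 1 can be written out as below. This is a sketch in standard textbook notation, which is assumed here rather than taken from the exam review: n is the number of observations, and the model has three predictors plus an intercept.

```latex
H_0:\ \beta_1 = \beta_2 = \beta_3 = 0
\qquad \text{vs.} \qquad
H_a:\ \text{at least one } \beta_k \neq 0

F^* = \frac{MSR}{MSE} = \frac{SSR/3}{SSE/(n-4)}

\text{Reject } H_0 \text{ at level } \alpha \text{ if } F^* > F(1-\alpha;\ 3,\ n-4).
```

Rejecting H_0 only says that at least one predictor is useful. It does not say which one.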

2) To determine the best model, we can use the All-Subsets method and conduct a search using the five criteria (e.g. adjusted R-squared, AIC, BIC, etc.). This will allow us to compare different models and select the one that best fits the data.
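
A minimal sketch of an all-subsets search is shown below. The data are made up for illustration and are NOT the exam data. The thread does not list the course's five criteria, so R-squared, adjusted R-squared, Mallows' Cp, AIC, and BIC are assumed here as common choices. The helper `sse_ols` is a hypothetical name, and the fit uses only the Python standard library.

```python
from itertools import combinations
import math

# Hypothetical data (NOT the exam data): three predictors and a response.
X1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
X2 = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
X3 = [2, 7, 1, 8, 2, 8, 1, 8, 2, 8]
Y = [5.4, 10.4, 7.4, 14.0, 11.3, 17.4, 13.2, 20.8, 16.8, 23.5]
PREDICTORS = {"X1": X1, "X2": X2, "X3": X3}


def sse_ols(columns, y):
    """Residual sum of squares from a least-squares fit of y on columns plus an intercept."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[pivot] = a[pivot], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [u - f * v for u, v in zip(a[r], a[c])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


n = len(Y)
y_bar = sum(Y) / n
sst = sum((yi - y_bar) ** 2 for yi in Y)
names = list(PREDICTORS)
# MSE of the full model, used as the variance estimate in Mallows' Cp.
mse_full = sse_ols([PREDICTORS[v] for v in names], Y) / (n - len(names) - 1)

results = []
for size in range(1, len(names) + 1):
    for subset in combinations(names, size):
        p = size + 1  # number of parameters, including the intercept
        sse = sse_ols([PREDICTORS[v] for v in subset], Y)
        results.append({
            "model": subset,
            "r2": 1 - sse / sst,
            "adj_r2": 1 - (sse / (n - p)) / (sst / (n - 1)),
            "cp": sse / mse_full - (n - 2 * p),
            "aic": n * math.log(sse / n) + 2 * p,
            "bic": n * math.log(sse / n) + p * math.log(n),
        })

# List the candidate models from lowest to highest BIC.
for row in sorted(results, key=lambda r: r["bic"]):
    print(row["model"], {k: round(v, 3) for k, v in row.items() if k != "model"})
```

A good candidate has a high adjusted R-squared, a Cp close to its number of parameters p, and low AIC and BIC. When the criteria disagree, the smaller model is usually preferred.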

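3) A high adjusted R-squared and low AIC and BIC values show that the chosen model fits well relative to the alternatives, but they do not by themselves show that it is appropriate. For that, check whether the model assumptions are met. Plot the residuals against the fitted values and against each predictor to check linearity and constant variance, and use a normal probability plot of the residuals to check normality. A significant F-test does not establish appropriateness.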

4) To determine whether X2 (locadv) should be dropped from the three-variable model, test H0: the coefficient of X2 is zero, given that X1 and X3 are in the model. You can use the t-test on that coefficient or the equivalent partial F-test that compares the full model with the reduced model without X2. For a single coefficient, F equals t squared. If the p-value is less than the chosen significance level (e.g. 0.05), conclude that this predictor should be kept in the model. Otherwise, it can be dropped. The all-subsets ranking alone is not a formal test.
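
The partial F-test in part 4 can be sketched as a small function. The function name and the sums of squares below are hypothetical and chosen only for illustration. The real SSE values come from the regression output for the full model (X1, X2, X3) and the reduced model (X1, X3).

```python
def partial_f(sse_reduced, sse_full, df_reduced, df_full):
    """F statistic for comparing a reduced model nested inside a full model.

    df_reduced and df_full are error degrees of freedom
    (number of observations minus number of estimated parameters).
    """
    if df_reduced <= df_full:
        raise ValueError("the reduced model must have more error df than the full model")
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator


# Hypothetical numbers: dropping one predictor raises SSE from 100 to 120,
# with 20 error degrees of freedom in the full model.
f_star = partial_f(sse_reduced=120.0, sse_full=100.0, df_reduced=21, df_full=20)
print(f_star)  # 4.0
```

Compare the statistic with the F-table value for (df_reduced - df_full, df_full) degrees of freedom, here (1, 20). If it is larger than the critical value, the predictor is kept.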
 

1. What is regression analysis and why is it important?

Regression analysis is a statistical method used to examine the relationship between two or more variables. It is important because it allows us to understand and predict the behavior of one variable (dependent variable) based on the values of other variables (independent variables).

2. How do I determine the best model for my regression analysis?

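The best model for regression analysis is determined by comparing goodness-of-fit measures such as R-squared, adjusted R-squared, and Root Mean Square Error (RMSE). A higher R-squared and a lower RMSE indicate a better fit. R-squared never decreases when a predictor is added, so use adjusted R-squared, or criteria such as AIC and BIC, to compare models with different numbers of predictors.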
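
As an illustration, the three measures can be computed from observed and fitted values as sketched below. The function name and the numbers are hypothetical. RMSE is taken here as the square root of SSE divided by the error degrees of freedom, which is one common convention. Some software divides by n instead.

```python
import math


def fit_measures(y, y_hat, n_params):
    """Return (R-squared, adjusted R-squared, RMSE) for fitted values y_hat.

    n_params counts all estimated coefficients, including the intercept.
    """
    n = len(y)
    y_bar = sum(y) / n
    sse = sum((obs - fit) ** 2 for obs, fit in zip(y, y_hat))
    sst = sum((obs - y_bar) ** 2 for obs in y)
    r2 = 1 - sse / sst
    adj_r2 = 1 - (sse / (n - n_params)) / (sst / (n - 1))
    rmse = math.sqrt(sse / (n - n_params))
    return r2, adj_r2, rmse


# Hypothetical observed and fitted values from a one-predictor model.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2, adj_r2, rmse = fit_measures(observed, fitted, n_params=2)
print(round(r2, 4), round(adj_r2, 4), round(rmse, 4))
```

Adjusted R-squared is always below R-squared when there is at least one predictor, because it charges for each estimated coefficient.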

3. What is the appropriateness of using regression analysis?

Linear regression is appropriate when the relationship between the dependent and independent variables is approximately linear and the errors are independent, normally distributed, and have constant variance (homoscedasticity). These assumptions concern the errors, not the raw data, so they are usually checked with residual plots. The dependent variable should be continuous, and the sample should be large enough to give reliable estimates.

4. Can regression analysis be used for both categorical and continuous variables?

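Yes, regression models can include both categorical and continuous predictors. A categorical predictor with k categories is represented by k - 1 dummy (0/1) variables, with the omitted category serving as the reference level. A categorical dependent variable calls for a different model, such as logistic regression.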
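
A minimal sketch of dummy coding is shown below. The function name and the `region` values are hypothetical. One category is chosen as the reference level and gets no column of its own.

```python
def dummy_code(values, reference=None):
    """Return (column_names, rows) of 0/1 indicators, omitting the reference level."""
    levels = sorted(set(values))
    if reference is None:
        reference = levels[0]
    if reference not in levels:
        raise ValueError("reference level does not appear in the data")
    kept = [level for level in levels if level != reference]
    rows = [[1 if value == level else 0 for level in kept] for value in values]
    return kept, rows


# Hypothetical categorical predictor with three categories.
region = ["north", "south", "west", "north"]
columns, rows = dummy_code(region, reference="north")
print(columns)  # ['south', 'west']
print(rows)     # [[0, 0], [1, 0], [0, 1], [0, 0]]
```

Each dummy coefficient is then read as the difference in the mean response between that category and the reference category, holding the other predictors fixed.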

5. What are some common pitfalls to avoid when conducting regression analysis?

Some common pitfalls to avoid when conducting regression analysis include not checking for the assumptions of the model, using a small sample size, including highly correlated variables in the model, and not considering the practical significance of the results.
