Discussion Overview
The discussion concerns problems encountered in a regression analysis with age and gender as variables, specifically whether the residuals are normal and randomly scattered. Participants explore what non-random residuals imply and consider model adjustments, including allowing for a non-linear relationship.
Discussion Character
- Exploratory
- Technical explanation
- Debate/contested
- Mathematical reasoning
Main Points Raised
- One participant notes that the residual plot shows a pattern rather than random scatter, suggesting that a simple linear model may not be appropriate for the data.
- Another participant proposes that the observed curvature in the residuals indicates a potential non-linear relationship, questioning whether visual inspection of the scatter plot can confirm this.
- Some participants suggest incorporating higher-order terms (squares or cubes) into the regression model to better fit the data.
- There is a discussion about the challenges of using a binary variable (gender) in regression analysis, with questions about how to handle it alongside numerical variables like age.
- Concerns are raised that the original poster provided no specific data or plots, which makes it hard to give meaningful advice.
- Some participants argue that a regression of gender on age does not make sense, indicating a need for clarification on the dependent variable in the analysis.
- There is mention of logistic regression as a potential approach, given the binary nature of the gender variable, but this is debated in the context of the unspecified dependent variable.
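The suggestions about higher-order terms and the binary gender variable can be sketched as follows. This is a minimal illustration, not the original poster's analysis: the thread never names a dependent variable, so the continuous response `y`, the synthetic data, the 0/1 coding of `gender`, and the helper names `solve`, `ols`, and `residuals` are all assumptions made for the example. It fits a linear model and a model with an added squared (centered) age term, then compares residual sums of squares.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

def residuals(X, y, beta):
    """Observed minus fitted values."""
    return [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]

# Hypothetical synthetic data: y depends on age with curvature, plus a gender shift.
rng = random.Random(0)
ages = list(range(20, 70))
mean_age = sum(ages) / len(ages)
gender = [i % 2 for i in range(len(ages))]  # assumed 0/1 indicator coding
y = [
    50 + 0.5 * (a - mean_age) + 3 * g - 0.03 * (a - mean_age) ** 2 + rng.gauss(0, 1)
    for a, g in zip(ages, gender)
]

# Design matrices: intercept, centered age, gender; then the same plus age squared.
# Centering age keeps the squared term from being nearly collinear with age itself.
X_lin = [[1.0, a - mean_age, float(g)] for a, g in zip(ages, gender)]
X_quad = [row + [row[1] ** 2] for row in X_lin]

beta_lin = ols(X_lin, y)
beta_quad = ols(X_quad, y)

res_lin = residuals(X_lin, y, beta_lin)
res_quad = residuals(X_quad, y, beta_quad)
rss_lin = sum(r * r for r in res_lin)
rss_quad = sum(r * r for r in res_quad)

print("RSS, linear model:   ", round(rss_lin, 1))
print("RSS, quadratic model:", round(rss_quad, 1))
print("Estimated squared-age coefficient:", round(beta_quad[3], 4))
```

In this made-up data the linear model's residuals bend systematically with age, and adding the squared term removes most of that structure. With real data, plotting `res_lin` against age would be the natural first check before adding terms. If the dependent variable were itself binary, this least-squares sketch would not apply and something like logistic regression would be the candidate instead.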
Areas of Agreement / Disagreement
Participants offer competing views on whether the regression model is appropriate and how to address the residual pattern. No consensus emerges, and both the identity of the dependent variable and the overall modeling strategy remain unresolved.
Contextual Notes
Participants note that the absence of specific data, plots, or a clearly defined dependent variable limits how targeted the advice can be.