Discussion Overview
The discussion revolves around logistic regression, specifically estimating the probability of survival based on a single predictor, Sex. Participants explore the mathematical formulation of the logistic model, the interpretation of estimated probabilities, and the computation of p-values for coefficients in logistic regression. The conversation includes technical details, assumptions, and challenges related to model accuracy and predictor significance.
Discussion Character
- Technical explanation
- Mathematical reasoning
- Debate/contested
Main Points Raised
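- Some participants question why the estimated probabilities for males and females do not sum to 1, suggesting it may be due to the nature of estimates or the influence of other predictors. (The two values are conditional probabilities for different groups, P(survive | male) and P(survive | female), so nothing requires them to sum to 1; only P(survive | male) and P(die | male) must.)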
- There is a discussion about the correct formulation of survival probabilities for males and females, with some participants clarifying the mathematical expressions involved.
- One participant outlines the steps for calculating the p-value for the coefficient of the predictor using standard error and t-statistics, while others seek clarification on these calculations.
- Concerns are raised about the appropriateness of using t-statistics in logistic regression, with a suggestion that z-statistics should be used instead.
- Participants discuss the availability of libraries in Python and R for calculating p-values, noting differences in their usage among data scientists.
- Participants debate how to handle coefficients that are small in magnitude and their associated p-values, with some suggesting criteria for dropping predictors from the model.
- One participant expresses a belief that certain predictors may not significantly affect outcomes based on both statistical evidence and logical reasoning.
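The model formulation and the p-value steps raised above can be sketched in Python for the single-predictor case. This is a minimal sketch, not any participant's code: the counts are made up for illustration and are not taken from the dataset under discussion, the function name `fit_single_binary_predictor` is hypothetical, and the coding male = 1, female = 0 is an assumption. With one binary predictor the maximum-likelihood fit has a closed form, so no optimizer or library is needed. The test statistic used is the Wald z-statistic rather than a t-statistic.

```python
import math


def sigmoid(t):
    """Inverse of the logit: maps a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


def logit(p):
    """Log-odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def fit_single_binary_predictor(surv_f, died_f, surv_m, died_m):
    """Closed-form fit of logit(P(survive)) = b0 + b1 * male.

    Assumes male is coded 1 and female 0, and that all four counts
    are positive (otherwise the estimates are not finite).
    """
    p_f = surv_f / (surv_f + died_f)  # observed survival rate, females
    p_m = surv_m / (surv_m + died_m)  # observed survival rate, males

    b0 = logit(p_f)        # intercept: log-odds of survival for females
    b1 = logit(p_m) - b0   # coefficient of Sex: log odds ratio, male vs female

    # Wald standard error of the log odds ratio from the 2x2 table
    se_b1 = math.sqrt(1 / surv_f + 1 / died_f + 1 / surv_m + 1 / died_m)

    z = b1 / se_b1  # Wald z-statistic for H0: b1 = 0
    # Two-sided p-value under a standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))

    return {
        "b0": b0,
        "b1": b1,
        "se_b1": se_b1,
        "z": z,
        "p_value": p_value,
        "p_female": sigmoid(b0),
        "p_male": sigmoid(b0 + b1),
    }


# Illustrative counts only (hypothetical, not from the discussion's data)
result = fit_single_binary_predictor(surv_f=70, died_f=30, surv_m=20, died_m=80)
for name, value in result.items():
    print(f"{name}: {value:.4g}")
```

With these illustrative counts the fitted probabilities come out at about 0.7 for females and 0.2 for males. They sum to 0.9, not 1, because each is a survival probability conditional on a different group.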
Areas of Agreement / Disagreement
Participants express differing views on the interpretation of logistic regression outputs, the appropriateness of statistical methods, and the significance of predictors. No consensus is reached on the best approach to model evaluation and predictor selection.
Contextual Notes
Participants note that some statistical methods discussed may be more applicable to linear regression than logistic regression, highlighting potential limitations in the assumptions made during the discussion.
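The distinction between the two test statistics can be written compactly. The following is a standard textbook summary rather than something worked out in the discussion; the symbols n (number of observations) and p (number of estimated parameters) are introduced here for illustration, and both lines are stated under the null hypothesis that the coefficient is zero.

```latex
% Linear regression with normally distributed errors: exact t reference distribution
t = \frac{\hat\beta_j}{\mathrm{SE}(\hat\beta_j)} \sim t_{n-p}

% Logistic regression: Wald statistic, approximately standard normal in large samples
z = \frac{\hat\beta_j}{\mathrm{SE}(\hat\beta_j)} \;\overset{\text{approx.}}{\sim}\; \mathcal{N}(0,1),
\qquad p\text{-value} = 2\,\bigl(1 - \Phi(|z|)\bigr)
```

In large samples the two reference distributions give nearly identical p-values, since the t distribution approaches the standard normal as its degrees of freedom grow.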