Discussion Overview
The discussion centers on the challenges of performing linear regression when both the independent and the dependent variable carry measurement errors. Participants explore methods and formulas for calculating the best-fitting line, including the variances of the fitted slope and intercept.
Discussion Character
- Technical explanation
- Debate/contested
- Mathematical reasoning
Main Points Raised
- One participant seeks sources and formulas for calculating the slope and intercept in the presence of measurement errors.
- Another participant suggests that standard statistics textbooks and tools provide the necessary formulas for least squares approximation.
- Non-negligible errors in the independent variable complicate the regression problem; total least squares is suggested as one possible solution.
- Some participants argue that as long as the uncertainties are treated as uncorrelated and Gaussian, the problem remains a regular least-squares problem, though the sum-of-squares function to be minimized differs once the X measurements have errors.
- The discussion covers Deming regression and its relation to principal component analysis (PCA), with references to an implementation in R.
- Some participants warn against the misuse of linear regression, particularly in softer disciplines, where fits with low correlation coefficients may be misleading.
- Participants emphasize the importance of considering independent uncertainties in both X and Y when determining the best fit line.
- A function called FITEXY is mentioned as a resource for performing regression analysis in programming contexts.
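As a rough illustration of the Deming regression mentioned above, here is a minimal Python sketch of its closed-form solution. The function name `deming_fit` and the parameter `delta` are hypothetical names chosen for this example; `delta` is assumed to be the ratio of the Y error variance to the X error variance, taken as known and constant across all points. It is not the implementation referenced in the thread, and it does not estimate the slope and intercept variances.

```python
import math


def deming_fit(x, y, delta=1.0):
    """Fit y = intercept + slope * x by Deming regression.

    delta is the assumed ratio var(error in y) / var(error in x).
    delta = 1 gives orthogonal regression (total least squares).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")

    x_mean = sum(x) / n
    y_mean = sum(y) / n

    # Sample variances and covariance of the measured values.
    s_xx = sum((xi - x_mean) ** 2 for xi in x) / (n - 1)
    s_yy = sum((yi - y_mean) ** 2 for yi in y) / (n - 1)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / (n - 1)

    if s_xy == 0:
        raise ValueError("zero covariance: slope is undefined")

    # Closed-form Deming slope; the sign of the slope follows s_xy.
    diff = s_yy - delta * s_xx
    slope = (diff + math.sqrt(diff ** 2 + 4.0 * delta * s_xy ** 2)) / (2.0 * s_xy)
    intercept = y_mean - slope * x_mean
    return slope, intercept


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.1, 2.9, 5.2, 6.8, 9.1]
    print(deming_fit(xs, ys, delta=1.0))
```

With `delta=1` the fitted line is the orthogonal-regression line, which lies along the first principal component of the centered data; as `delta` grows very large (X errors negligible compared with Y errors), the slope approaches the ordinary least-squares slope of Y on X.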
Areas of Agreement / Disagreement
Participants express differing views on what constitutes a "standard" regression problem and on how measurement errors should be treated. No consensus emerges on the best approach when both variables have errors.
Contextual Notes
Participants highlight the need for careful treatment of measurement errors and the implications for regression analysis, noting that assumptions about error distributions and variances can significantly affect results.
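To make the dependence on the assumed errors concrete, the following is a sketch of the merit function commonly written for a straight-line fit with independent Gaussian errors in both coordinates. The symbols are assumptions introduced for this illustration: a is the intercept, b the slope, and σ_{x,i}, σ_{y,i} the standard deviations of the i-th X and Y measurements.

```latex
\chi^2(a, b) = \sum_{i=1}^{N}
  \frac{\left( y_i - a - b\,x_i \right)^2}
       {\sigma_{y,i}^2 + b^2\,\sigma_{x,i}^2}
```

When every σ_{x,i} is zero, this reduces to the usual weighted least-squares sum. Otherwise the slope b appears in the denominator as well, so the minimization is nonlinear in b and the fitted line depends on the relative sizes assumed for the X and Y uncertainties.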