Keeping Randomized Variable in Regression?

  • Context: Graduate
  • Thread starter: FallenApple
  • Tags: Regression Variable
SUMMARY

The discussion centers on whether the site indicator must stay in a regression model when participants are randomized within treatment sites. Hosmer and Lemeshow (2nd ed.) say that SITE must remain in the model, but the text does not explain why. The explanation offered in the thread is that patients choose their own treatment site, so self-selection may bias comparisons between sites. On that reading, the risk of confounding justifies retaining SITE even though its p-value is not statistically significant.

PREREQUISITES
  • Understanding of regression analysis and model building
  • Familiarity with concepts of randomization and confounding variables
  • Knowledge of statistical significance and p-values
  • Awareness of self-selection bias in clinical studies
NEXT STEPS
  • Research the implications of self-selection bias in clinical trials
  • Learn about multivariate regression modeling techniques
  • Study the role of confounding variables in statistical analysis
  • Explore the methodologies outlined in Hosmer and Lemeshow's 2nd edition
USEFUL FOR

Researchers, statisticians, and clinical trial designers who are involved in regression modeling and are interested in understanding the impact of site selection and randomization on study outcomes.

FallenApple
If I have a study where people are randomized within treatment sites, do I always have to keep the site indicator in the regression equation?

This text (Hosmer and Lemeshow, 2nd ed.) says yes.

A_REASON_SITE.png

Here is some context:
A_Description_of_Question.png


And here is the output for the multivariate model, fitted after the univariate analysis indicated how to build it:

A_Output_Drug_trt.png


Now, the usual strategy is simply to drop variables that are not statistically significant.

They said that because randomization was done within SITE, we cannot drop it. Why? It wasn't explained in the text.
Is it because it is a potential confounder? But judging from the p-value, SITE is not associated with the outcome, so it doesn't seem to confound anything.

What is different about site in this study compared with a study that simply randomizes subjects into two treatment groups where there is only one site to begin with?

What makes site different from a regular indicator variable describing a dichotomy? I know it has something to do with the randomization within site, but I don't get it at a gut level.
 
FallenApple said:
What is different about site in this study compared with a study that simply randomizes subjects into two treatment groups where there is only one site to begin with?
In this experiment, the patient chooses which site to visit for treatment, so there may be self-selection going on. For instance, patients who are more likely to relapse may be more attracted to Site A than to Site B, or patients for whom the difference in effectiveness between short and long treatments is larger may be more attracted to one site than the other.

In contrast, if the patients had applied for treatment to some central treatment authority and had been randomly allocated by that authority to one of the two sites, there would be no self-selection.

I think what H&L are implying is that the potential for self-selection to bias the results between sites is large enough that one should keep SITE in the model regardless of its p-value.
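To make the self-selection point concrete, here is a minimal sketch in Python. The counts are hypothetical and deliberately exaggerated; they are not the Hosmer and Lemeshow data. The sketch assumes that higher-risk patients gravitate to Site A and that the two sites randomize to long and short treatment in different proportions. It uses a Mantel-Haenszel stratified odds ratio as a simple stand-in for "adjusting for SITE", rather than fitting the logistic regression from the book.

```python
# Hypothetical counts, NOT the Hosmer & Lemeshow data.
# Each site: (relapsed on long, not relapsed on long,
#             relapsed on short, not relapsed on short)
sites = {
    "A": (90, 60, 35, 15),   # assumed higher-risk patients, mostly long treatment
    "B": (10, 40, 45, 105),  # assumed lower-risk patients, mostly short treatment
}


def odds_ratio(a, b, c, d):
    """Odds of relapse on long treatment divided by odds on short treatment."""
    return (a * d) / (b * c)


def crude_odds_ratio(tables):
    """Pool all sites into one 2x2 table, i.e. ignore SITE."""
    a, b, c, d = (sum(col) for col in zip(*tables))
    return odds_ratio(a, b, c, d)


def mantel_haenszel_odds_ratio(tables):
    """Mantel-Haenszel common odds ratio, stratified by site."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return num / den


tables = list(sites.values())
site_ors = {name: odds_ratio(*t) for name, t in sites.items()}
crude = crude_odds_ratio(tables)
adjusted = mantel_haenszel_odds_ratio(tables)

for name, value in site_ors.items():
    print(f"site {name} OR: {value:.3f}")
print(f"crude OR (SITE ignored): {crude:.3f}")
print(f"Mantel-Haenszel OR (stratified by SITE): {adjusted:.3f}")
```

With these made-up numbers, the long treatment lowers the odds of relapse within each site (odds ratios of about 0.643 and 0.583), and the stratified estimate agrees (about 0.615). The pooled estimate that ignores SITE comes out at 1.5, which points the wrong way. If both sites had used the same allocation ratio, treatment and site would be unrelated in the sample and the gap between the crude and stratified estimates would shrink.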
 
I would have thought that SITE self-selection would cause an artificially high statistical significance, not lessen it. My initial reaction is to leave those variables out of the model.
 
