Observing interactions with plots using estimated coefficients

  • Thread starter: FallenApple
  • Tags: Interactions, Plots

FallenApple:
This question has two parts: one for the linear case and one for the logistic case. Say X is a continuous variable and we want to see how X affects the response across two different groups, G1 = Group 1 and G2 = Group 2.

In linear regression, we can plot the fitted regression lines for each group, using the estimated coefficients, to see whether there is an interaction between group and X. If the lines are parallel, that suggests no interaction; if they are not parallel, that suggests an interaction. Then I would check the p-value of the interaction coefficient to see whether this is really the case.

Is that true? If it is, then why even plot using the estimated coefficients? The p-values should be enough: if the p-value from the Wald test on the interaction term is large, then there is insufficient evidence for an interaction. Is it because, when the p-value is small, we still want to see just how much interaction there is? But wouldn't the absolute value of the interaction coefficient be a good hint? Or do we still need visualization?

What about logistic regression? Say I look at the probability curves P[Y=1|X,G1] and P[Y=1|X,G2]. If the difference delta = P[Y=1|X,G1] - P[Y=1|X,G2] is constant at each X, does that mean there is no interaction? Is this like the linear case?
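The linear-case setup above can be sketched in a few lines. This is a minimal illustration, not from the thread: the simulated data, effect sizes, and variable names are all assumptions. It fits an ordinary-least-squares model with a group-by-X interaction term and reads the two group slopes directly off the estimated coefficients, which is exactly what an interaction plot displays.

```python
# Minimal sketch: fitting a linear model with a group-by-X interaction and
# reading per-group slopes off the estimated coefficients. The simulated
# data and coefficient values are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, n)
g = rng.integers(0, 2, n)          # 0 = Group 1, 1 = Group 2

# True model: the slope differs by group (interaction coefficient 0.8).
y = 1.0 + 0.5 * x + 2.0 * g + 0.8 * x * g + rng.normal(0, 1, n)

# Design matrix: intercept, x, group dummy, interaction term x*g.
X = np.column_stack([np.ones(n), x, g, x * g])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b3 = beta

# Fitted lines drawn in an interaction plot:
#   Group 1:  y = b0 + b1*x
#   Group 2:  y = (b0 + b2) + (b1 + b3)*x
# The lines are parallel exactly when the interaction coefficient b3 is ~0.
print(f"slope G1 = {b1:.2f}, slope G2 = {b1 + b3:.2f}, interaction b3 = {b3:.2f}")
```

Parallel lines in the plot correspond to b3 being near zero; the plot is just a visual rendering of the same coefficients the Wald test examines.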
 
FallenApple said:
Then I would check the p-value of the interaction coefficient to see whether this is really the case.

Which coefficient are you talking about? And how do you arrive at a p-value for it?
 
It sounds like what the individual is doing is running separate models for each category of interaction that the model may have, comparing the results of the models by their coefficients, and then deriving a conclusion via the p-values. This is not the right approach. You need instead to find the p-value for the difference between the models. There is a standard error associated with each estimate, and it is entirely possible to see differences at each observed point that are nevertheless not statistically significant.

The Wald test would only be valid if the models are nested within each other. (Although don't quote me on that.)

Lastly, relying on just p-values is never a good idea. If you can visualize your data, then do it. When your sample is large, nearly everything rejects the null hypothesis.
 
MarneMath said:
It sounds like what the individual is doing is running separate models for each category of interaction that the model may have, comparing the results of the models by their coefficients, and then deriving a conclusion via the p-values. This is not the right approach. You need instead to find the p-value for the difference between the models. There is a standard error associated with each estimate, and it is entirely possible to see differences at each observed point that are nevertheless not statistically significant.

The Wald test would only be valid if the models are nested within each other. (Although don't quote me on that.)

Lastly, relying on just p-values is never a good idea. If you can visualize your data, then do it. When your sample is large, nearly everything rejects the null hypothesis.

OK, so basically do a likelihood ratio test between the two models, right?

Also, why is visualizing data better than just getting p-values from regression models? Is it because visualization looks at the data as it is? So if there were a way to perfectly visualize the data, then we would not need to do the regression at all?
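The likelihood ratio test mentioned above can be sketched concretely. This is an illustrative assumption-laden example, not from the thread: it fits a logistic model with and without the group-by-X interaction by maximizing the Bernoulli log-likelihood directly, then compares twice the log-likelihood difference to a chi-square distribution with one degree of freedom (one extra parameter in the full model).

```python
# Sketch: likelihood ratio test for an interaction in logistic regression,
# comparing nested models. Data, names, and effect sizes are illustrative.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import chi2

rng = np.random.default_rng(1)
n = 500
x = rng.uniform(-2, 2, n)
g = rng.integers(0, 2, n)

# True model includes a strong interaction (coefficient 1.5 on x*g).
eta_true = -0.5 + 1.0 * x + 0.5 * g + 1.5 * x * g
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-eta_true)))

def neg_loglik(beta, X, y):
    """Negative Bernoulli log-likelihood of a logistic model."""
    eta = X @ beta
    # sum log(1 + exp(eta)) - y.eta, computed stably via logaddexp
    return np.sum(np.logaddexp(0.0, eta)) - y @ eta

X_full = np.column_stack([np.ones(n), x, g, x * g])  # with interaction
X_red = X_full[:, :3]                                # without interaction

ll_full = -minimize(neg_loglik, np.zeros(4), args=(X_full, y)).fun
ll_red = -minimize(neg_loglik, np.zeros(3), args=(X_red, y)).fun

# LRT statistic is asymptotically chi-square with 1 df under the null.
lrt = 2.0 * (ll_full - ll_red)
p = chi2.sf(lrt, df=1)
print(f"LRT statistic = {lrt:.2f}, p-value = {p:.4g}")
```

Note that on the logit scale "no interaction" means the two groups' log-odds curves are parallel; the probability difference P[Y=1|X,G1] - P[Y=1|X,G2] still varies with X because of the nonlinear link, which bears on the question in the opening post.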
 
As I stated, as your sample size increases, most statistical tests will reject the null hypothesis. Statistical tests were designed to be rather sensitive; they weren't meant for millions upon millions of data points. Thus, oftentimes you'll see, for example, the Shapiro–Wilk test reject, but if you look at the data, it's normal enough. You can even take smaller subsamples such that 99 times out of 100 the Shapiro–Wilk test fails to reject, yet if you take the entire sample, it rejects.

Therefore, if possible, it's always good to look at your data and not rely on just statistical tests.
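The sample-size effect described above is easy to demonstrate. This sketch is not from the discussion: the contaminated-normal mixture and the sample sizes are illustrative assumptions. Data that look "normal enough" sail through the Shapiro–Wilk test at n = 50 but are rejected decisively at n = 4000.

```python
# Sketch: Shapiro-Wilk behavior at different sample sizes. The mixture
# distribution and the sizes chosen are illustrative assumptions.
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(42)

# "Normal enough" data: 95% N(0,1) with 5% heavier-tailed contamination.
def sample(n):
    base = rng.normal(0, 1, n)
    heavy = rng.normal(0, 3, n)
    mask = rng.random(n) < 0.05
    return np.where(mask, heavy, base)

p_small = shapiro(sample(50)).pvalue     # small sample: often fails to reject
p_large = shapiro(sample(4000)).pvalue   # large sample: rejects decisively

print(f"n=50:   p = {p_small:.3f}")
print(f"n=4000: p = {p_large:.2e}")
```

The distribution is identical in both cases; only the power of the test changes with n, which is why eyeballing the data remains useful alongside the formal test.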
 