Test of hypothesis for independent samples

chwala · Dec 12, 2022

Reference:

https://www.statisticshowto.com/probability-and-statistics/t-test/

My question is: can we just as well subtract each ##x## score from each ##y## score instead? Thanks.
t-tests are, after all, easy to comprehend, as long as one knows the types, i.e.
1. Independent samples test (compares means between groups)
2. Paired samples test (compares means from the same group) &
3. One sample test

Then you are good to go. After that it is a matter of understanding the degrees of freedom (DoF) and the alpha level, and comparing the critical value they give with the calculated value, to decide any of the given hypothesis questions.
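The "calculated value versus critical value" step for the independent samples case can be sketched roughly as below. This is only a minimal sketch, assuming equal variances in the two groups (a pooled test). The scores are made up, `pooled_t` is a hypothetical helper name, and 2.306 is the usual two-tailed t-table entry for 8 degrees of freedom at alpha = 0.05.

```python
import math
import statistics

def pooled_t(a, b):
    """Pooled (equal-variance) two-sample t statistic and its degrees of freedom."""
    na, nb = len(a), len(b)
    dof = na + nb - 2
    # Pooled variance: weighted average of the two sample variances.
    sp2 = ((na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)) / dof
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se, dof

# Made-up scores for two independent groups (illustration only).
group_a = [5, 7, 6, 8, 9]
group_b = [3, 4, 5, 4, 4]

t, dof = pooled_t(group_a, group_b)
T_CRIT = 2.306  # two-tailed critical value from a t table: dof = 8, alpha = 0.05
reject_h0 = abs(t) > T_CRIT
print(f"t = {t:.3f}, dof = {dof}, reject H0: {reject_h0}")
```

With these numbers the calculated value exceeds the critical value, so the null hypothesis of equal means would be rejected at the 5% level.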

mjc123 · Dec 13, 2022

Yes, you can. See the note under step 8.
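One way to convince yourself: swapping the order of subtraction only flips the sign of the t statistic, so a two-tailed conclusion does not change. The following is a minimal sketch with made-up paired scores; `paired_t` is a hypothetical helper, not something taken from the linked page.

```python
import math
import statistics

def paired_t(first, second):
    """Paired t statistic computed on the differences first[i] - second[i]."""
    diffs = [f - s for f, s in zip(first, second)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return statistics.mean(diffs) / se, n - 1

# Made-up paired scores (illustration only).
x = [10, 12, 9, 14, 11]
y = [12, 15, 9, 17, 13]

t_xy, dof = paired_t(x, y)  # differences x - y
t_yx, _ = paired_t(y, x)    # differences y - x
print(f"x - y: t = {t_xy:.3f}; y - x: t = {t_yx:.3f}; dof = {dof}")
```

The two statistics have the same magnitude and opposite sign. For a one-tailed test the direction of the alternative hypothesis has to be swapped along with the subtraction.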

chwala · Dec 13, 2022

Why does the author indicate that if you do not have a specified alpha value then you should use ##5\%##? Is there any particular reason? Why not ##2\%## or ##10\%##, for that matter? I am referring to step ##7##.

WWGD · Dec 14, 2022

For the paired t-test, from:
https://www.statisticshowto.com/probability-and-statistics/t-test/ (my bold)

"When to Choose a Paired T Test / Paired Samples T Test / Dependent Samples T Test

Choose the paired t-test if you have two measurements on the same item, person or thing. But you should also choose this test if you have two items that are being measured with a unique condition. For example, you might be measuring car safety performance in vehicle research and testing and subject the cars to a series of crash tests. Although the manufacturers are different, you might be subjecting them to the same conditions.

With a “regular” two sample t test, you’re comparing the means for two different samples. For example, you might test two different groups of customer service associates on a business-related test or testing students from two universities on their English skills. But if you take a random sample each group separately and they have different conditions, your samples are independent and you should run an independent samples t test (also called between-samples and unpaired-samples). The null hypothesis for the independent samples t-test is μ1 = μ2. So it assumes the means are equal. With the paired t test, the null hypothesis is that the pairwise difference between the two tests is equal (H0: µd = 0)."

A point I think is interesting here is that this technique is used in classification schemes. Two objects are in the same class if the variability between them is within a limited range, and in different classes otherwise. As in: how and when do we declare that two dogs are of the same breed?

WWGD · Dec 14, 2022

chwala said:

Why does the author indicate that if you do not have a specified alpha value then you should use ##5\%##? Is there any particular reason? Why not ##2\%## or ##10\%##, for that matter? I am referring to step ##7##.

It's become something of a standard, for no specific reason I'm aware of. This has given rise to criticism on the basis that the number is arbitrary. There's been some discussion of including effect size in such tests, in part for this reason: if, say, the difference in outcome between two medicines is significant at some level but the effect size is minor, then you might not care as much. For example, if one medicine reduces the duration of a cold by 3 days (relative to no treatment) while the other reduces it by 4 days, then significance by itself is not of much value.

mfb · Dec 15, 2022

chwala said:

Why does the author indicate that if you do not have a specified alpha value then you should use ##5\%##? Is there any particular reason? Why not ##2\%## or ##10\%##, for that matter? I am referring to step ##7##.

It is a common convention in many fields. It's bad. The original idea sounds good: If you test one hypothesis in your study, then on average only 1 in 20 studies with no effect will falsely call something significant. In practice people rarely have just a single precisely defined hypothesis and don't correct their analyses properly for that. To make things worse many journals don't want to publish null results, giving scientists an even larger incentive to dig up something they can call significant, and also making it impossible to see how many studies are done in total. As a result, we get tons of "significant" results that are just random fluctuations. You can see it in distributions of p-values. The range *just* below 0.05 is more common than we should expect, nicely shown in this plot (z-values, so p=0.05 corresponds to z=1.96)
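To put rough numbers on the "1 in 20" argument, here is a minimal sketch. It assumes the tests are independent and that no real effect exists in any of them; `familywise_error` is a hypothetical helper name. It also checks the z-value quoted for p = 0.05 (two-sided, normal distribution).

```python
from statistics import NormalDist

def familywise_error(alpha, n_tests):
    """P(at least one false positive) across n_tests independent tests when no effect exists."""
    return 1 - (1 - alpha) ** n_tests

z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-sided z for p = 0.05
print(f"z for p = 0.05: {z_crit:.2f}")
for k in (1, 5, 20):
    print(f"{k:2d} tests: {familywise_error(0.05, k):.3f}")
```

With a single test the false-positive chance is the nominal 5%, but with 20 uncorrected tests the chance of at least one "significant" result is about 64%.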

Significance isn't the interesting property anyway. If your sample size is large enough you'll always find a significant effect for essentially everything - that doesn't mean it's relevant. If option 1 reduces some risk by 2% ± 0.5% (p<0.001 for having an effect) and option 2 reduces the risk by 40% ± 21% (p>0.05), which option do you prefer? The second one, of course - despite the smaller significance it's far more likely to help a lot, while the first option is a certain but minimal reduction.
More studies should report effect sizes instead of focusing on arbitrary "significance" points.
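The two options can be checked with a few lines. This is a sketch under stated assumptions: the ± values are read as one standard error, and a normal approximation is used for the p-values and the 95% intervals. `two_sided_p` is a hypothetical helper name.

```python
from statistics import NormalDist

def two_sided_p(estimate, std_error):
    """Two-sided p-value for 'estimate differs from zero', normal approximation."""
    z = abs(estimate) / std_error
    return 2 * (1 - NormalDist().cdf(z))

# (risk reduction in %, the +- value read as one standard error)
options = {"option 1": (2.0, 0.5), "option 2": (40.0, 21.0)}
for name, (est, se) in options.items():
    lo, hi = est - 1.96 * se, est + 1.96 * se
    print(f"{name}: p = {two_sided_p(est, se):.4f}, 95% CI = [{lo:.1f}, {hi:.1f}] %")
```

Under these assumptions option 1 is clearly below p = 0.001 and option 2 lands just above p = 0.05, yet the interval for option 2 covers far larger reductions than anything option 1 can offer.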

WWGD · Dec 15, 2022

mfb said:

It is a common convention in many fields. It's bad. The original idea sounds good: If you test one hypothesis in your study, then on average only 1 in 20 studies with no effect will falsely call something significant. In practice people rarely have just a single precisely defined hypothesis and don't correct their analyses properly for that. To make things worse many journals don't want to publish null results, giving scientists an even larger incentive to dig up something they can call significant, and also making it impossible to see how many studies are done in total. As a result, we get tons of "significant" results that are just random fluctuations. You can see it in distributions of p-values. The range *just* below 0.05 is more common than we should expect, nicely shown in this plot (z-values, so p=0.05 corresponds to z=1.96)

Significance isn't the interesting property anyway. If your sample size is large enough you'll always find a significant effect for essentially everything - that doesn't mean it's relevant. If option 1 reduces some risk by 2% ± 0.5% (p<0.001 for having an effect) and option 2 reduces the risk by 40% ± 21% (p>0.05), which option do you prefer? The second one, of course - despite the smaller significance it's far more likely to help a lot, while the first option is a certain but minimal reduction.
More studies should report effect sizes instead of focusing on arbitrary "significance" points.

mfb, is this the main issue behind the problem of replicability/reproducibility of results in the social sciences? Is it a problem throughout all sciences, rather than just the social sciences?

mfb · Dec 17, 2022

I think if people focused more on effect sizes and confidence intervals and were fine with publishing null results, we would reduce the problem and increase reproducibility a lot. A successful reproduction would then mean results that are consistent within the uncertainties.

That's another issue with a binary "significant"/"not significant" classification. If one study claims it's significant (odds ratio 1.35, p=0.02, 95% CI from 1.05 to 1.65) and a similar study says it's not (odds ratio 1.25, p=0.10, 95% CI from 0.95 to 1.55), do they disagree? Of course not - they are within one standard deviation of each other. But they seem to say very different things.
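The comparison can be made explicit by backing out the standard errors from the quoted intervals. A minimal sketch, assuming the 95% intervals are symmetric on the linear scale and normally distributed (in practice odds-ratio intervals are usually symmetric on the log scale); `se_from_ci` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96

def se_from_ci(lower, upper):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * Z95)

# (odds ratio, CI lower, CI upper) as quoted in the post
study_a = (1.35, 1.05, 1.65)
study_b = (1.25, 0.95, 1.55)

se_a = se_from_ci(*study_a[1:])
se_b = se_from_ci(*study_b[1:])
# Difference between the two estimates in units of its combined standard error.
z_diff = (study_a[0] - study_b[0]) / math.hypot(se_a, se_b)
print(f"SE_a = {se_a:.3f}, SE_b = {se_b:.3f}, difference = {z_diff:.2f} sigma")
```

Under these assumptions the two estimates differ by well under one standard deviation, even though one falls on the "significant" side of 0.05 and the other does not.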

There are fields that do it much better, and particle physics is among them. Most studies are repetitions (typically but not always with better precision), most results of searches are null results (which do get published without issue), and failing to reproduce previous measurements is very rare, even for measurements that find a non-zero value.
