Comparing means from SAME sample

  • Context: Undergrad
  • Thread starter: HPR
  • Tags: Means
SUMMARY

This discussion focuses on statistical significance in a preference study with a sample of 40 participants. The author asks how to determine whether one option (A) is preferred over the others (B and C) at a 98% confidence level, and how to compare two variables (X and Y) measured on the same sample. The reply suggests hypothesis tests for both the majority and the plurality question, using a binomial tail probability as the p-value and taking its maximum over the null hypothesis.

PREREQUISITES
  • Understanding of hypothesis testing, including null and alternative hypotheses.
  • Familiarity with binomial distributions and p-value calculations.
  • Knowledge of confidence intervals and their construction.
  • Experience with statistical software for data analysis, such as R or Python.
NEXT STEPS
  • Learn about hypothesis testing for proportions using binomial distributions.
  • Explore the concept of p-value maximization in statistical analysis.
  • Study how to construct and interpret confidence intervals for means.
  • Investigate pairwise comparison techniques for dependent samples.
USEFUL FOR

Researchers, statisticians, and data analysts involved in preference studies or comparative analysis of variables within the same sample.

HPR
I've done an experiment with human participants, n = 40, so it's a large sample (n >= 30) from a large population. Now I'm trying to understand the statistical significance of my results and verify that I am interpreting things properly. This type of study is NOT my area of expertise.

I'm basically looking for clarification on the following points, for somebody to read what I've done and give me a reason why it's OK or why it's not OK. I appreciate the help so much.

1)

First off, I've asked participants which of A, B, or C they "most prefer", and I've gotten results back along the lines of 68%, 20%, 12% respectively. I can then say pretty easily that "with a 98% confidence level, A will be the most preferred for the majority of the population", using z scores. This I am confident about. But can I say, with some even greater level of confidence, that A will be most preferred by more of the population than either B or C (but not necessarily by the majority of the population)? A plurality of the population will prefer A, basically. Is that possible, and how would I do that?

2)

Now let's say I've asked participants to rank preferences with numerical values, so I've had 40 people give me their "rankings" from 0-7 for X and Y. So I can compute an average and standard deviation for the X and Y values. And I can construct confidence intervals, where we can say with 95% confidence that the mean of X will be within some interval around the sample average of X, and the same idea with Y.

So, for instance:

X = 80 +/- 10
Y = 50 +/- 15

with 95% confidence level.

Then, because 50+15=65 and 80-10=70, can I say that I have 95% confidence that X>Y? I want to say this, but I think I can't.

When I google comparing means of samples, they talk about having two different populations, and they usually talk about sampling the same variable from two different populations. What I want to do is compare two different variables, but taken from the same sample. How do I do that?

My only idea is to subtract pairwise the value of every X,Y obtained from the participants. So if my data was:

Participant   1    2    3    4
X            10   11   12   14
Y            11   10   14   12
X-Y          -1    1   -2    2

I could then take the average and standard deviation of X-Y, and if the confidence interval I constructed for the mean of X-Y did not contain 0 (and lay entirely above it), then I could say X>Y with the corresponding degree of confidence. Is this the case?
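
The pairwise-difference idea described here can be sketched in Python as follows. This is only an illustration on the four-participant toy data from the post, not a definitive analysis. It assumes a t-based interval for the mean difference, and the critical value 3.182 (two-sided 95%, 3 degrees of freedom) is hardcoded from a standard t table; the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Toy data from the post: each participant rates both X and Y.
x = [10, 11, 12, 14]
y = [11, 10, 14, 12]

# Work with the per-participant differences, not the two separate samples.
d = [xi - yi for xi, yi in zip(x, y)]

n = len(d)
d_bar = mean(d)          # sample mean of the differences
s_d = stdev(d)           # sample standard deviation (n - 1 in the denominator)
se = s_d / sqrt(n)       # standard error of the mean difference

# Assumption: two-sided 95% critical value of Student's t with n - 1 = 3
# degrees of freedom, taken from a standard t table.
T_CRIT_95_DF3 = 3.182

ci = (d_bar - T_CRIT_95_DF3 * se, d_bar + T_CRIT_95_DF3 * se)
contains_zero = ci[0] <= 0 <= ci[1]

print(f"differences: {d}")
print(f"mean difference: {d_bar}, standard error: {se:.3f}")
print(f"95% CI for the mean difference: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"interval contains 0: {contains_zero}")
```

For the toy data the mean difference is 0, so the interval contains 0 and no difference between X and Y would be claimed. With the real n = 40, the critical value would have to be replaced by the one for 39 degrees of freedom.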



Thanks so, so much to anyone that can point me in the right direction here!
 
HPR said:
But can I say, with some even greater level of confidence, that A will be most preferred by more of the population than either B or C (but not necessarily the majority of the population)?

Good question. I'm not sure if this is how others would do it, but I'd approach the majority & plurality questions via p-value maximization.

If we write the majority question in terms of null/alternative hypotheses

H0: pA<=1/2
H1: pA>1/2

with p-value p = Prob(N >= 27), where N is Binomial(40, pA) and 27 of 40 is roughly the observed 68%, then under H0 the p-value is maximized at pA = 1/2, where it is about 0.02. So your reasoning so far seems OK.
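
As a rough check of the tail probability quoted above, here is a minimal Python sketch that computes Prob(N >= 27) exactly for N ~ Binomial(40, 1/2). The helper name `binom_upper_tail` is hypothetical and introduced only for this illustration.

```python
from math import comb

def binom_upper_tail(n, k, p):
    """Exact P(N >= k) for N ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

n = 40      # number of participants
k = 27      # observed count preferring A (about 68% of 40)
p_value = binom_upper_tail(n, k, 0.5)   # boundary of H0: pA = 1/2

print(f"P(N >= {k} | pA = 1/2) = {p_value:.4f}")
```

The result is a little under 0.02, which is consistent with the 98% confidence level mentioned in the original post.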

For the plurality question if we use the null/alternative hypotheses

H0: pA<=pB or pA<=pC
H1: pA>pB and pA>pC

with p = Prob(N >= 27) as before, this value is again maximized under H0 at pA = 1/2, with pB = 1/2 and pC = 0 (or pB = 0 and pC = 1/2). But this seems to imply that we are no more confident of the plurality than of the majority.
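
The maximization step can be illustrated numerically. Under the plurality null, pA <= pB (or pA <= pC) together with pA + pB + pC = 1 forces pA <= 1/2, and the count for A is still Binomial(40, pA) whatever pB and pC are. The sketch below, with the hypothetical helper `binom_upper_tail` and an arbitrary grid step of 0.01, scans pA over [0, 1/2] and reports where the tail probability peaks.

```python
from math import comb

def binom_upper_tail(n, k, p):
    """Exact P(N >= k) for N ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

n, k = 40, 27

# Values of pA allowed under the null: 0.00, 0.01, ..., 0.50.
grid = [i / 100 for i in range(0, 51)]
tails = [binom_upper_tail(n, k, p) for p in grid]

# The tail probability grows with pA, so the maximum sits on the boundary.
best_p, best_tail = max(zip(grid, tails), key=lambda pair: pair[1])

print(f"p-value is maximized at pA = {best_p}, where it equals {best_tail:.4f}")
```

Because the statistic only counts votes for A, the maximum lands on the same boundary point as in the majority test, which is why the two p-values coincide.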

Maybe someone can suggest a way to include the proportions that prefer B or C in the hypothesis tests?
 
