Is the use of Chi-square test appropriate for this experiment?

  • Context: Undergrad
  • Thread starter: PleaseHelpMe
  • Tags: Statistics
SUMMARY

The discussion centers on whether the Chi-square test is appropriate for analyzing data from an experiment on phonetic symbolism in Japanese words. The experiment involved 30 participants who each answered 40 two-choice questions, giving 705 correct answers out of 1200 possible. The initial Chi-square test indicated a significant result, but the author questioned its validity because the percentage of correct answers (58.75%) seemed low. A reply suggested a standard statistical test for a binomial proportion, computed in R, which showed that the probability of getting more than 705 correct answers by guessing is tiny: a result of just 629 correct would already be significant at the 5% level.

PREREQUISITES
  • Understanding of Chi-square tests and their applications
  • Familiarity with binomial distribution and statistical significance
  • Basic knowledge of R programming for statistical analysis
  • Awareness of the law of large numbers in statistics
NEXT STEPS
  • Learn how to perform binomial tests in R
  • Explore the Chi-square goodness-of-fit test and its assumptions
  • Investigate statistical significance thresholds and their implications
  • Utilize simulation techniques in Matlab or R to validate statistical results
USEFUL FOR

Statisticians, researchers conducting experiments, and students studying statistical analysis methods, particularly in the context of linguistic studies and data interpretation.

PleaseHelpMe
Anyone good with statistics??

I've conducted an experiment looking at phonetic symbolism in Japanese words.
Thirty participants listened to 40 Japanese words and, for each word, had to choose which of two possible English translations they thought was correct.

I'm trying to find out whether or not the participants could give the correct answer at an above-chance expectancy level.

Since they didn't know Japanese, we can assume that they had a 50% chance of giving the correct answer (since there were two possible answers for each word: a correct one and an incorrect one).

There were 705 correct answers out of 1200 possible correct answers (30 x 40).
This is more than half, but is it significantly more than half?


I wasn't sure of the correct statistical test to use, but I used a Chi-square test and followed the instructions on this page: visualstatistics.net/SPSS%20workbook/chi-square_goodness_of_fit.htm

Obviously they use a different example but I'm pretty sure it's the same thing: comparing observed frequencies with expected frequencies.


The results suggested that it was massively significant. However, even if they only got 614 correct and 586 incorrect, it still reports a significant difference (Chi square = .653, Asymp. Sig. = .419).


This doesn't seem right to me, it seems very low.
If I tossed a coin 1200 times and it landed heads 614 times, then that means the coin is weighted??

705/1200 is only 58.75%, and that's significant?

I'm surely doing something wrong..
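
The chi-square figures quoted above can be checked by hand. The following is a minimal sketch in Python (not the SPSS procedure from the linked page), assuming a two-category goodness-of-fit test with expected counts of 600 and 600; the function name `chi_square_gof` is made up for illustration. For 614 versus 586 it reproduces the quoted Chi square = .653 and Asymp. Sig. = .419, and a p-value of .419 is well above .05, so by the usual convention that result would be read as not significant. For 705 versus 495 the statistic is far larger and the p-value is very small.

```python
import math

def chi_square_gof(observed, expected):
    """Return (statistic, p_value) for a two-category goodness-of-fit test (1 df)."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom the upper tail is P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    # This shortcut is only valid for exactly two categories.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical coin-like case from the question: 614 correct, 586 incorrect.
stat_614, p_614 = chi_square_gof([614, 586], [600, 600])
# Actual experiment: 705 correct, 495 incorrect.
stat_705, p_705 = chi_square_gof([705, 495], [600, 600])

print(f"614/586: chi-square = {stat_614:.3f}, p = {p_614:.3f}")
print(f"705/495: chi-square = {stat_705:.2f}, p = {p_705:.2e}")
```

In SPSS output the "Asymp. Sig." column is the p-value, so it is that column, not the Chi square value itself, that gets compared with .05.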
 
Perform a standard statistical test for a binomial proportion.
 
I'm not sure whether this is implemented in any of the standard statistical packages, but I wrote some code in R to calculate the probability of getting any given number right (out of 1200, using the binomial distribution), then had it add up the probability of getting more than 705. It was tiny. In fact, the probability of getting more than 641 is less than 0.01 (significant at the 1% level). A result of just 629 would be significant at the all-important 5% level.
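
The R code itself is not shown in the thread, so here is a rough Python sketch of the same idea, assuming a one-sided test and a guessing probability of exactly 0.5. The helper names `upper_tail` and `smallest_significant` are made up for illustration. It sums exact binomial probabilities with integer arithmetic, so no normal approximation is involved.

```python
from math import comb

def upper_tail(n, k):
    """Exact P(X >= k) for X ~ Binomial(n, 0.5), using integer arithmetic."""
    favourable = sum(comb(n, i) for i in range(k, n + 1))
    return favourable / 2 ** n

def smallest_significant(n, alpha):
    """Smallest count k (from n/2 upward) whose one-sided tail P(X >= k) is below alpha."""
    return next(k for k in range(n // 2, n + 1) if upper_tail(n, k) < alpha)

N = 1200  # 30 participants x 40 words

p_more_than_705 = upper_tail(N, 706)  # "more than 705" means 706 or more
p_more_than_641 = upper_tail(N, 642)
k_5pct = smallest_significant(N, 0.05)
k_1pct = smallest_significant(N, 0.01)

print(f"P(more than 705 correct) = {p_more_than_705:.3e}")
print(f"P(more than 641 correct) = {p_more_than_641:.4f}")
print(f"Smallest count significant at 5%: {k_5pct}")
print(f"Smallest count significant at 1%: {k_1pct}")
```

Note that this treats all 1200 answers as independent guesses, which is the same assumption the question makes.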

When the number of samples is very large, the proportion of coin tosses that are heads tends to 50%, and the probability of ending up a long way off becomes very small (the law of large numbers). The percentage required to be significant becomes only a little bit more than 50. As an aside, the proportion of human babies that are male, about 51%, is accepted as statistically different from 50% due to the huge amount of data being considered.
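
To see how the required percentage shrinks towards 50 as the sample grows, here is a small sketch that uses the normal approximation to the binomial (an assumption; it is not the exact calculation) with a one-sided 5% level. The function name `critical_proportion` and the sample sizes are only illustrative.

```python
from math import sqrt
from statistics import NormalDist

def critical_proportion(n, alpha=0.05):
    """Approximate proportion of correct answers needed for one-sided significance.

    Uses the normal approximation: under guessing, the proportion has mean 0.5
    and standard deviation 0.5 / sqrt(n).
    """
    z = NormalDist().inv_cdf(1 - alpha)
    return 0.5 + z * 0.5 / sqrt(n)

for n in (100, 1200, 100_000, 10_000_000):
    print(f"n = {n:>10,}: need about {100 * critical_proportion(n):.2f}% correct")
```

The gap above 50% falls off like one over the square root of the sample size, which is why a difference as small as 51% versus 50% can be detected with enough data.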

If you have Matlab or similar, you could run a simulation to convince yourself.
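
For anyone without Matlab, a simulation along these lines can be sketched in Python using only the standard library. The number of runs, the seed and the function name `simulate_heads` are arbitrary choices for illustration. Each run tosses a fair coin 1200 times, and the script reports how often pure chance reaches 614, 629 and 705 heads.

```python
import random

def simulate_heads(n_tosses, n_runs, seed=0):
    """Return the number of heads in each of n_runs experiments of n_tosses fair tosses."""
    rng = random.Random(seed)
    # Each run draws n_tosses random bits; counting the 1-bits gives the number of heads.
    return [bin(rng.getrandbits(n_tosses)).count("1") for _ in range(n_runs)]

heads = simulate_heads(n_tosses=1200, n_runs=20_000)

frac_614 = sum(h >= 614 for h in heads) / len(heads)
frac_629 = sum(h >= 629 for h in heads) / len(heads)
frac_705 = sum(h >= 705 for h in heads) / len(heads)

print(f"Runs with at least 614 heads: {100 * frac_614:.1f}%")
print(f"Runs with at least 629 heads: {100 * frac_629:.1f}%")
print(f"Runs with at least 705 heads: {100 * frac_705:.3f}%")
```

A fair coin reaches 614 heads fairly often, so 614 is unremarkable, whereas 705 should essentially never turn up in a simulation of this size.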

There is an approximation to the binomial test here:

http://www.dimensionresearch.com/resources/calculators/normal.html

If you can use R or S-plus I could give you the code I wrote if you want.
 
