Is the observed result significantly different from the expected result?

  • Context: Undergrad
  • Thread starter: PleaseHelpMe
  • Tags: Statistics

Discussion Overview

The discussion revolves around an experiment investigating phonetic symbolism in Japanese words, specifically whether participants could identify correct English translations at a level significantly above chance. The focus includes statistical analysis methods and interpretations of results.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant describes an experiment where 30 participants identified English translations of Japanese words, yielding 705 correct answers out of 1200 attempts, questioning if this is significantly above chance.
  • Another participant suggests that the 705 correct responses represent a 17.5% improvement over the expected average of 600 correct answers, based on a 50% chance assumption.
  • A different participant notes the low probability of achieving such results by random chance, indicating that the likelihood of getting above 700 correct answers in 1200 flips is very low.
  • Concerns are raised about the implications of the results, questioning the source of any bias that might have influenced participants' answers, such as prior knowledge or psychological trends in language.
  • One participant suggests that conducting the experiment multiple times may provide more reliable conclusions and questions if any Japanese words sounded similar to their English translations.
  • A later reply discusses the statistical test used, indicating that the results are significantly above chance, with the actual number of correct responses being about 6 standard deviations above the expected mean under the null hypothesis.

Areas of Agreement / Disagreement

Participants express uncertainty regarding the statistical significance of the results and the implications of the findings. There is no consensus on the conclusions that can be drawn from the experiment, particularly concerning potential biases and the appropriateness of the statistical methods used.

Contextual Notes

Participants highlight limitations in the analysis, including the need for more trials to reach firm conclusions and the potential influence of participant biases on the results.

PleaseHelpMe
Anyone good with statistics??

I've conducted an experiment looking at phonetic symbolism in Japanese words.
Thirty participants listened to 40 Japanese words and, for each word, had to choose which of two possible English translations they thought was correct.

I'm trying to find out whether or not the participants could give the correct answer at an above-chance expectancy level.

Because they didn't know Japanese, we can assume they had a 50% chance of giving the correct answer (there were two possible answers for each word: a correct and an incorrect one).

There were 705 correct answers out of 1200 possible correct answers (30 x 40).
This is more than half, but is it significantly more than half?


I wasn't sure of the correct statistical test to use, but I used a Chi-square test and followed the instructions on this page: visualstatistics.net/SPSS%20workbook/chi-square_goodness_of_fit.htm

Obviously they use a different example but I'm pretty sure it's the same thing: comparing observed frequencies with expected frequencies.


The results suggested that it was massively significant. However, I also tried a smaller hypothetical split of 614 correct and 586 incorrect, and that one comes out as not significant (Chi-square = .653, Asymp. Sig. = .419).


This doesn't seem right to me. If I tossed a coin 1200 times and it landed heads 705 times, would that really mean the coin is weighted?

705/1200 is only 58.75%, and that's significant?

I'm surely doing something wrong..
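(Editor's note, not part of the original post.) The SPSS chi-square goodness-of-fit numbers quoted above can be checked with a short stdlib-only Python sketch. With two categories there is 1 degree of freedom, and the chi-square tail probability for 1 df is erfc(sqrt(stat/2)):

```python
from math import erfc, sqrt

def chi_square_gof(observed: list[int], expected: list[float]) -> tuple[float, float]:
    """Chi-square goodness-of-fit statistic and p-value for 1 degree of freedom."""
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, a chi-square variable is the square of a standard normal,
    # so P(X > stat) = P(|Z| > sqrt(stat)) = erfc(sqrt(stat / 2)).
    p_value = erfc(sqrt(stat / 2))
    return stat, p_value

print(chi_square_gof([614, 586], [600, 600]))  # statistic ~ 0.653, p ~ 0.419 (not significant)
print(chi_square_gof([705, 495], [600, 600]))  # large statistic, tiny p-value
```

This reproduces the .653 / .419 output for the 614-586 split, and shows the actual 705-495 split is far out in the tail.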
 
I'm sorry it looks like I've posted this in the wrong forum.
 
I am sure you will get a good response here as well. The way I look at it: assuming a 50/50 chance, on average your test would yield 600 correct responses. You got 705, which is 17.5% better than average (that is, (705 - 600)/600 = 0.175).

If you had unlimited time I would suggest you try flipping a coin 1200 times and see if you could get 58.75% heads or tails...

I ran a simple simulation of 1200 random coin flips. After 81 experiments I finally had one with 701 tails and 499 heads.
 
I am sure someone will respond with an exact number (for 1200 flips), but with 120 coin flips, the probability of getting more than 70 of either heads or tails is only 5.477%.

With 1200 flips, the probability of getting more than 700 of either heads or tails is far smaller.
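(Editor's note, not part of the original post.) Rather than simulating, the tail probabilities mentioned above can be computed exactly with Python's big integers and `math.comb`:

```python
from math import comb

def two_sided_tail(n: int, k: int) -> float:
    """P(more than k heads or more than k tails) in n fair coin flips.

    Exact binomial computation; assumes k >= n/2 so the two tails
    don't overlap and the distribution's symmetry lets us double one tail.
    """
    upper = sum(comb(n, j) for j in range(k + 1, n + 1))  # count of outcomes with X > k
    return 2 * upper / 2**n

print(two_sided_tail(120, 70))    # about 0.055, matching the ~5.5% quoted above
print(two_sided_tail(1200, 700))  # astronomically smaller
```

The second call confirms the point of the post: more than 700 of either outcome in 1200 fair flips is essentially never seen by chance.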
 
Unfortunately, even if you figure out the correct way to analyze this to determine whether or not it's statistically significant (it probably is), there is still a problem. What conclusions can you really draw from this?

So you've determined that the people were able to do better than the 50% you'd expect if they really had no indication as to which English word was correct. However, where did this indication come from? Was it that they did in fact know some Japanese? Is there a psychological trend for languages to use similar sounds for similar words? Or was it something about the way you picked the words that the people were quizzed on? (For example, did you take into account the number of cognates and pseudo-cognates in your sample? Did both of the choices for the definition always seem viable?)

---

As an example, let's say that I give you this word and the following choices:
kasa
1: home
2: umbrella

If I were to quiz a completely random group of Spanish-speaking people, I'd expect that the majority would answer "home" (since "casa" means "home" in Spanish). However, if I were to quiz a group of people who happen to know that there are very few similarities between Romance languages and Japanese (and who know at least a little Spanish), I'd expect them to be more likely to answer "umbrella", just because they'd think the answer would probably not be "home".

The correct answer is of course umbrella.

The problem is that a "random" group of people is not necessarily unbiased. What you have determined is that your group of people is probably slightly biased, but you still don't know where this bias comes from.
 
Have you only done the experiment once? If so, that isn't really enough to reach a firm conclusion; I would do it a few more times. Do any of the Japanese words sound remotely like their English translations?
 
PleaseHelpMe said:
I wasn't sure of the correct statistical test to use, but I used a Chi-square test and followed the instructions on this page: visualstatistics.net/SPSS%20workbook/chi-square_goodness_of_fit.htm

Obviously they use a different example but I'm pretty sure it's the same thing: comparing observed frequencies with expected frequencies.


The results suggested that it was massively significant. However, I also tried a smaller hypothetical split of 614 correct and 586 incorrect, and that one comes out as not significant (Chi-square = .653, Asymp. Sig. = .419).


This doesn't seem right to me. If I tossed a coin 1200 times and it landed heads 705 times, would that really mean the coin is weighted?

705/1200 is only 58.75%, and that's significant?

I'm surely doing something wrong..

The test is binomial (two outcomes). For large n the binomial distribution is very well approximated by a normal distribution with mean np and variance np(1-p).

Taking the null hypothesis that the results are guesses (50/50), p = 0.5, so the mean is 600 and the variance is 300 (stdev ≈ 17.3).

So yes, the result is massively significant, with the actual number of correct responses about 6 standard deviations above what is expected by chance (Z ≈ 6).
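(Editor's note, not part of the original post.) The normal-approximation calculation above can be written out in a few lines of stdlib Python, using erfc for the normal tail:

```python
from math import erfc, sqrt

n, p = 1200, 0.5              # trials and success probability under the null hypothesis
observed = 705

mean = n * p                  # 600
variance = n * p * (1 - p)    # 300
stdev = sqrt(variance)        # ~17.32

z = (observed - mean) / stdev
# One-sided p-value under the normal approximation:
# P(Z > z) = 0.5 * erfc(z / sqrt(2))
p_value = 0.5 * erfc(z / sqrt(2))

print(f"Z = {z:.2f}")         # Z is about 6.06
```

A Z of about 6 puts the observed 705 far outside anything plausible under pure guessing.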
 
