
Chi square problem

  1. Jul 23, 2007 #1
    Hi all, I have been asked to review a new image retrieval technique that has improved the retrieval performance compared with the old technique, i.e., the hypothesis is that retrieval performance is influenced by the retrieval technique. The following table summarises the retrieval results. (They examined the first 300 retrieved images for each technique: 250 were relevant to the query for the new technique and 175 were relevant for the old technique.)

                        Type of technique
    Image relevant?   New technique   Old technique   Total
    Yes               250             175             425
    No                50              125             175
    Total             300             300             600

    I was using the distribution table at http://davidmlane.com/hyperstat/chi_square_table.html but I am having trouble calculating the chi-square value, assuming the required p value is 0.01, and interpreting the result. Would someone be able to do this and explain how it is done, or point to a decent tutorial? I only need this done, not to understand it, at least not past what is required.

    Many thanks,
  3. Jul 23, 2007 #2


    Science Advisor
    Homework Helper

    Last edited: Jul 23, 2007
  4. Jul 24, 2007 #3
    I don't understand that at all...
  5. Jul 24, 2007 #4


    Science Advisor
    Homework Helper

    You are evaluating the "null" hypothesis "technique does not matter" against the alternative hypothesis "technique makes a difference." To evaluate it, you need to calculate a test statistic, then compare it against the chi-square table, where degrees of freedom = 1 and level of significance = 0.01. Unless I made a mistake, the corresponding value on the chi-square table is 6.6349.

    If the result of your calculation:

    (actual - predicted)^2/predicted, summed over the two categories

    turns out to be > 6.635, then the test is telling you that there is a significant difference between the actual "scores" and the predicted "scores" at the 0.01 level of significance. You should reject the hypothesis that "the technique does not make a difference" if your calculation is > 6.635.

    In your case:

                  Category 1   Category 2
    actual        250          175
    predicted     425/2        425/2

    (I have assumed that only the first row of your table is relevant for this calculation.)

    Do you see the similarity between the test statistic in your case and the example test statistic (comparing men and women) on the Wikipedia page?
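
    The calculation described in this post can be sketched in a few lines of Python. This is only an illustration under the post's own assumption that the first row of the table (250 vs. 175 relevant images) is compared against an even split of 425/2 each; the function names are made up for this example, and the p-value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math


def chi_square_gof(observed, expected):
    """Pearson statistic: sum of (actual - predicted)^2 / predicted."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(stat):
    """Upper-tail probability for a chi-square variable with 1 degree of freedom."""
    # If Z is standard normal, Z^2 is chi-square(1), so P(Z^2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2.0))


# Assumption from the post: only the "Yes" row is used, and under the null
# hypothesis the 425 relevant images are split evenly between the techniques.
observed = [250, 175]
total = sum(observed)
expected = [total / 2, total / 2]

CRITICAL_0_01 = 6.6349  # table value quoted in the post (df = 1, alpha = 0.01)

stat = chi_square_gof(observed, expected)
p = p_value_df1(stat)
reject_null = stat > CRITICAL_0_01

print(f"chi-square = {stat:.3f}, p = {p:.5f}, reject H0: {reject_null}")
```

    With these numbers the statistic is well above 6.635, so under the stated assumptions the hypothesis "technique does not matter" would be rejected at the 0.01 level.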
    Last edited: Jul 24, 2007
  6. Jul 25, 2007 #5
    But you do not know the exact frequency of the old technique. So I would suggest testing:
    H0: there is no difference between the two techniques
    As you observe a large number of events, you can assume a Gaussian distribution and use a Pearson chi2 test such as:

    chi2 = (250 - 175)^2/(250 + 175) = 13.2
  7. Jul 25, 2007 #6


    Science Advisor
    Homework Helper

    (a - b)^2/(a + b) = [(a - y)^2 + (b - y)^2]/y where y = (a + b)/2, therefore Barmecides's formula will produce the same result as the one I posted (which is identical to the formula on the Wikipedia page).
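
    The identity above is easy to check numerically. The snippet below is a small sketch with hypothetical function names; it evaluates both forms for the counts in this thread and for a few arbitrary pairs.

```python
import math


def short_form(a, b):
    """Statistic written as (a - b)^2 / (a + b)."""
    return (a - b) ** 2 / (a + b)


def long_form(a, b):
    """Statistic written as [(a - y)^2 + (b - y)^2] / y with y = (a + b) / 2."""
    y = (a + b) / 2
    return ((a - y) ** 2 + (b - y) ** 2) / y


# Counts from the thread, plus a few arbitrary pairs as a sanity check.
pairs = [(250, 175), (10, 3), (1, 99), (42, 42)]
all_match = all(math.isclose(short_form(a, b), long_form(a, b)) for a, b in pairs)

print(short_form(250, 175), long_form(250, 175), all_match)
```

    Algebraically, (a - y) = (a - b)/2 and (b - y) = -(a - b)/2, so the numerator of the long form is (a - b)^2/2, and dividing by y = (a + b)/2 gives the short form.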
    Last edited: Jul 25, 2007