Reviewing Image Retrieval Performance Using Chi-Square

  • Context: Undergrad
  • Thread starter: chisqaw
  • Tags: Chi Chi square Square

Discussion Overview

The discussion revolves around the evaluation of a new image retrieval technique's performance compared to an old technique, specifically using the Chi-Square test to analyze the relevance of retrieved images. Participants explore the calculation of the Chi-Square value and its interpretation within the context of hypothesis testing.

Discussion Character

  • Technical explanation
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • Lawrence presents a table summarizing retrieval results and seeks assistance in calculating the Chi-Square value and interpreting it.
  • One participant references a Wikipedia page to provide a threshold for rejecting the null hypothesis, indicating a Chi-Square value greater than 6.6349 is necessary.
  • Another participant explains the process of evaluating the null hypothesis that the technique does not matter, detailing how to calculate the test statistic and its comparison to the Chi-Square table.
  • There is a suggestion that the calculation should focus on the first row of the table, with a proposed method for determining the predicted values.
  • A later reply introduces an alternative calculation method, suggesting the use of a Gaussian distribution assumption due to the large number of events, giving a Chi-Square value of 13.2.
  • One participant discusses a formula that produces results consistent with those previously mentioned, indicating a shared understanding of the mathematical approach.

Areas of Agreement / Disagreement

Participants express differing views on the calculation methods and interpretations of the Chi-Square test, with no consensus reached on the correct approach or final value.

Contextual Notes

There are limitations regarding the exact frequency of the old technique, which may affect the calculations. Additionally, assumptions about distributions and the relevance of specific categories in the calculations are not fully resolved.

chisqaw
Hi all, I have been asked to review a new image retrieval technique that is claimed to improve retrieval performance compared with the old technique, i.e., the hypothesis is that retrieval performance is influenced by the retrieval technique. The following table summarises the retrieval results. (They examined the first 300 retrieved images; 250 are relevant to the query for the new technique and 175 are relevant for the old technique.)


Image relevant?   New technique   Old technique   Total
Yes               250             175             425
No                50              125             175
Total             300             300             600


I was using the chi-square table at http://davidmlane.com/hyperstat/chi_square_table.html but I am having trouble calculating the Chi Square value, assuming the required p value is 0.01, and interpreting the result. Would someone be able to do this and explain how it was done, or point to a decent tutorial? I only need this done, not to understand it, at least not past what is required.

Many thanks,
Lawrence
 
I don't understand that at all...
 
You are evaluating the "null" hypothesis "technique does not matter" against the alternative hypothesis "technique makes a difference." To evaluate it, you need to calculate a test statistic, then compare it against the chi-square table, where degrees of freedom = 1 and level of significance = 0.01. Unless I made a mistake, the corresponding value on the chi-square table is 6.6349.

If the result of your calculation:

(actual - predicted)^2/predicted, summed over the two categories

turns out to be > 6.635, then the test is telling you that there is a significant difference between the actual "scores" and the predicted "scores" at the 0.01 level of significance. You should reject the hypothesis that "the technique does not make a difference" if your calculation is > 6.635.

In your case:

             Category 1   Category 2
actual       250          175
predicted    425/2        425/2

(I have assumed that only the first row of your table is relevant for this calculation.)

Do you see the similarity between the test-stat. in your case and the example test stat. (comparing men and women) on the Wikipedia page?
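
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation described above. It assumes, as in the post, that only the "Yes" row is used and that the null hypothesis predicts an equal split of the 425 relevant images. The function name `chi_square_stat` is just an illustrative choice, and the critical value is recovered from the standard normal quantile, a shortcut that works only because there is 1 degree of freedom.

```python
from statistics import NormalDist

def chi_square_stat(actual, predicted):
    """Sum of (actual - predicted)^2 / predicted over all categories."""
    return sum((a - p) ** 2 / p for a, p in zip(actual, predicted))

actual = [250, 175]                  # relevant images: new technique, old technique
predicted = [sum(actual) / 2] * 2    # equal split under the null: 425/2 each

stat = chi_square_stat(actual, predicted)

# With 1 degree of freedom, a chi-square variable is the square of a standard
# normal one, so the 0.01 critical value is the squared 0.995 normal quantile.
critical = NormalDist().inv_cdf(0.995) ** 2

print(f"test statistic = {stat:.3f}")
print(f"critical value = {critical:.4f}")
print("reject null" if stat > critical else "do not reject null")
```

The statistic comes to about 13.24, which exceeds the 6.635 threshold, so under these assumptions the null hypothesis would be rejected at the 0.01 level.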
 
EnumaElish said:
(I have assumed that only the first row of your table is relevant for this calculation.)

But you do not know the exact frequency of the old technique. So I would suggest testing:
H0: there is no difference between the two techniques
Since you observe a large number of events, you can assume a Gaussian distribution and use a Pearson chi2 test:

chi2 = (250 - 175)^2/(250 + 175) = 13.2
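
As a cross-check that also uses the "No" row, here is a minimal sketch of the usual Pearson test of independence on the full 2x2 table from the first post. This is not something computed in the thread; it is only an assumption about what a fuller analysis might look like. The helper name `chi_square_independence` is hypothetical, and no continuity correction is applied.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if relevance were independent of technique.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Rows: relevant yes / no; columns: new technique / old technique.
table = [[250, 175], [50, 125]]
stat = chi_square_independence(table)
dof = (len(table) - 1) * (len(table[0]) - 1)

print(f"chi2 = {stat:.2f}, degrees of freedom = {dof}")
```

On this table the statistic comes to about 45.4 with 1 degree of freedom, still well above 6.635, so the conclusion (reject H0 at the 0.01 level) does not change.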
 
(a - b)^2/(a + b) = [(a - y)^2 + (b - y)^2]/y where y = (a + b)/2, therefore Barmecides's formula will produce the same result as the one I posted (which is identical to the formula on the Wikipedia page).
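
For completeness, the identity can be spelled out step by step. This is only a sketch of the algebra, using the same symbols a, b and y as above:

```latex
\begin{align*}
y &= \frac{a+b}{2}, \qquad a - y = \frac{a-b}{2}, \qquad b - y = -\frac{a-b}{2} \\
\frac{(a-y)^2 + (b-y)^2}{y}
  &= \frac{2\left(\frac{a-b}{2}\right)^2}{\frac{a+b}{2}}
   = \frac{\frac{(a-b)^2}{2}}{\frac{a+b}{2}}
   = \frac{(a-b)^2}{a+b}
\end{align*}
```

With a = 250 and b = 175 both sides equal 5625/425, which is about 13.2.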
 
