Reviewing Image Retrieval Performance Using Chi-Square

chisqaw · Jul 23, 2007

Hi all, I have been asked to review a new image retrieval technique that is claimed to improve retrieval performance compared with the old technique, i.e., the hypothesis is that retrieval performance is influenced by the retrieval technique. The following table summarises the retrieval results. (They examined the first 300 retrieved images for each technique; 250 are relevant to the query for the new technique and 175 are relevant for the old technique.)

Image relevant?   New technique   Old technique   Total
Yes               250             175             425
No                50              125             175
Total             300             300             600

I was using the distribution table at http://davidmlane.com/hyperstat/chi_square_table.html but I am having trouble calculating the chi-square value, assuming the required p-value is 0.01, and interpreting the result. Would someone be able to do this and explain how it was done, or point to a decent tutorial? I only need this done, not to understand it, at least not past what is required.

Many thanks,
Lawrence

EnumaElish · Jul 23, 2007

http://en.wikipedia.org/wiki/Pearson's_chi-square_test#Example

Reject "no effect" hypothesis if computed chi-square > 6.6349.

chisqaw · Jul 24, 2007

I don't understand that at all...

EnumaElish · Jul 24, 2007

You are evaluating the "null" hypothesis "technique does not matter" against the alternative hypothesis "technique makes a difference." To evaluate it, you need to calculate a test statistic, then compare it against the chi-square table, with degrees of freedom = 1 and level of significance = 0.01. Unless I made a mistake, the corresponding value on the chi-square table is 6.6349.
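The table value quoted above can be sanity-checked without a statistics library. The following is only a sketch: it relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable, so its upper-tail probability is erfc(sqrt(x/2)). The function name chi2_sf_df1 is made up for this illustration and is valid for 1 degree of freedom only.

```python
import math

def chi2_sf_df1(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 1 degree of freedom."""
    # With 1 df, X is the square of a standard normal, so P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))

critical_value = 6.6349  # table value quoted in the thread
tail = chi2_sf_df1(critical_value)

# The tail probability should come out very close to the 0.01 significance level.
print(f"P(chi-square > {critical_value}) = {tail:.4f}")
```

If the printed tail probability is about 0.01, the threshold 6.6349 matches the 0.01 significance level for 1 degree of freedom.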

If the result of your calculation:

(actual - predicted)^2/predicted, summed over the two categories

turns out to be > 6.635, then the test is telling you that there is a significant difference between the actual "scores" and the predicted "scores" at the 0.01 level of significance. You should reject the hypothesis that "the technique does not make a difference" if your calculation is > 6.635.

In your case:

            Category 1    Category 2
actual      250           175
predicted   425/2         425/2

(I have assumed that only the first row of your table is relevant for this calculation.)

Do you see the similarity between the test statistic in your case and the example test statistic (comparing men and women) on the Wikipedia page?
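As a rough illustration of the calculation described in this post, the statistic can be computed in a few lines. This sketch uses only the first row of the table (250 and 175), as the post assumes, and the rounded threshold 6.635. The variable names are chosen for this example.

```python
# Observed counts of relevant images: new technique, old technique.
observed = [250, 175]

# Under "technique does not matter", the 425 relevant images are expected
# to split evenly between the two techniques.
expected = [sum(observed) / 2] * 2

# Pearson statistic: (actual - predicted)^2 / predicted, summed over the categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 6.635  # 1 degree of freedom, 0.01 significance level
reject_null = chi_square > critical_value

print(f"chi-square = {chi_square:.3f}, reject 'no effect' hypothesis: {reject_null}")
```

Each category contributes 37.5^2 / 212.5, so the statistic is about 13.24. That exceeds 6.635, which under these assumptions means rejecting "technique does not matter" at the 0.01 level.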

Barmecides · Jul 25, 2007

But you do not know the exact frequency of the old technique. So I would suggest testing:
H0: there is no difference between the two techniques
Since you observe a large number of events, you can assume a Gaussian distribution and use a Pearson chi2 test such as:

chi2 = (250 - 175)^2/(250 + 175) = 13.2

EnumaElish · Jul 25, 2007

(a - b)^2/(a + b) = [(a - y)^2 + (b - y)^2]/y where y = (a + b)/2; therefore Barmecides's formula will produce the same result as the one I posted (which is identical to the formula on the Wikipedia page).
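The identity follows because a - y = (a - b)/2 and b - y = -(a - b)/2, so the two squared terms sum to (a - b)^2/2, and dividing by y = (a + b)/2 gives (a - b)^2/(a + b). A small numerical check, offered only as a sketch with function names invented for this example, confirms it for the thread's counts and two other arbitrary pairs:

```python
import math

def shortcut(a, b):
    """Barmecides's form: (a - b)^2 / (a + b)."""
    return (a - b) ** 2 / (a + b)

def goodness_of_fit(a, b):
    """EnumaElish's form: sum of (actual - predicted)^2 / predicted with predicted = (a + b) / 2."""
    y = (a + b) / 2
    return ((a - y) ** 2 + (b - y) ** 2) / y

# The thread's counts plus two arbitrary pairs chosen for illustration.
for a, b in [(250, 175), (10, 3), (42, 42)]:
    assert math.isclose(shortcut(a, b), goodness_of_fit(a, b))

print(round(shortcut(250, 175), 2), round(goodness_of_fit(250, 175), 2))
```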
