Normalized correlation with a constant vector

Summary
The discussion centers on the confusion surrounding the interpretation of normalized correlation when one vector is constant, leading to an undefined correlation value. It highlights that if one vector has a standard deviation of zero, the correlation calculation results in a 0/0 form, which is mathematically undefined rather than infinite. This issue is particularly relevant in image processing, where a nearly constant patch could yield misleadingly high correlation with other patches. Suggestions are made to modify the correlation formula or use pixel differences to improve matching accuracy. The conversation emphasizes the need for careful consideration of statistical definitions and practical implications in real-world applications.
daviddoria
I am confused about how to interpret the result of performing a normalized correlation with a constant vector. Since you have to divide by the standard deviation of both vectors (reference: http://en.wikipedia.org/wiki/Cross-correlation#Normalized_cross-correlation), if one of them is constant (say a vector of all 5's, which has standard deviation=0), then the correlation is infinity, but in fact the correlation should be zero right? This isn't just a corner case: in general, if the standard deviation of one of the vectors is small, the correlation to any other vector is very high. Can anyone explain my misinterpretation?

Thanks,

David
 
daviddoria said:
if one of them is constant (say a vector of all 5's, which has standard deviation=0), then the correlation is infinity, but in fact the correlation should be zero right?
In that case, the expression for correlation takes the form 0/0, so you can't say it is infinity.
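
To make the 0/0 form concrete, here is a minimal Python sketch (assuming NumPy; the function name `normalized_correlation` is made up for illustration and is not from any library). It subtracts the means, so a constant vector gives a zero numerator as well as a zero denominator, and the function reports NaN rather than guessing at a value.

```python
import numpy as np

def normalized_correlation(x, y):
    """Zero-mean normalized correlation of two equal-length vectors.

    Returns NaN when either vector is constant, because the
    expression is then 0/0 and has no defined value.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()          # deviations from the mean
    dy = y - y.mean()
    denom = np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))
    if denom == 0.0:
        # At least one vector has zero variance; the numerator
        # sum(dx * dy) is zero too, so the ratio is 0/0.
        return float("nan")
    return float(np.sum(dx * dy) / denom)

constant = [5, 5, 5, 5]
ramp = [1, 2, 3, 4]
result = normalized_correlation(constant, ramp)  # NaN, not infinity
```

Returning NaN is only one possible convention. An application could equally decide to treat a constant vector as "no match" and return 0, but that is a choice made by the application, not something the formula itself provides.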

You raise an interesting question. It is important in practical applications of image processing. It's also a question about pure mathematics, but in that respect it's more of a nitpicking detail.

In pure mathematics, perhaps some statistics texts define a value for the correlation in this case, but unless a special definition is given, all you can say about the mathematical expression is that it is undefined.

(If anyone wishes to delve into this technicality, we should begin by making a distinction among three distinct topics: covariance of two random variables, sample covariance, and estimator(s) of covariance. Things that are properties of samples (e.g. their variance) have somewhat arbitrary definitions (e.g. do we compute variance by dividing by N or N-1? ) and different books define them differently. Things that are properties of random variables and estimators of parameters have standard definitions, but I don't know if they are standardized in dealing with all the exceptional situations.)

As a practical concern, I think you are worried that if you have image patch A and are trying to match it to other image patches in a photo, that it may have a large correlation to a nearly constant patch B, which it does not resemble. As far as I know, that might happen. Expressions that approach 0/0 can take large or small values depending on how they approach it.
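
A small numerical illustration of this, as a sketch with made-up 4x4 patches (assuming NumPy; `np.corrcoef` computes the zero-mean normalized correlation). The value itself always stays within [-1, 1], but for a nearly constant patch it is decided entirely by the tiny residual variation, so it can land at either extreme:

```python
import numpy as np

def corr(p, q):
    # Pearson correlation of two patches, flattened to vectors.
    return float(np.corrcoef(p.ravel(), q.ravel())[0, 1])

# Patch A: a clearly textured patch (a ramp of intensities 0..15).
patch_a = np.arange(16, dtype=float).reshape(4, 4)

# Patches B and C: visually flat (values within about 0.015 of 5),
# but their tiny variation follows A's pattern, with opposite signs.
patch_b = 5.0 + 1e-3 * patch_a
patch_c = 5.0 - 1e-3 * patch_a

r_ab = corr(patch_a, patch_b)   # close to +1
r_ac = corr(patch_a, patch_c)   # close to -1
```

Both B and C look like the same flat patch to the eye, yet one correlates almost perfectly with A and the other almost perfectly negatively. With real image noise in place of the scaled copy of A, the result would be similarly unpredictable.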

I'm sure your next question is whether there is some modification of the correlation formula that would produce a function that would avoid this problem. Off hand, I don't know of one. I'll have to think about it.
 
Stephen Tashi said:
As a practical concern, I think you are worried that if you have image patch A and are trying to match it to other image patches in a photo, that it may have a large correlation to a nearly constant patch B, which it does not resemble. As far as I know, that might happen.

Yes, that is exactly what I am observing.

I'm sure your next question is whether there is some modification of the correlation formula that would produce a function that would avoid this problem. Off hand, I don't know of one. I'll have to think about it.

You got it :)
 
For each interior pixel in a patch at location (i,j), you can compute the difference (in each of the RGB intensities) between it and the 8 adjacent pixels. You could treat these differences as the data to be matched and match it with the cross-correlation function. That way a uniform patch wouldn't match well with a patch that had more variation. (This can be regarded as a special case of my suggestion in the other thread about using transformations. In this case the transformations are displacement of the patch by 1 pixel.)
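
A rough sketch of that idea in Python (assuming NumPy; the names `neighbor_differences` and `difference_score` are invented for this example). One assumption to flag loudly: the score below is the raw, un-normalized cross-correlation of the difference data. Dividing by the standard deviations again would bring back the 0/0 problem for a perfectly uniform patch, whose differences are all zero.

```python
import numpy as np

def neighbor_differences(patch):
    """Differences between each interior pixel and its 8 neighbours.

    patch has shape (H, W) or (H, W, 3). The result has a leading
    axis of length 8, one slice per neighbour offset.
    """
    p = np.asarray(patch, dtype=float)
    h, w = p.shape[:2]
    center = p[1:h - 1, 1:w - 1]
    diffs = []
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if di == 0 and dj == 0:
                continue  # skip the pixel itself
            # Same interior window, displaced by (di, dj).
            shifted = p[1 + di:h - 1 + di, 1 + dj:w - 1 + dj]
            diffs.append(center - shifted)
    return np.stack(diffs)

def difference_score(patch_a, patch_b):
    # Raw cross-correlation (sum of products) of the difference data.
    # Assumption: no normalization, so a uniform patch scores 0.
    da = neighbor_differences(patch_a)
    db = neighbor_differences(patch_b)
    return float(np.sum(da * db))

flat = np.full((5, 5), 5.0)
textured = np.arange(25, dtype=float).reshape(5, 5)
flat_score = difference_score(flat, textured)       # 0.0
self_score = difference_score(textured, textured)   # positive
```

Because the raw score scales with contrast, scores for different candidate patches are only comparable if the patches have similar overall variation. That trade-off is the price of dropping the normalization.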

If you want to cut the computational costs, you could only use the differences from a sample of pixels in the patch. Perhaps it wouldn't even have to be a truly random sample. You might be able to pick a subset of pixels that were in a deterministic but "random looking" pattern.
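
One way to get a deterministic but "random looking" subset, sketched in Python (assuming NumPy; `sample_interior_pixels` is a hypothetical helper, and the fixed seed is an arbitrary choice): draw the pixel positions from a seeded generator, so every patch is sampled at the same positions.

```python
import numpy as np

def sample_interior_pixels(height, width, count, seed=0):
    """Pick `count` distinct interior pixel coordinates.

    The same (height, width, count, seed) always yields the same
    coordinates, so the pattern is repeatable across patches.
    `count` must not exceed the number of interior pixels.
    """
    rng = np.random.default_rng(seed)
    rows, cols = np.meshgrid(
        np.arange(1, height - 1),
        np.arange(1, width - 1),
        indexing="ij",
    )
    coords = np.column_stack([rows.ravel(), cols.ravel()])
    chosen = rng.choice(len(coords), size=count, replace=False)
    return coords[np.sort(chosen)]   # array of (row, col) pairs

pattern = sample_interior_pixels(8, 8, 10)
```

The differences would then be computed only at the sampled positions, using the same pattern for every patch being compared.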
 
