Normalized correlation with a constant vector

In summary, this thread asks how to interpret the normalized correlation of a vector with a constant vector, for which the formula gives the undefined form 0/0. The issue matters in image processing when matching a patch against other patches in a photo, because a nearly constant patch can produce a misleadingly large correlation. One suggested workaround is to compute the differences between each pixel in the patch and its adjacent pixels and to match those differences instead, optionally using only a sample of pixels to reduce computational cost.
  • #1
daviddoria
I am confused about how to interpret the result of performing a normalized correlation with a constant vector. Since you have to divide by the standard deviation of both vectors (reference: http://en.wikipedia.org/wiki/Cross-correlation#Normalized_cross-correlation ), if one of them is constant (say a vector of all 5's, which has standard deviation=0), then the correlation is infinity, but in fact the correlation should be zero right? This isn't just a corner case: in general, if the standard deviation of one of the vectors is small, the correlation to any other vector is very high. Can anyone explain my misinterpretation?

Thanks,

David
 
  • #2
daviddoria said:
if one of them is constant (say a vector of all 5's, which has standard deviation=0), then the correlation is infinity, but in fact the correlation should be zero right?
In that case, the expression for correlation takes the form 0/0, so you can't say it is infinity.
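
To make the 0/0 explicit, here is the sample correlation written out, assuming the mean-subtracted form given in the Wikipedia article linked in post #1. The symbols x, y, c and n are introduced here only for illustration.

```latex
r(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
               {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}

% If y_i = c for every i, then \bar{y} = c and y_i - \bar{y} = 0, so
r(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) \cdot 0}
               {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\cdot\sqrt{0}}
        = \frac{0}{0}
```

The numerator vanishes together with the denominator, whatever the value of c, so the formula alone assigns neither infinity nor zero.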

You raise an interesting question. It is important in practical applications of image processing. It's also a question about pure mathematics, but in that respect it's more of a nitpicking detail.

In pure mathematics, perhaps some statistics texts define a value for the correlation in this case, but unless a special definition is given, all you can say about the mathematical expression is that it is undefined.

(If anyone wishes to delve into this technicality, we should begin by making a distinction among three distinct topics: covariance of two random variables, sample covariance, and estimator(s) of covariance. Things that are properties of samples (e.g. their variance) have somewhat arbitrary definitions (e.g. do we compute variance by dividing by N or N-1? ) and different books define them differently. Things that are properties of random variables and estimators of parameters have standard definitions, but I don't know if they are standardized in dealing with all the exceptional situations.)

As a practical concern, I think you are worried that if you have image patch A and are trying to match it to other image patches in a photo, that it may have a large correlation to a nearly constant patch B, which it does not resemble. As far as I know, that might happen. Expressions that approach 0/0 can take large or small values depending on how they approach it.
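
A minimal sketch of this behavior, assuming NumPy is available. The helper name `ncc`, the sample values and the perturbation size `eps` are all illustrative choices, not taken from the thread. It builds two nearly constant vectors around the value 5 whose tiny fluctuations either follow or oppose the textured vector.

```python
import numpy as np

def ncc(a, b):
    """Normalized correlation (Pearson r) of two equal-length vectors.

    Hypothetical helper for illustration; returns NaN for the 0/0 case.
    """
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    da = a - a.mean()                      # mean-subtracted copies
    db = b - b.mean()
    denom = np.sqrt((da @ da) * (db @ db))
    if denom == 0.0:
        return float("nan")                # constant input: 0/0, undefined
    return float((da @ db) / denom)

# Hypothetical textured patch A, flattened to a vector.
patch_a = np.array([10.0, 40.0, 20.0, 80.0, 30.0, 60.0])

# Exactly constant patch: the expression is 0/0.
r_const = ncc(patch_a, np.full(6, 5.0))

# Nearly constant patches: 5 plus a fluctuation of relative size 1e-6.
eps = 1e-6
b_aligned = 5.0 + eps * (patch_a - patch_a.mean())
b_opposed = 5.0 - eps * (patch_a - patch_a.mean())

r_aligned = ncc(patch_a, b_aligned)   # close to +1
r_opposed = ncc(patch_a, b_opposed)   # close to -1

print(r_const, r_aligned, r_opposed)
```

Both nearly constant vectors look like a flat patch of 5's to the eye, yet one correlates almost perfectly with A and the other almost perfectly negatively, because normalization rescales the tiny fluctuations to full size.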

I'm sure your next question is whether there is some modification of the correlation formula that would produce a function that would avoid this problem. Off hand, I don't know of one. I'll have to think about it.
 
  • #3
Stephen Tashi said:
As a practical concern, I think you are worried that if you have image patch A and are trying to match it to other image patches in a photo, that it may have a large correlation to a nearly constant patch B, which it does not resemble. As far as I know, that might happen.

Yes, that is exactly what I am observing.

Stephen Tashi said:
I'm sure your next question is whether there is some modification of the correlation formula that would produce a function that would avoid this problem. Off hand, I don't know of one. I'll have to think about it.

You got it :)
 
  • #4
For each interior pixel in a patch at location (i,j), you can compute the difference (in each of the RGB intensities) between it and the 8 adjacent pixels. You could treat these differences as the data to be matched and match it with the cross-correlation function. That way a uniform patch wouldn't match well with a patch that had more variation. (This can be regarded as a special case of my suggestion in the other thread about using transformations. In this case the transformations are displacement of the patch by 1 pixel.)

If you want to cut the computational costs, you could only use the differences from a sample of pixels in the patch. Perhaps it wouldn't even have to be a truly random sample. You might be able to pick a subset of pixels that were in a deterministic but "random looking" pattern.
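
A rough sketch of the idea in this post, assuming NumPy. The function names, the patch size and the seeds are hypothetical. The post does not say whether the cross-correlation of the differences should be normalized; this sketch assumes the unnormalized form (a plain sum of products), since a uniform patch then scores exactly zero. The pixel subset comes from a fixed-seed permutation, which is one possible reading of a deterministic but "random looking" pattern.

```python
import numpy as np

# The 8 neighbor offsets (row, col) around an interior pixel.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1),
           (0, -1),           (0, 1),
           (1, -1),  (1, 0),  (1, 1)]

def neighbor_differences(patch):
    """Differences between each interior pixel and its 8 neighbors.

    patch: array of shape (H, W) or (H, W, C), e.g. C = 3 for RGB.
    Returns an array of shape (8, H-2, W-2) or (8, H-2, W-2, C).
    """
    p = np.asarray(patch, dtype=float)
    h, w = p.shape[:2]
    center = p[1:h - 1, 1:w - 1]
    diffs = [p[1 + dr:h - 1 + dr, 1 + dc:w - 1 + dc] - center
             for dr, dc in OFFSETS]
    return np.stack(diffs)

def difference_score(patch_a, patch_b, pixel_mask=None):
    """Unnormalized cross-correlation (sum of products) of the differences.

    pixel_mask: optional boolean array of shape (H-2, W-2) selecting
    which interior pixels contribute, to cut the computational cost.
    """
    fa = neighbor_differences(patch_a)
    fb = neighbor_differences(patch_b)
    if pixel_mask is not None:
        fa, fb = fa[:, pixel_mask], fb[:, pixel_mask]
    return float(np.sum(fa * fb))

# Hypothetical 7x7 RGB patches: one textured, one uniform.
rng = np.random.default_rng(0)
patch_a = rng.integers(0, 256, size=(7, 7, 3)).astype(float)
patch_flat = np.full((7, 7, 3), 5.0)

score_self = difference_score(patch_a, patch_a)     # large and positive
score_flat = difference_score(patch_a, patch_flat)  # exactly 0.0

# Fixed-seed choice of 6 of the 25 interior pixels.
idx = np.random.default_rng(42).permutation(25)[:6]
flat_mask = np.zeros(25, dtype=bool)
flat_mask[idx] = True
pixel_mask = flat_mask.reshape(5, 5)

score_self_sampled = difference_score(patch_a, patch_a, pixel_mask)
print(score_self, score_flat, score_self_sampled)
```

Because the score is unnormalized, it favors high-contrast patches. Whether that trade-off is acceptable depends on the images being matched.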
 
  • #5


The result of performing a normalized correlation with a constant vector can be confusing because, as you mentioned, the formula divides by the standard deviation of each vector, and a constant vector has a standard deviation of 0. However, this does not make the correlation infinite, and it does not make it zero either. Subtracting the mean from a constant vector leaves all zeros, whatever the constant is, so the numerator is 0 as well and the expression takes the form 0/0, which is undefined.

In general, when calculating correlation, it is important to consider the variability of the data. A constant vector has no variability, so its correlation with any other vector is undefined rather than high. A nearly constant vector is different: normalization rescales its small fluctuations to full size, so the correlation can come out large in magnitude even though the two vectors do not resemble each other. Such a value does not indicate a strong relationship between the two vectors.

Additionally, it may be helpful to consider other measures of correlation, such as Spearman's rank correlation or Kendall's tau. These measures rank the data rather than using the actual values, which makes them more robust to extreme values. They do not resolve the constant-vector case, though: a constant vector consists entirely of ties, so Spearman's coefficient is again 0/0, as is the tie-corrected form of Kendall's tau (tau-b).

Overall, a correlation value should be interpreted together with the variability of the vectors involved. When either standard deviation is zero the value is undefined, and when it is very small the value is unreliable, so the data and the context need to be considered before reading the result as a relationship between the variables.
 

What is normalized correlation with a constant vector?

Normalized correlation measures the strength and direction of the linear relationship between two vectors on a scale from -1 to 1. "With a constant vector" refers to the case where every entry of one of the vectors has the same value. It is not a separate method or a variation of the correlation coefficient: it is the ordinary formula applied to an input with zero standard deviation, for which the result is undefined.

How is normalized correlation with a constant vector calculated?

The calculation is the same as for the standard correlation coefficient: subtract each vector's own mean from its entries, sum the products of the mean-subtracted entries, and divide by the square root of the product of the two vectors' sums of squared mean-subtracted entries. For a constant vector, subtracting the mean leaves all zeros, so both the numerator and the denominator are zero and the result is the undefined form 0/0.

What does a high normalized correlation with a constant vector indicate?

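With an exactly constant vector the value is undefined, so a high value can only arise when the vector is nearly constant. In that case it indicates that the small fluctuations around the constant happen to line up linearly with the other vector, not that the two vectors resemble each other overall. Because normalization removes both the constant offset and the scale of the fluctuations, fluctuations of tiny amplitude can produce a correlation near 1 or -1.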

What are the limitations of using normalized correlation with a constant vector?

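Normalized correlation is undefined when either vector is constant and unreliable when either standard deviation is close to zero, because the expression approaches 0/0 and can take large or small values depending on how it approaches it. It also assumes a linear relationship between the variables, so it may not accurately capture non-linear relationships, and it may be affected by outliers.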

How is normalized correlation with a constant vector used in scientific research?

Normalized correlation in general is used to analyze the relationship between variables in fields such as economics, psychology, and biology, and in image processing to match a patch against other patches in a photo. The constant-vector case is not a technique in its own right. It is a degenerate input that such analyses need to detect and handle, for example by matching the differences between adjacent pixels as suggested in this thread.
