Stephen Tashi said:
It sounds like what you want is a method of matching patches that is based on some fundamental mathematical principle and that will work on matching any sort of patch to another patch representing the same physical phenomenon. I don't think there is any such mathematical principle. If there were, writing image recognition software would be no problem.
Yep, that's what I want :) Of course there isn't a perfect one, but I thought someone here might have an idea for something better than a simple Euclidean distance.
I don't think there is any way of avoiding studying the properties of the particular phenomena (grass, sides of houses, etc.) that you are trying to match.
For example, I'd think that matching techniques must depend on the image resolution. One way I can visualize an image of a patch of grass is as a high-resolution picture, sharp enough to show the individual stems of grass and the very dark shadows that surround them at their base. Another is as a low-resolution picture, where a patch of grass looks like blobs of color and each pixel is an average of the colors of the stems and the shadows at their base. It would surprise me if two high-resolution pictures of different patches of grass matched in a Euclidean metric, since one might have a dark shadow in a pixel where the other had a stem. It is plausible that two low-resolution images would match, since each pixel is an average of colors representing different things.
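To make that averaging argument concrete, here is a toy NumPy sketch. Everything in it is an assumption for illustration: the "grass" patches are synthetic, built from the same mix of dark-shadow and bright-stem intensities that simply land on different pixels, and block-averaging stands in for lowering the resolution.

```python
import numpy as np

def block_average(img, k):
    """Downsample by averaging non-overlapping k x k blocks of pixels."""
    h, w = img.shape
    return img[:h - h % k, :w - w % k].reshape(
        h // k, k, w // k, k).mean(axis=(1, 3))

# Two synthetic "grass" patches (an assumption for illustration): the same
# mix of dark-shadow (0.1) and bright-stem (0.6) intensities, with the
# stems and shadows landing on different pixels in each patch.
rng = np.random.default_rng(0)
a = rng.choice([0.1, 0.6], size=(32, 32))
b = rng.choice([0.1, 0.6], size=(32, 32))

def rms(x, y):
    """Root-mean-square pixel difference, so sizes compare fairly."""
    return np.sqrt(np.mean((x - y) ** 2))

print(rms(a, b))                                      # high resolution: large
print(rms(block_average(a, 8), block_average(b, 8)))  # averaged: much smaller
```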
Assume everything is quite low resolution. For instance, this picture of the entire house/grass/road is ~500x500.
Based on my own experience with images (most often grayscale ones), a physical object like a patch of grass can produce a wide range of pixel statistics depending on the resolution of the image, the direction of the sun when the picture was taken, whether the patch of grass is in the shade of a tree, etc. So I hope you are only trying to match patches of grass within the same picture, not patches between two pictures taken on different days, in different weather, etc.
Yes, only grass from the same image at the same time of day should match with itself (this is not a semantic labeling problem, just a straight patch matching problem).
Motivated only by my own interest in groups and transformations, I have the following thoughts. An image of a small portion of the side of a house could be turned upside down or sideways and it still might look like the side of a house. Is this true of patches of grass, at the resolution you are dealing with? Let's suppose it isn't.
Some patches (e.g. grass in the middle of the yard) should match to other patches (more grass from the middle of the yard) no matter which way they are turned - grass is grass - we can call these "uniform patches" (the full patch is the same texture). However, if you take a patch that straddles the grass and the sidewalk for example, it very much has an orientation (it should only match patches with the grass on the same side of both patches).
In general, suppose that I define a small set of different transformations of a patch of pixels, T1, T2, T3, ..., Tn. They could be simple, like "flip upside down" or "rotate 90 deg clockwise", or they could be more complicated algorithms.
For a given patch A, I compute the Euclidean (or other) distance between A and itself after it is transformed by each member of this set, and get a set of n "distances" (D1, D2, D3, ..., Dn). I think of this as a vector, or as a function f_A(n) that has n values. View the problem of comparing patch A to patch B as the problem of comparing their respective "self-match" functions, f_A and f_B. There are various ways to compare functions; I don't see any way to say a priori which ones might work. This process makes no direct comparison between the pixels of A and the pixels of B.
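Here is a minimal NumPy sketch of that self-match idea. The particular transform list and the final way of comparing signatures are assumptions for illustration, and patches are taken to be square and grayscale so that rotations preserve their shape.

```python
import numpy as np

def self_match_signature(patch, transforms):
    """Distances from a patch to its own transformed copies: the f_A vector."""
    return np.array([np.linalg.norm(patch - t(patch)) for t in transforms])

# A small, assumed set of simple transformations T1..Tn.
transforms = [
    np.flipud,                 # flip upside down
    np.fliplr,                 # flip left to right
    lambda p: np.rot90(p, 1),  # rotate 90 deg
    lambda p: np.rot90(p, 2),  # rotate 180 deg
]

def compare_patches(a, b):
    """Compare A and B through their self-match signatures only --
    no direct pixel-to-pixel comparison between the two patches."""
    return np.linalg.norm(self_match_signature(a, transforms)
                          - self_match_signature(b, transforms))
```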
I see what you're saying, but doesn't this still have pretty much the same problem? If you flip a patch vertically and compare it pixel-wise to its old self, won't that have just as much chance for "statistical accidents" as comparing two different patches pixel-wise?

It still seems to me that there should be some distance function that doesn't treat the whole set of pixels as a rigid/fixed set of values, but rather has some looser interpretation in which the pixels are random variables with some distribution. For example, a black pixel really shouldn't match a white pixel at all, but a green pixel should pretty much match a slightly different green-ish pixel. The Euclidean distance captures that last bit, of course, but it is very unforgiving. I could imagine something like "for each pixel, compute the difference as the min over the differences to each of the pixels in a 3x3 region around the corresponding pixel in the other image". That makes it spatially more forgiving, but it massively compounds the computational cost.
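A minimal sketch of that 3x3-neighborhood idea, assuming grayscale NumPy arrays. Stacking shifted copies of the second patch keeps the cost to (2r+1)^2 vectorized passes over the image rather than a per-pixel search:

```python
import numpy as np

def neighborhood_min_distance(a, b, r=1):
    """Spatially forgiving distance: for each pixel of a, take the smallest
    squared difference to any pixel in the (2r+1) x (2r+1) window around
    the corresponding location in b, then sum those minima."""
    h, w = a.shape
    bp = np.pad(b, r, mode='edge')
    # All (2r+1)^2 shifted copies of b, stacked along a new leading axis.
    shifts = np.stack([bp[dy:dy + h, dx:dx + w]
                       for dy in range(2 * r + 1)
                       for dx in range(2 * r + 1)])
    return np.min((shifts - a) ** 2, axis=0).sum()
```

Note that as written this is asymmetric (the minima are taken from a's perspective); averaging it with the b-to-a version would symmetrize it.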
Or maybe just a min(EuclideanDistance, Constant) type of function (a "truncated quadratic" sort of idea)?
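That one is cheap to sketch under the same assumptions (grayscale NumPy arrays; the cap c is a free parameter to tune):

```python
import numpy as np

def truncated_quadratic_distance(a, b, c):
    """Per-pixel squared differences, each capped at a constant c, so a
    single black-vs-white mismatch cannot dominate the overall score."""
    return np.minimum((a - b) ** 2, c).sum()
```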
Again, I'm just rambling, hoping that we can scrounge up some keywords that will get me thinking in a different direction :)
David