What is the concept of geodesics between images in computer vision AI?

Summary
In the discussion, the concept of high-dimensional manifolds in computer vision AI is explored, particularly regarding how images of objects like apples and oranges can be represented as points on such manifolds. The idea of learning a geodesic, or the shortest path between two points on a manifold, raises questions about whether images must be taken together to belong to the same manifold and how distances are measured in this context. The conversation highlights the challenges of using Euclidean distance in sparse data scenarios, suggesting that true distances may be misrepresented when projected onto lower dimensions. Examples, such as the Swiss roll manifold, illustrate how distances can appear misleadingly close in lower dimensions while being much greater in the manifold's higher-dimensional space. The analogy of stars in a constellation is used to explain how perceived distances can differ from actual distances when considering additional dimensions. The discussion also touches on the importance of local connectivity in data points and how this can help approximate distances along the manifold, particularly in dynamic scenarios like video sequences.
FallenApple
I was told by someone that, for computer vision AI, a photo of, say, an apple or an orange exists as a point on some high-dimensional manifold, and the goal is to learn a geodesic between the two objects.

What does this mean? Is the photo of one of the objects just a tuple of coordinates? Do the objects need to be in the same photo to lie on the same manifold, or can they be in different photos?

Do we connect the geodesic along patches of space? But most of the space is empty since the data is discrete and sparse.
 
This is a very fuzzy question. Can you provide a reference to look at, instead of a "someone told me" one?

My closest thought would be a neural net to discriminate between the two objects, which mathematically would be represented by a matrix in computer code. However, I can't see how a geodesic is involved in any way.

I found this reference but it's for geodesics on a topological map which makes a lot more sense.

https://www.mathworks.com/matlabcen...esic-distance-between-two-points-on-an-image?
 
I think the idea has to do with the distribution of the data. If the data lies on a manifold, then Euclidean distance cuts through the space the manifold is embedded in rather than staying on the manifold itself, which is problematic.

The topic is manifold learning.
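To make that point concrete, here is a toy sketch of my own (not from any particular library): sample points along a spiral, a 1-D manifold embedded in 2-D, and compare the straight-line distance between its endpoints with the distance measured along the curve itself.

```python
import numpy as np

# Sample points along a 2-D spiral (a 1-D manifold embedded in R^2).
t = np.linspace(3.0, 9.0, 500)          # intrinsic coordinate
curve = np.column_stack([t * np.cos(t), t * np.sin(t)])

a, b = curve[0], curve[-1]              # two points on the spiral

# Straight-line (Euclidean) distance through the ambient space.
euclidean = np.linalg.norm(a - b)

# Distance *along* the manifold: sum of small steps between samples.
steps = np.linalg.norm(np.diff(curve, axis=0), axis=1)
along_manifold = steps.sum()

print(f"Euclidean: {euclidean:.1f}, along the spiral: {along_manifold:.1f}")
```

The chord through the embedding space comes out several times shorter than the path along the spiral, which is exactly why Euclidean distance misrepresents similarity for data on a curved manifold.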

I think there's an example of distance on a Swiss roll manifold on page 9 of the attachment.

Also, here is an image. The Euclidean distance on the roll is misleading because, within the manifold, the distance is much greater. Projecting the points downward, the blue points seem close to the red, but in the non-projected higher-dimensional space of the manifold, the red geodesic is the correct path. I just don't know what those points mean; I was thinking that images that are dissimilar have longer geodesics between them.

SwissRoll.png


Also, the image is from this website.
http://www.indiana.edu/~dll/B657/B657_lec_isomap_lle.pdf
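The Isomap idea behind that figure can be sketched in a few lines (a toy version, assuming numpy and scipy are available): build a k-nearest-neighbour graph over samples of a Swiss roll, so that shortest paths in the graph follow the roll instead of cutting through the embedding space.

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
n = 800

# Sample a Swiss roll: a 2-D sheet rolled up in 3-D.
t = rng.uniform(1.5 * np.pi, 4.5 * np.pi, n)
h = rng.uniform(0.0, 10.0, n)
X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])

# k-nearest-neighbour graph: only short, local edges survive, so paths
# must follow the roll rather than jump across its layers.
D = cdist(X, X)
k = 10
graph = np.zeros_like(D)                 # zero entries = no edge
for idx in range(n):
    nbrs = np.argsort(D[idx])[1:k + 1]   # skip self at position 0
    graph[idx, nbrs] = D[idx, nbrs]
graph = np.maximum(graph, graph.T)       # symmetrise the edges

# Shortest paths in the graph approximate geodesics along the manifold.
geo = shortest_path(graph, method="D", directed=False)

i, j = int(np.argmin(t)), int(np.argmax(t))  # innermost vs outermost point
print(f"Euclidean: {D[i, j]:.1f}, geodesic estimate: {geo[i, j]:.1f}")
```

For the innermost and outermost points, the graph geodesic is much longer than the Euclidean chord, matching the red-vs-blue paths in the figure.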
 

You could compare it to the stars in a constellation. When you view the stars, any two may appear very close, especially if they are similar in brightness, because you are using an angular measurement to judge the distance. However, once you consider the radial distance in 3D space, one star may be extremely bright yet extremely far away.

As a counterexample, two stars may appear very distant from an angular point of view but still belong to the same cluster of stars, simply because the cluster is closer to us.

Here's an animation using the constellation of Orion:



The starting view is Orion as seen from the Earth and the animation then rotates the constellation around 90 degrees and now you can see the true distances involved.

In data mining, distances are used to group data points together, so adding one more dimension to your data might bring together two points that seemed very distant before. In the example you gave, you are moving from the manifold's way of measuring distance to a different mapping and a different metric, which brings two points a lot closer together.
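A minimal numeric sketch of the constellation point (two hypothetical stars with made-up coordinates): close in the 2-D sky projection, far apart once the radial coordinate is included.

```python
import numpy as np

# Two "stars" that look close on the sky (2-D projection)...
p = np.array([1.0, 1.0])
q = np.array([1.1, 0.9])
projected = np.linalg.norm(p - q)

# ...but adding the radial (third) coordinate separates them.
P = np.array([1.0, 1.0, 5.0])      # nearby star
Q = np.array([1.1, 0.9, 80.0])     # far-away star on almost the same line of sight
true_dist = np.linalg.norm(P - Q)

print(f"projected: {projected:.2f}, true 3-D distance: {true_dist:.1f}")
```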

The best analogy I can think of is comparing customers by income: one making $100K/yr and another making $50K/yr. By that measurement alone they seem very far apart.

However, if we now consider cost of living, the $100K person might live in NYC, where a lot of his/her income is needed to survive, whereas the $50K/yr person living in a smaller city may not have those expenses, at least not to that degree. Their incomes when adjusted might be $30K (from $100K) and $20K (from $50K), and now you see their true buying power is much closer.
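The adjustment in that analogy is just a subtraction; as a sketch, with the essential-cost figures assumed so the numbers work out:

```python
# Hypothetical cost-of-living adjustment: nominal incomes look far apart,
# adjusted incomes (buying power) end up much closer.
nominal = {"NYC": 100_000, "small city": 50_000}
essential_costs = {"NYC": 70_000, "small city": 30_000}  # assumed figures

adjusted = {city: nominal[city] - essential_costs[city] for city in nominal}
print(adjusted)  # {'NYC': 30000, 'small city': 20000}
```

The nominal gap of $50K shrinks to an adjusted gap of $10K, which is the "different metric" at work.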
 
I think the idea is that if you have a set of images that are connected locally (like the sequence of images of the golfer, where you know the order of the images but don't necessarily know how many frames lie between two images in general) and you can extract some parameters from them (e.g. the angle of the club and the rotation of the upper body), then you have a network in the parameter space. If it's a reasonably friendly network (the points directly connected to any given one are close to it, and so on), then it is reasonable to approximate the points as lying on a smooth manifold embedded in the parameter space. Then you can estimate the distance between any two pictures in terms of distances along the manifold instead of finding your way through the network.

You could just use Euclidean distance in the parameter space, but that fails for cases like the golf swing, where "lining up the shot" and "club hits the ball" look quite similar but are in truth separated by a great big loop of pictures of backswing and drive. The manifold in this case is the curve traced through parameter space by the pictures in sequence.

That's how I read those notes, anyway. I'm not an expert or anything.
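A toy version of that golf-swing point (my own sketch, not from the notes): frames ordered around a loop, where the first and last frames have nearly identical extracted parameters, yet the path through the sequence covers the whole loop.

```python
import numpy as np

# Frames ordered along a loop: two toy "extracted parameters" per frame.
theta = np.linspace(0.0, 2 * np.pi, 200, endpoint=False)  # frame order
frames = np.column_stack([np.cos(theta), np.sin(theta)])

i, j = 0, 199                      # first and last frame: similar poses
euclidean = np.linalg.norm(frames[i] - frames[j])

# Distance along the sequence: step frame-to-frame from i to j.
steps = np.linalg.norm(np.diff(frames[i:j + 1], axis=0), axis=1)
along_sequence = steps.sum()

print(f"parameter distance: {euclidean:.3f}, along sequence: {along_sequence:.2f}")
```

The two frames sit almost on top of each other in parameter space, but the path through the sequence is roughly the full circumference of the loop, so the "sequence geodesic" is hundreds of times longer.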
 
I think @Ibix got it right here. I hadn't considered the notion of comparing successive frames in a video as points in a manifold but that makes a lot of sense.
 