What is the concept of geodesics between images in computer vision AI?

In summary, a person was telling me that there is a photo of an apple and orange that exists on a higher dimensional manifold. The goal is to learn a geodesic between the two objects. This is a very fuzzy question.
  • #1
FallenApple
566
61
I was told by someone that for computer vision AI, a photo of say an apple and an orange exists on some high dimensional manifold, and the goal is to learn a geodesic between the two objects.

What does this mean? Does this mean that the photo of one of the images is just a tuple of coordinates? Do the objects need to be in the same photo to be in the same manifold? Or can they be different photos?

Do we connect the geodesic along patches of space? But most of the space is empty since the data is discrete and sparse.
 
  • Like
Likes Telemachus
Technology news on Phys.org
  • #2
This is a very fuzzy question. Can you provide some reference to look at instead of a someone told me reference?

My closest thought would be a neural net to discriminate between the two objects which mathematically would be represented by a matrix in computer code. However, I can't see how a geodesic is involved in any way.

I found this reference but it's for geodesics on a topological map which makes a lot more sense.

https://www.mathworks.com/matlabcen...esic-distance-between-two-points-on-an-image?
 
  • Like
Likes Telemachus
  • #3
I think the idea has to do with the distribution of data. If the data is on a manifold, then eucildean distance goes through the space that the manifold is embedded in and not on the manifold itself, which is problematic.

The topic is manifold learning.

I think there's an example of distance on a Swiss roll manifold on page 9 of the attachment.

Also, here is an image. The Euclidean distance is misleading on the roll because within the manifold, the distance is much greater. Projecting the points downwards, the blue seems close to the red, but in the non projected higher dimensional space of the manifold, the red geodesic is the correct path. But I just don't know what those points mean. I was thinking that images that are dissimilar have greater geodesics between them.

SwissRoll.png


Also, the image is from this website.
http://www.indiana.edu/~dll/B657/B657_lec_isomap_lle.pdf
 

Attachments

  • ThompsonDimensionalityReduction.pdf
    787 KB · Views: 226
Last edited:
  • #4
You could compare it to the stars in a constellation. When you view the stars, any two stars may appear very close especially if they are similar in brightness as you are using an angular measurement to judge the distance. However once you consider the radial distance in 3D space one star may be extremely bright and extremely far away.

In a counter example, two stars may appear very distant from an angular point of view but still be in the same cluster of stars simply because the cluster is closer to us.

Here's an animation using the constellation of Orion:



The starting view is Orion as seen from the Earth and the animation then rotates the constellation around 90 degrees and now you can see the true distances involved.

In data mining, distances are used to group data points together so adding one more dimension to your data might bring together two points that seemed very distant before. In the example, you gave you are moving from the manifold's way of measuring distance to a different mapping and different metric to measure the distance which brings two points a lot closer together.

The best analogy I can think of is comparing customers using income, one making $100K/yr and another making $50K/yr. By that measurement alone they seem very far away.

However if we now consider cost of living, the $100K person might live in NYC where a lot of his/her income is needed to survive whereas the $50K/yr person living in a smaller city may not have those expenses at least to that degree and so their incomes when adjusted might be $30K (from $10K) and $20K (from $50K) and now you see their true buying power and its much closer.
 
Last edited:
  • Like
Likes FallenApple and Telemachus
  • #5
I think the idea is that if you have a set of images that are connected locally (like the sequence of images of the golfer, where you know the order of the images but don't necessarily know how many frames lie between two images in general) and can extract some parameters from them (e.g. the angle of the club and the rotation of the upper body) then you have a network in the parameter space. If it's reasonably friendly network (the points that are directly connected to any given one are close to it, things like that) then it is reasonable to approximate the points as lying in a smooth manifold embedded in the parameter space. Then you can estimate the distance between any two pictures in terms of distances along the manifold instead of finding your way through the network.

You can just use Euclidean distance in the parameter space, but that fails for cases like the golf swing where "lining up the shot" and "club hits the ball" are quite similar, but separated in truth by a great big loop of pictures of backswing and drive. The manifold in this case is the line through the parameters extracted from each picture in sequence.

That's how I read those notes, anyway. I'm not an expert or anything.
 
  • Like
Likes FallenApple, Telemachus and jedishrfu
  • #6
I think @Ibix got it right here. I hadn't considered the notion of comparing successive frames in a video as points in a manifold but that makes a lot of sense.
 
  • Like
Likes Telemachus

1. What are geodesics between images?

Geodesics between images are the shortest paths between two images on a surface. These paths are determined using mathematical concepts such as curvature and distance, and they can be used to compare and analyze images.

2. How are geodesics between images calculated?

Geodesics between images are calculated using geometric algorithms that take into account the features and characteristics of the images, as well as the surface on which they are located. These algorithms use principles of differential geometry and optimization to determine the most efficient path between the images.

3. What is the significance of geodesics between images?

Geodesics between images have various applications in fields such as computer vision, image processing, and computer graphics. They can be used for tasks such as image registration, object tracking, and image morphing. They also provide a way to measure the similarity between images.

4. Can geodesics between images be used for 3D images?

Yes, geodesics between images can be used for 3D images as well. In this case, the images are located on a 3D surface and the geodesics are calculated in the 3D space. This allows for more accurate and realistic comparisons between 3D images.

5. Are there any limitations to geodesics between images?

One limitation of geodesics between images is that they may not always accurately represent the visual similarity between images. This is because they are based on mathematical calculations and may not take into account factors such as lighting, color, and texture. Additionally, the accuracy of geodesics can be affected by noise and distortions in the images.

Similar threads

  • Programming and Computer Science
Replies
5
Views
718
  • Computing and Technology
Replies
0
Views
196
Replies
10
Views
2K
  • Programming and Computer Science
Replies
5
Views
1K
  • Special and General Relativity
Replies
8
Views
2K
Replies
1
Views
4K
Replies
29
Views
3K
  • Astronomy and Astrophysics
2
Replies
43
Views
10K
Back
Top