Inverse perspective mapping equations

AI Thread Summary
The discussion centers on creating a transformation matrix to map real-world coordinates (X, Y) to pixel coordinates (xp, yp) for a robot's camera detecting a red ball. The camera's parameters, including focal length, height, and tilt angle, are known, but the user struggles with existing equations and the need for recalibration if the camera angle changes. Suggestions include using a calibration grid to empirically determine the mapping and accounting for lens distortion, particularly barrel distortion. It is advised to explore resources like Born and Wolf's "Principles of Optics" for understanding lens effects on perspective. The conversation emphasizes the importance of developing an automated calibration procedure to simplify adjustments and improve accuracy.
darie2808
Hi. I am making a robot that is supposed to detect a red ball with a camera and then work out where the ball is.
The robot has a camera mounted on top of it, tilted at an angle. At each frame the camera detects the red ball and returns two coordinates, xp and yp, which are the pixel coordinates of the centre of the ball in the captured image.
What I want is to map each point in the image to a point on the ground. In other words, to change the perspective from the camera's tilted view to a 2D top-down (bird's-eye) view of the ground in front of the robot.
Everything about the camera is known: the focal length, the angle at which it is tilted, the height at which it is mounted, the horizontal and vertical aperture angles... everything.
All I need is the formula that relates X, Y in the real world to xp and yp.
I tried several equations that I found online, but although they are all similar, each is slightly different and none of them worked.
Note that the (0,0) point of the image is at its centre.
Help would be much appreciated!
Best wishes,
Darie
 
I've worked with a group that had to solve the same problem. I didn't work on that part of the project but my recollection is that for the most part what we did was to mark off the floor in a grid pattern. We then calibrated the camera by taking a picture of the grid on the floor and numerically working out the mapping. That may be one of the better ways of doing it (particularly if you end up changing cameras or simply adjusting the rig as time goes by). This worked quite well actually because at the time we were making a point of using very cheap web cameras which presented an appreciable amount of distortion.
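As a rough sketch of what "numerically working out the mapping" could look like: for an ideal (distortion-free) camera, the map between the image and a flat floor is a plane-to-plane homography, which can be fitted from four or more grid points whose pixel and floor positions are both known. The function names and the use of NumPy below are assumptions for illustration, not what the group above actually used, and with a strongly distorting lens the distortion would have to be handled separately.

```python
import numpy as np

def fit_homography(pixels, ground):
    """Least-squares (DLT) fit of a 3x3 matrix H with ground ~ H @ [xp, yp, 1].

    pixels: list of (xp, yp) image points of the grid marks
    ground: list of (X, Y) floor positions of the same marks
    Needs at least 4 correspondences, no 3 of them collinear.
    """
    rows = []
    for (x, y), (X, Y) in zip(pixels, ground):
        # Each correspondence gives two linear equations in the 9 entries of H.
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, X * x, X * y, X])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, Y * x, Y * y, Y])
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)

def pixel_to_ground(H, xp, yp):
    """Apply the fitted homography to one pixel and return (X, Y) on the floor."""
    X, Y, w = H @ np.array([xp, yp, 1.0])
    return X / w, Y / w
```

Refitting after the camera is moved only requires photographing the grid again and rerunning `fit_homography`; no focal length, height or angle has to be entered by hand.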

Now, if you wish to do this purely from theory, I think the starting point is coming up with a mapping for a perfect lens. That is, the first thing you need is to account for the perspective projection of the floor space through the focal point. This would depend on the height of the camera and on its tilt angle and focal length. You would probably then have to derive a mapping for the distortion that is caused by the lens (this could be captured empirically, as we did above). My guess is that you would most likely need to accommodate just simple barrel distortion. Once you have these two mappings, you simply apply their inverses to the captured image (image * inverse barrel distortion * inverse perspective projection).
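For the barrel distortion step, a minimal sketch follows. It assumes the common single-coefficient radial model, in which an ideal point (xu, yu), measured from the image centre in normalized units, is scaled by 1 + k1*r^2, with k1 negative for barrel distortion. The model choice, the coefficient name `k1` and the function names are assumptions; a real lens may need more terms, and k1 would have to come from calibration.

```python
def distort(xu, yu, k1):
    """Forward model: ideal (undistorted) point -> distorted point."""
    r2 = xu * xu + yu * yu
    scale = 1.0 + k1 * r2
    return xu * scale, yu * scale

def undistort(xd, yd, k1, iterations=20):
    """Invert the model by fixed-point iteration.

    Starts from the distorted point and repeatedly divides by the scale
    factor evaluated at the current estimate. Converges for mild
    distortion (|k1| * r^2 well below 1).
    """
    xu, yu = xd, yd
    for _ in range(iterations):
        r2 = xu * xu + yu * yu
        scale = 1.0 + k1 * r2
        xu, yu = xd / scale, yd / scale
    return xu, yu
```

The undistorted coordinates are what would then be fed into the inverse perspective projection.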
 
Thank you for your answer! We might do the same as you did. However, our camera has a very low resolution (174*144). I hope it will work. But there is a problem with that: if we change the angle of the camera, the whole mapping will have to be done again... right? I might have misunderstood you, though, since you said that changing the rig is not a problem.
What are the parameters that we really need? Focal length, height, camera angle, and xp and yp, right? Do we need anything else, like the aperture angle or something like that?
All the equations we encountered required different parameters.
Again, what I need is the 'transformation matrix' such that:
(X,Y) = transformation matrix * (xp,yp).
Do you know where I could find this transformation matrix?
 
Well, ideally you would just write an automated procedure to calibrate the rig. That way, you can set up your calibration grid on the floor and run the calibration procedure any time you wish. With such a program you would not need to input parameters like focal length, height, etc. Besides, you should expect some slight deviation from the ideal in practice anyway, so there should be some kind of empirical calibration in the end.

As for the transformation matrix, no, I don't know how to do that with a lens. I can do it easily enough assuming there is no lens, but I do not remember enough of my perspective and projection lessons to recall how you would accommodate the added refraction of a lens between the picture plane or focal point and the object. The problem is that the ray tracing involves skew rays rather than rays that are parallel to the axis. You can probably find more about this in Born and Wolf's "Principles of Optics" text. I believe they cover the behavior of skew rays in 4.9. But you should be able to find the transformation for basic distortions easily enough.

Either way, I would take a look at Born and Wolf's text or even take a look at some engineering drafting textbooks. Techniques on how to do technical perspective drawings may be more applicable in this case since you want to do a rough ray trace based method.

EDIT: Off-hand, here is one way I would think about the problem, though I do not know if it is right. I would just do a projection drawing of the grid given the height and angle of the camera and the distance from the grid. I would project onto a picture plane that is one focal length away from the focal point, with the orientation of the picture plane dictated by the angle of the camera. I'm not sure if this is correct, but it sounds reasonable to me. This would give you the basic projection results, to which the lens aberrations and distortions would then be applied.
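The projection described in the EDIT can be written out for an ideal pinhole camera. What follows is a sketch under stated assumptions, not a verified answer for this particular robot: X is the sideways position on the floor and Y the forward distance from the point directly below the camera; h is the camera height; tilt is the angle of the optical axis below the horizontal; xp is positive to the right and yp positive upward from the image centre; f is the focal length expressed in pixels (obtainable from the horizontal aperture angle). With those conventions:

xp = f*X / (Y*cos(tilt) + h*sin(tilt))
yp = f*(Y*sin(tilt) - h*cos(tilt)) / (Y*cos(tilt) + h*sin(tilt))

and, inverting, X = h*xp / w and Y = h*(f*cos(tilt) + yp*sin(tilt)) / w with w = f*sin(tilt) - yp*cos(tilt).

```python
import math

def focal_length_pixels(width_px, hfov):
    """Focal length in pixels from image width and horizontal aperture angle (radians)."""
    return (width_px / 2.0) / math.tan(hfov / 2.0)

def ideal_ground_to_pixel(X, Y, h, tilt, f):
    """Project floor point (X, Y) into the image of an ideal pinhole camera."""
    depth = Y * math.cos(tilt) + h * math.sin(tilt)  # distance along the optical axis
    if depth <= 0:
        raise ValueError("point is behind the camera")
    xp = f * X / depth
    yp = f * (Y * math.sin(tilt) - h * math.cos(tilt)) / depth
    return xp, yp

def ideal_pixel_to_ground(xp, yp, h, tilt, f):
    """Back-project pixel (xp, yp) onto the floor.

    In homogeneous form this is (X', Y', w) = M @ (xp, yp, 1) with
        M = [[h, 0,            0              ],
             [0, h*sin(tilt),  h*f*cos(tilt)  ],
             [0, -cos(tilt),   f*sin(tilt)    ]]
    followed by X = X'/w, Y = Y'/w.
    """
    w = f * math.sin(tilt) - yp * math.cos(tilt)
    if w <= 0:
        raise ValueError("pixel is at or above the horizon; it does not hit the floor")
    X = h * xp / w
    Y = h * (f * math.cos(tilt) + yp * math.sin(tilt)) / w
    return X, Y
```

Note that the mapping is a 3x3 matrix acting on (xp, yp, 1) followed by a division, not a plain 2x2 matrix on (xp, yp). If yp is counted downward in your image, flip its sign before calling these functions. Lens distortion is not included here.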
 
This was my idea on how it is probably done but I'm not sure.
 

Attachments

  • method.jpg