# How to calculate the real size of an object from an image

Hello everyone, I have been researching this, but the topics I found are a bit different from what I am doing.

I am trying to determine the actual size of an object in an image or video (CCTV footage). My thinking is that if the object is moving, the size it appears on screen changes, and the rate of that change should depend on the object's position (its distance from the camera). Conversely, if the rate of change of the image size is known, it should be possible to deduce the object's actual position and actual size. If this is not right, the "put a reference object in the scene" method is also acceptable.

However, my current problem is that I do not really understand the geometry of projecting the 3D real world onto a 2D image. Perhaps, with a reference object, some mapping f(x, y) → g(x_real, y_real, z_real) can be found? Or, using the rate of change, is f(dr/dt, x, y) → g(x_real, y_real, z_real) possible?

Please, any fundamental ideas, corrections or suggestions will be appreciated, thanks!

If you have two viewpoints, you can determine the distance from the viewpoints to the object, as in stereo vision.

https://en.wikipedia.org/wiki/Triangulation

You do, however, need two viewpoints and a known distance between them. For example, our eyes are two viewpoints roughly 2.5 inches (about 6 cm) apart. From that baseline you can determine the other sides of the triangle and, from them, the size of the object.
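As a rough illustration of the triangulation idea, here is a minimal sketch. It assumes two identical, parallel (rectified) pinhole cameras, a focal length expressed in pixels, and an object roughly facing the cameras. The function names and all the numbers are made up for the example.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Distance to a point seen by two parallel pinhole cameras.

    focal_px:     focal length in pixels (assumed equal for both cameras)
    baseline_m:   known distance between the two viewpoints, in metres
    disparity_px: horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def real_size(size_px, depth_m, focal_px):
    """Real extent of an object that spans size_px pixels at depth_m."""
    return size_px * depth_m / focal_px


# Hypothetical numbers: 800 px focal length, cameras 0.5 m apart,
# the object shifts 40 px between the two images and is 120 px tall.
depth = depth_from_disparity(800.0, 0.5, 40.0)
height = real_size(120.0, depth, 800.0)
print(f"distance: {depth} m, height: {height} m")
```

With a single CCTV camera there is no baseline, which is why the other approaches in this thread need either a known speed or a known reference length.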

> I am trying to determine the actual size of an object in image/video (CCTV video tracks)

For clarification, is the lens a fixed focal length or is it a zoom?

> my thinking is, if this object is moving, the size appears on the screen changes, and depending on the position of this object (its distance from the camera), the size changing rate should be different, vice versa, if the object image size changing rate is known, it should be possible to deduce its actual position and actual size

That sounds like it might only work if the velocity of the object is constant (and known). If it's close by and moving slowly, its image size can change at the same rate as if it were farther away and moving faster.
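To make the ambiguity concrete, here is a small sketch under a simple pinhole model, where image size = focal length × real size / distance. It assumes the object moves straight toward the camera at constant speed. The helper names and numbers are hypothetical.

```python
import math


def image_size_px(focal_px, real_size_m, distance_m):
    """Pinhole model: apparent size in pixels of an object facing the camera."""
    return focal_px * real_size_m / distance_m


def size_track(focal_px, real_size_m, start_m, speed_mps, times_s):
    """Apparent size over time for an object approaching at constant speed."""
    return [image_size_px(focal_px, real_size_m, start_m - speed_mps * t)
            for t in times_s]


def distance_from_known_speed(size1_px, size2_px, speed_mps, dt_s):
    """Initial distance, recoverable only if the approach speed is known."""
    return speed_mps * dt_s * size2_px / (size2_px - size1_px)


times = [0.0, 1.0, 2.0, 3.0]
# A 1 m object starting 10 m away at 1 m/s ...
small_near_slow = size_track(800.0, 1.0, 10.0, 1.0, times)
# ... versus a 2 m object starting 20 m away at 2 m/s.
large_far_fast = size_track(800.0, 2.0, 20.0, 2.0, times)

# The two image-size tracks coincide, so size alone cannot tell them apart.
same = all(math.isclose(a, b) for a, b in zip(small_near_slow, large_far_fast))
print("tracks identical:", same)

# With the speed known (1 m/s assumed here), distance and size follow.
z0 = distance_from_known_speed(small_near_slow[0], small_near_slow[2], 1.0, 2.0)
size_m = small_near_slow[0] * z0 / 800.0
print("distance:", z0, "m, size:", size_m, "m")
```

Scaling the size, distance, and speed by the same factor leaves every frame unchanged, so the rate of change of image size only gives the ratio of speed to distance, not either one separately.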

From a still image, you would have to undo the perspective projection (in effect, convert it to an orthographic projection), and for that you'd also need at least one object of known length in the image.
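In the simplest case, the reference-object idea reduces to a ratio. The sketch below assumes the reference and the unknown object are at about the same distance from the camera and both roughly face it, so the same metres-per-pixel scale applies to both. The function name and numbers are hypothetical.

```python
def size_from_reference(object_px, reference_px, reference_real_m):
    """Estimate real size by scaling against a reference of known length.

    Only valid when both objects are at about the same distance from the
    camera; otherwise perspective makes the pixel scale differ between them.
    """
    if reference_px <= 0:
        raise ValueError("reference size in pixels must be positive")
    return reference_real_m * object_px / reference_px


# Hypothetical: a 2.0 m door spans 400 px; a person beside it spans 350 px.
person_height_m = size_from_reference(350.0, 400.0, 2.0)
print(person_height_m)
```

If the object is not beside the reference, the ratio no longer holds, and the perspective has to be corrected first, as described above.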