# Having trouble understanding a dimension reduction step in a matrix.

This is dealing with computer vision, but the only part I'm having trouble understanding is a step in the matrix math. So it seems appropriate it should go here.

The paper/chapter I'm reading takes one of those steps saying "from this we can easily derive this" and I'm not quite sure what happened. Perhaps it's just been too long since I've done significant linear algebra.

Basically we have the following:

The relation from a 3D point to an image point is given by:
$$\left( \begin{array}{ccc} x_1 \\ x_2 \\ x_3 \end{array} \right) = M_{int} M_{ext} \left( \begin{array}{ccc} X_w \\ Y_w \\ Z_w \\ 1 \end{array} \right)$$
Where:
$$M_{ext} = \left[ \begin{array}{cccc} r_{11} & r_{12} & r_{13} & T_{x} \\ r_{21} & r_{22} & r_{23} & T_{y} \\ r_{31} & r_{32} & r_{33} & T_{z} \end{array} \right]$$
However, the equation for a plane gives us:
$$Z_{w} = d - n_{x} X_{w} - n_{y} Y_{w}$$
Using this, the projection equation above can be reduced to:
$$\left( \begin{array}{ccc} x_1 \\ x_2 \\ x_3 \end{array} \right) = M_{int} \left[ \begin{array}{ccc} r_{11}-n_{x}r_{13} & r_{12}-n_{y}r_{13} & dr_{13}+T_{x} \\ r_{21}-n_{x}r_{23} & r_{22}-n_{y}r_{23} & dr_{23}+T_{y} \\ r_{31}-n_{x}r_{33} & r_{32}-n_{y}r_{33} & dr_{33}+T_{z} \end{array} \right] \left( \begin{array}{ccc} X_w \\ Y_w \\ 1 \end{array} \right)$$

So my question is: how is the equation of the plane applied to the 3x4 matrix to reduce it to a 3x3 matrix? I can see what happened, but I don't fully understand how it happened. If someone could explain that part it would be great.

Thank you for your time!

## Answers and Replies

Stephen Tashi
Science Advisor
I can see what happened but I don't understand fully how it happened.

I feel the same way! If someone can explain this nicely in terms of abstract things like subspaces, it would be more pleasing. In anticipation of that, it would be useful to make the example less particular to the subject of computer vision.

To use a lower dimensional example, if we are given that $z = x + y$
then the mapping defined with a 2x3 matrix:

$$\begin{pmatrix} a_{11}&a_{12} & a_{13} \\ a_{21}&a_{22}&a_{23} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}$$

is equivalent to the mapping defined with a 2x2 matrix:

$$\begin{pmatrix} a_{11} + a_{13} & a_{12} + a_{13} \\ a_{21} + a_{23} & a_{22} + a_{23} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$
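To spell out the step (a worked expansion added for illustration, not part of the original reply): substitute $z = x + y$ into each row and collect the coefficients of $x$ and $y$.

$$\begin{aligned} a_{11}x + a_{12}y + a_{13}z &= a_{11}x + a_{12}y + a_{13}(x + y) = (a_{11} + a_{13})x + (a_{12} + a_{13})y \\ a_{21}x + a_{22}y + a_{23}z &= a_{21}x + a_{22}y + a_{23}(x + y) = (a_{21} + a_{23})x + (a_{22} + a_{23})y \end{aligned}$$

The same thing can be written as a matrix product. The constraint says that every allowed $(x, y, z)$ is a fixed matrix times $(x, y)$, so the 2x2 matrix is just the 2x3 matrix multiplied by that constraint matrix:

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}, \qquad \begin{pmatrix} a_{11}&a_{12} & a_{13} \\ a_{21}&a_{22}&a_{23} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} a_{11} + a_{13} & a_{12} + a_{13} \\ a_{21} + a_{23} & a_{22} + a_{23} \end{pmatrix}$$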

Yes, that would be a simpler equivalent problem. Again, I can clearly see what's happening, but why are you allowed to do that? Thanks for this much at least.

Stephen Tashi
Science Advisor
why are you allowed to do that?

You can always replace one expression by another expression that is algebraically identical, so it's not a question of being allowed or not. The question is "How would I know when to use that trick?".

If y = Ax is a linear mapping from an n dimensional space to a space of smaller dimension m then when can we write it as y = Bw where w is an m dimensional column vector?

If we claim Ax = Bw, we must be establishing some relation between x and w. In the examples, the use of common variables makes that evident, but, abstractly, what is the relation between x and w?
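In both examples in this thread, one candidate for that relation is x = Pw for a fixed matrix P that encodes the constraint, which gives B = AP. The sketch below checks this numerically for the plane case. The name `P` and all numeric values are assumptions chosen for illustration (the 3x3 block is not even a true rotation), and $M_{int}$ is left out because it multiplies both sides on the left.

```python
def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Illustrative plane parameters for Z_w = d - n_x*X_w - n_y*Y_w (not from the paper).
n_x, n_y, d = 2, -1, 5

# M_ext = [R | T] is 3x4. Integer entries keep the comparison exact.
M_ext = [
    [1, 2, 3, 10],
    [4, 5, 6, 20],
    [7, 8, 9, 30],
]

# P (hypothetical name) maps (X_w, Y_w, 1) to (X_w, Y_w, Z_w, 1) for points on the plane.
P = [
    [1, 0, 0],
    [0, 1, 0],
    [-n_x, -n_y, d],
    [0, 0, 1],
]

# The reduced 3x3 matrix is simply M_ext times P.
reduced = matmul(M_ext, P)

# Closed form from the question: [r_i1 - n_x*r_i3, r_i2 - n_y*r_i3, d*r_i3 + T_i].
expected = [
    [r[0] - n_x * r[2], r[1] - n_y * r[2], d * r[2] + r[3]]
    for r in M_ext
]
assert reduced == expected

# Both forms give the same result for a point that lies on the plane.
X_w, Y_w = 3, 4
Z_w = d - n_x * X_w - n_y * Y_w
full = matmul(M_ext, [[X_w], [Y_w], [Z_w], [1]])
small = matmul(reduced, [[X_w], [Y_w], [1]])
assert full == small
```

The two mappings agree only on points that satisfy the plane equation. For a point off the plane, the 3x3 form has no way to represent its $Z_w$ at all.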

I suspect it's one of those "commutative diagram" things.

I'm sorry, when I said "Why are you allowed to do that?" I didn't mean the same thing you think I did. Then again, I could have worded that much better. My question is not so much "How would I know when to use that trick?", but instead "What is that trick?". Why are these two mappings equivalent? It's clear that each of the elements corresponding to x and y had the related element for z added on. However, what I don't understand is why these mappings are equivalent. Not only why is this example equivalent, but also, what is the general way to see/determine that two mappings are equivalent?

Thanks again for your time thus far!

chiro
Science Advisor
Hey jenny_shoars and welcome to the forum.

In terms of your 4x4 matrix, which acts on X, Y, Z, W (where W is usually, but not always, taken to be 1 in computer graphics and computer vision), the basic idea behind going from 4x4 to 3x3 is that we are folding the translation of the point into the matrix operation for that particular transformation.

We need to include a 4th dimension W for translation because a normal 3x3 transformation will only scale and rotate a point, not shift it by a fixed offset. To understand this, think in terms of an ordinary vector V and a transformation M where V' = MV.

In terms of the standard non-projection type operators, our usual atomic operators involve rotation, scaling, and translation. Rotation operators are generalized to the angle-axis method and for practical purposes these often are calculated after a quaternionic calculation by first working in quaternionic space and then going to euclidean matrix space. The reasons for this include things like the gimbal lock problem and the ability to do very complex interpolation techniques with quaternions that involve multiple rotations.

So basically the reason why we have a 4x4 is that we want an operation that adds Tx to x, Ty to y, and Tz to z. Let's consider V = [x, y, z, 1] and the first row of M to be [a b c d]; then the x term of V' is ax + by + cz + d, so d plays the role of Tx.
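A minimal sketch of that translation idea, using made-up numbers and a hypothetical helper `apply`: when the last coordinate of the vector is 1, the fourth column of the matrix is added to the result, and when it is 0 the translation has no effect.

```python
def apply(M, v):
    """Multiply matrix M (a list of rows) by the column vector v (a flat list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

# Identity rotation plus a translation (T_x, T_y, T_z) = (10, 20, 30).
# These values are illustrative only.
M = [
    [1, 0, 0, 10],
    [0, 1, 0, 20],
    [0, 0, 1, 30],
    [0, 0, 0, 1],
]

point = [1, 2, 3, 1]      # W = 1: a position, so the translation applies
direction = [1, 2, 3, 0]  # W = 0: a direction, so the translation is ignored

moved = apply(M, point)
unchanged = apply(M, direction)

print(moved)      # [11, 22, 33, 1]
print(unchanged)  # [1, 2, 3, 0]
```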

Now we could incorporate this into a 3x3, but you have to remember that the general pipeline is created so that we optimize the matrix routines for 4x4 and then just use the library code to do general compositions of 4x4 matrices.

It means that when we end up designing simulation engines that have hierarchical coordinate systems, as well as all the stuff that goes with this (collision detection, physics calculations, etc) then it makes sense to use a highly optimized library that can do things really fast on 4x4 matrices and that is what we do.

Stephen Tashi
Science Advisor
what I don't understand is why these mappings are equivalent. Not only why is this example equivalent, but also, what is the general way to see/determine that two mappings are equivalent?

I misused the word "equivalent"! I regard "mapping" as a synonym for "function" and (in the usual way of defining "equivalent" for functions), the two mappings (Ax and Bw, in my example) are not equivalent because they have different domains.

The words that your example uses are that "the above equation can be reduced to". There is a sense in which the two mappings are "the same", but I shouldn't have claimed that they are "equivalent" in the normal sense.

That's why we need some person steeped in abstraction to explain this. (Chiro, like myself, is a practical person. He didn't show us any commutative diagrams.)

Do you already know the abstract side of linear algebra? Do you know stuff like:

When you have a linear mapping between vector spaces, y = Ax, it has a "kernel", which is the set of vectors in the domain that are mapped to zero in the image. One can prove that the kernel of the mapping is a vector subspace of the domain. If y = Ax is a mapping from a space of higher dimension to a space of lower dimension, one can prove that the dimension of the kernel is equal to the number of dimensions that are "lost" when we do the mapping, that is, the dimension of the domain minus the dimension of the image.
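The result described here is usually called the rank-nullity theorem. For an $m \times n$ matrix $A$ it reads:

$$\dim \ker A + \operatorname{rank} A = n$$

As a small illustration (the matrix below is an assumed example, not one taken from the thread), a 2x3 matrix of rank 2 loses exactly one dimension, and its kernel is a line:

$$A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \end{pmatrix}, \qquad A \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad \dim \ker A = 3 - 2 = 1$$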