Having trouble understanding a dimension reduction step in a matrix.


Discussion Overview

The discussion revolves around understanding a specific dimension reduction step in matrix mathematics as applied in computer vision. Participants are exploring the relationship between 3D points and their corresponding image points, particularly how an equation representing a plane can be used to simplify a 3x4 matrix into a 3x3 matrix.

Discussion Character

  • Technical explanation
  • Conceptual clarification
  • Debate/contested

Main Points Raised

  • One participant describes a transformation from a 3D point to an image point using a matrix equation, expressing confusion about a reduction step involving a plane equation.
  • Another participant suggests that a simpler, abstract example could help clarify the concept of dimension reduction in mappings.
  • Some participants express a desire to understand the underlying principles that justify the equivalence of different mappings and transformations.
  • A later reply discusses the importance of including translation in matrix operations and how this relates to the use of a 4x4 matrix in computer vision.
  • There is a discussion about the terminology used, with one participant clarifying their use of "equivalent" in the context of mappings and functions.

Areas of Agreement / Disagreement

Participants express a shared confusion about the reduction process and the equivalence of mappings, but there is no consensus on the underlying principles or the specific conditions that allow for such reductions.

Contextual Notes

Participants note that the discussion involves complex concepts in linear algebra and matrix transformations, with some assumptions about the nature of mappings and their domains remaining unresolved.

jenny_shoars
This is dealing with computer vision, but the only part I'm having trouble understanding is a step in the matrix math. So it seems appropriate it should go here.

The paper/chapter I'm reading takes one of those steps saying "from this we can easily derive this" and I'm not quite sure what happened. Perhaps it's just been too long since I've done significant linear algebra.

Basically we have the following:

The relation from a 3D point to an image point is given by:
[tex] \left( \begin{array}{ccc} x_1 \\ x_2 \\ x_3 \end{array} \right) = M_{int} M_{ext} \left( \begin{array}{ccc} X_w \\ Y_w \\ Z_w \\ 1 \end{array} \right)[/tex]
Where:
[tex] M_{ext} = \left[ \begin{array}{ccc} r_{11} & r_{12} & r_{13} & T_{x} \\ r_{21} & r_{22} & r_{23} & T_{y} \\ r_{31} & r_{32} & r_{33} & T_{z} \end{array} \right][/tex]
However, the equation for a plane gives us:
[tex] Z_{w} = d - n_{x} X_{w} - n_{y} Y_{w}[/tex]
The above equation can be reduced to:
[tex] \left( \begin{array}{ccc} x_1 \\ x_2 \\ x_3 \end{array} \right) = M_{int} \left[ \begin{array}{ccc} r_{11}-n_{x}r_{13} & r_{12}-n_{y}r_{13} & dr_{13}+T_{x} \\ r_{21}-n_{x}r_{23} & r_{22}-n_{y}r_{23} & dr_{23}+T_{y} \\ r_{31}-n_{x}r_{33} & r_{32}-n_{y}r_{33} & dr_{33}+T_{z} \end{array} \right] \left( \begin{array}{ccc} X_w \\ Y_w \\ 1 \end{array} \right)[/tex]

So my question is: how is the equation of the plane applied to the 3x4 matrix to reduce it to a 3x3 matrix? I can see what happened, but I don't fully understand how it happened. If someone could explain that part it would be great.
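For what it's worth, the substitution can be checked numerically. Below is a Python/NumPy sketch; all parameter values are arbitrary, made up only to test the identity (R stands for the r_ij block and T for the translation column):

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up extrinsic parameters: rotation block R (the r_ij) and translation T.
R = rng.standard_normal((3, 3))
T = rng.standard_normal(3)
# Made-up plane parameters: Z_w = d - n_x * X_w - n_y * Y_w.
n_x, n_y, d = 0.3, -0.5, 2.0

M_ext = np.hstack([R, T[:, None]])  # the 3x4 matrix [R | T]

# The reduced 3x3 matrix from the text: the third column of R gets folded
# into the first two columns (scaled by n_x, n_y), and the last column
# becomes d * (third column of R) + T.
M_red = np.column_stack([
    R[:, 0] - n_x * R[:, 2],
    R[:, 1] - n_y * R[:, 2],
    d * R[:, 2] + T,
])

# Any world point lying on the plane:
X_w, Y_w = 1.7, -0.4
Z_w = d - n_x * X_w - n_y * Y_w

full = M_ext @ np.array([X_w, Y_w, Z_w, 1.0])
reduced = M_red @ np.array([X_w, Y_w, 1.0])
print(np.allclose(full, reduced))  # True
```

(M_int multiplies both sides identically, so it can be left out of the check.)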

Thank you for your time!
 
jenny_shoars said:
I can see what happened but I don't understand fully how it happened.

I feel the same way! If someone can explain this nicely in terms of abstract things like subspaces, etc., it would be more pleasing. In anticipation of that, it would be useful to make the example less particular to the subject of computer vision.

To use a lower dimensional example, if we are given that [itex]z = x + y[/itex]
then the mapping defined with a 2x3 matrix:

[tex]\begin{pmatrix} a_{11}&a_{12} & a_{13} \\ a_{21}&a_{22}&a_{23} \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix}[/tex]

is equivalent to the mapping defined with a 2x2 matrix:

[tex]\begin{pmatrix} a_{11} + a_{13} & a_{12} + a_{13} \\ a_{21} + a_{23} & a_{22} + a_{23} \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}[/tex]
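A quick numeric check of that claim (a Python/NumPy sketch; the entries of the 2x3 matrix are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 3))   # the 2x3 matrix [a_ij]

# Fold the z-column into the x- and y-columns, using z = x + y.
B = np.column_stack([A[:, 0] + A[:, 2],
                     A[:, 1] + A[:, 2]])

x, y = 0.8, -1.3
z = x + y                          # the given constraint

print(np.allclose(A @ np.array([x, y, z]),
                  B @ np.array([x, y])))  # True
```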
 
Yes, that would be a simpler equivalent problem. Again, I can clearly see what's happening, but why are you allowed to do that? Thanks for this much at least.
 
jenny_shoars said:
why are you allowed to do that?

You can always replace one expression by another expression that is algebraically identical, so it's not a question of being allowed or not. The question is "How would I know when to use that trick?".

If y = Ax is a linear mapping from an n dimensional space to a space of smaller dimension m then when can we write it as y = Bw where w is an m dimensional column vector?

If we claim Ax = Bw, we must be establishing some relation between x and w. In the examples, the use of common variables makes that evident, but, abstractly, what is the relation between x and w?

I suspect it's one of those "commutative diagram" things.
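Here is one concrete way to phrase that relation (a Python/NumPy sketch; the lifting matrix E is my own notation, not from the text). The constraint z = x + y says every admissible x is of the form Ew for a fixed 3x2 matrix E, and then B = AE, so Ax = A(Ew) = (AE)w = Bw by associativity of matrix multiplication:

```python
import numpy as np

# E lifts plane coordinates w = (x, y) into 3D, with z = x + y built in.
E = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])        # third row encodes the constraint z = x + y

rng = np.random.default_rng(2)
A = rng.standard_normal((2, 3))   # any 2x3 map

B = A @ E                         # the reduced 2x2 map: lift first, then apply A

w = np.array([0.8, -1.3])
print(np.allclose(A @ (E @ w), B @ w))  # True
```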
 
I'm sorry, when I said "Why are you allowed to do that?" I didn't mean the same thing you think I did. Then again, I could have worded that much better. My question is not so much "How would I know when to use that trick?", but instead "What is that trick?". Why are these two mappings equivalent? It's clear that each of the elements corresponding to x and y had the related element for z added on. However, what I don't understand is why these mappings are equivalent. Not only why is this example equivalent, but also, what is the general way to see/determine that two mappings are equivalent?

Thanks again for your time thus far!
 
jenny_shoars said:
I can see what happened but I don't understand fully how it happened. If someone could explain that part it would be great.

Hey jenny_shoars and welcome to the forum.

In terms of your matrix equation, which uses homogeneous coordinates X, Y, Z, W (where W is usually taken to be 1, but not always, in computer graphics/computer vision), the basic idea behind the extra column is that it incorporates the translation of the point into the same matrix operation as the rest of the transformation.

To translate a point, we need to include a 4th coordinate W, because an ordinary 3x3 transformation can only scale and rotate a point, not shift it by a fixed offset. To see this, think in terms of a vector V and a transformation M where V' = MV.

In terms of the standard non-projection type operators, our usual atomic operators are rotation, scaling, and translation. Rotation operators are generalized by the angle-axis method, and for practical purposes they are often computed by first working in quaternion space and then converting to a Euclidean rotation matrix. The reasons for this include the gimbal lock problem and the ability to do very complex interpolation techniques (involving multiple rotations) with quaternions.

So basically the reason we have a 4x4 is that we want an operation that adds Tx to x, Ty to y, and Tz to z. Let's consider V = [x, y, z, 1] and the first row of M to be [a b c Tx]; then the x term of V' is ax + by + cz + Tx.
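A minimal sketch of that in Python/NumPy (a pure translation in homogeneous form; the values are arbitrary):

```python
import numpy as np

# A plain 3x3 linear map fixes the origin, so it cannot translate.
# Appending the homogeneous coordinate W = 1 lets one matrix do it:
Tx, Ty, Tz = 2.0, -1.0, 0.5
M = np.array([[1.0, 0.0, 0.0, Tx],
              [0.0, 1.0, 0.0, Ty],
              [0.0, 0.0, 1.0, Tz],
              [0.0, 0.0, 0.0, 1.0]])

V = np.array([1.0, 2.0, 3.0, 1.0])  # the point (1, 2, 3) with W = 1
V_prime = M @ V                      # translated point (3, 1, 3.5), W still 1
print(V_prime)
```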

Now we could incorporate this into a 3x3, but you have to remember that the general pipeline is designed so that the matrix routines are optimized for 4x4, and we just use library code to do general compositions of 4x4 matrices.

It means that when we design simulation engines with hierarchical coordinate systems, plus everything that goes with them (collision detection, physics calculations, etc.), it makes sense to use a highly optimized library that can do things very fast on 4x4 matrices, and that is what we do.
 
jenny_shoars said:
what I don't understand is why these mappings are equivalent. Not only why is this example equivalent, but also, what is the general way to see/determine that two mappings are equivalent?

I misused the word "equivalent"! I regard "mapping" as a synonym for "function" and (in the usual way of defining "equivalent" for functions), the two mappings (Ax and Bw, in my example) are not equivalent because they have different domains.

The words that your example uses are that "the above equation can be reduced to". There is a sense in which the two mappings are "the same", but I shouldn't have claimed that they are "equivalent" in the normal sense.

That's why we need some person steeped in abstraction to explain this. (Chiro, like myself, is a practical person. He didn't show us any commutative diagrams.)

Do you already know the abstract side of linear algebra? Do you know stuff like:

When you have a linear mapping between vector spaces, y = Ax, it has a "kernel", which is the set of vectors in the domain that are mapped to zero. One can prove that the kernel of the mapping is a vector subspace of the domain. If y = Ax is a mapping from a space of higher dimension to a space of lower dimension, one can prove that the dimension of the kernel is equal to the number of dimensions that are "lost" when we do the mapping.
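A small numeric illustration of that (a Python/NumPy sketch with a concrete 2x3 matrix):

```python
import numpy as np

# A maps R^3 to R^2; rank-nullity says dim(kernel) + rank(A) = 3.
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])

rank = int(np.linalg.matrix_rank(A))
print(rank, 3 - rank)               # rank 2, so a 1-dimensional kernel

# One kernel vector: (1, 1, -1) is mapped to zero.
k = np.array([1.0, 1.0, -1.0])
print(A @ k)                        # the zero vector
```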
 
