Create Row-Orthonormal Matrix | m > n


Discussion Overview

The discussion revolves around the challenge of creating an approximate row-orthonormal matrix \( A_{m \times n} \) where the number of rows \( m \) exceeds the number of columns \( n \). Participants explore the mathematical implications of this setup, particularly focusing on the product \( A_{m \times n} A^T_{n \times m} \) and its relationship to the identity matrix \( I_{m \times m} \). The conversation includes technical reasoning, proposed methods, and clarifications regarding definitions and properties of matrices.

Discussion Character

  • Technical explanation
  • Debate/contested
  • Mathematical reasoning

Main Points Raised

  • One participant seeks to create a row-orthonormal matrix \( A_{m \times n} \) such that \( A_{m \times n} A^T_{n \times m} = I_{m \times m} \), but questions arise about the feasibility of this due to linear dependence when \( m > n \).
  • Another participant argues that the desired product cannot equal \( I_{m \times m} \): the characteristic polynomial of \( A^T A \) has degree \( n < m \), so it cannot have the eigenvalue 1 with multiplicity \( m \).
  • Some participants clarify the distinction between eigenvectors and the matrices of eigenvectors produced by numerical libraries like LAPACK.
  • There is a suggestion to make \( A_{m \times n} A^T_{n \times m} \) approximately equal to \( I_{m \times m} \), using the Frobenius norm of the error matrix \( E \) (whose square is \( \operatorname{tr}(E^T E) \)) as the criterion for approximation quality.
  • One participant proposes generating a random matrix and using QR factorization to obtain orthonormal vectors, suggesting a method for constructing the desired matrix.
  • Another participant expresses concern about the complexity of the problem, indicating that the requirements have evolved significantly from the original question.

Areas of Agreement / Disagreement

Participants generally agree that an exact solution for \( A_{m \times n} A^T_{n \times m} = I_{m \times m} \) does not exist when \( m > n \) due to linear dependence. However, there is no consensus on the best method for approximating this product or the criteria for what constitutes a "good" approximation.

Contextual Notes

Participants note that the discussion involves various assumptions about matrix properties and the implications of using numerical methods. The conversation reflects differing interpretations of matrix definitions and the potential for confusion in terminology.

buupq
Hello,

I'm looking for a way to create an approximate row-orthonormal matrix with the number of rows ##(m)## greater than the number of columns ##(n)##; i.e., finding ##A_{m\times n}## so that ##A_{m\times n} A^T_{n\times m} = I_{m\times m}##. I used singular value decomposition (e.g., DGESVD in the MKL math library), but what I actually got was an orthonormal square eigenvector matrix.
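
For reference, here is a minimal NumPy sketch of the shapes involved (standing in for the actual DGESVD call; the sizes are placeholders):

```python
import numpy as np

m, n = 6, 4                     # more rows than columns
A = np.random.rand(m, n)

# full_matrices=True mirrors DGESVD's default: U is m x m, Vt is n x n
U, s, Vt = np.linalg.svd(A, full_matrices=True)

print(U.shape, Vt.shape)                    # (6, 6) (4, 4): both square
print(np.allclose(U @ U.T, np.eye(m)))      # True: U is orthogonal, but square
```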

Thank you!
 
buupq said:
but what I actually got was an orthonormal square eigenvector
How can an eigenvector be square (i.e., a square matrix)?

An eigenvector is a nonzero vector x such that Ax = λx for some eigenvalue λ. Here x is a vector, an n x 1 matrix.
 
Mark44 said:
How can an eigenvector be square (i.e., a square matrix)?

An eigenvector is a nonzero vector x such that Ax = λx for some eigenvalue λ. Here x is a vector, an n x 1 matrix.

We have a set of eigenvalues, each corresponding to an eigenvector. What I meant was the collection of eigenvectors: if you use LAPACK or some math library, the output is usually a matrix of eigenvectors and a matrix (or vector) of eigenvalues, not just a single one, unless you request the output of only one.
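
For example (a NumPy sketch; the LAPACK interfaces differ in detail but return the same kind of output):

```python
import numpy as np

M = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eig returns a vector of all eigenvalues and a matrix whose columns
# are the corresponding eigenvectors -- not a single vector
w, V = np.linalg.eig(M)
print(w)          # e.g. [3. 1.]
print(V[:, 0])    # the eigenvector paired with w[0]
```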
 
There seems to be some linguistic confusion in this thread.

buupq said:
with the number of rows (m) > the number of columns (n); i.e., finding A(mxn) so that A(mxn) . A^T(nxm) = I(mxm).

Obviously this is impossible: the ##\mathbf I_m## you are hoping will result from your product cannot exist.

Note that ##\big(\mathbf A \mathbf A^T\big)## and ##\big(\mathbf A^T \mathbf A\big)## have the same non-zero eigenvalues with the same algebraic multiplicities. If what you're asking for existed, then ##\big(\mathbf A^T \mathbf A\big)## would have the eigenvalue 1 with algebraic multiplicity ##m##, which is a contradiction because its characteristic polynomial is a monic polynomial of degree ##n \lt m##.
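
A quick numeric illustration of this (a NumPy sketch, not a proof): since ##\mathbf A \mathbf A^T## has rank at most ##n##, at least ##m-n## of its eigenvalues must be zero.

```python
import numpy as np

m, n = 6, 4
A = np.random.rand(m, n)

# A A^T is m x m but has rank at most n, so at least m - n eigenvalues vanish
eigs = np.linalg.eigvalsh(A @ A.T)          # symmetric matrix: eigvalsh applies
print(np.sum(np.isclose(eigs, 0.0)))        # >= m - n = 2 (up to roundoff)
```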

- - - -

Other approaches to proving this fact can be found in this thread
https://www.physicsforums.com/threads/invertibility-of-a-matrix.917120/

or even by using the Cauchy–Binet formula.
 
Yes, you are right. A matrix with ##A_{m\times n} A^T_{n\times m} = I_{m\times m}## does not exist if the number of rows exceeds the number of columns ##(m > n)##, due to linear dependence. Please see the section on rectangular matrices in the Wikipedia page: https://en.wikipedia.org/wiki/Orthogonal_matrix.

Yeah, my question is a bit confusing. I should have asked whether there is any way to find ##A_{m\times n}## with ##m > n## so that (approximately) $$A_{m\times n} A^T_{n\times m} \approx I_{m\times m}$$
Note: there might be some differences in notation between my field of study and mathematics: ##A_{m\times n}## is an m-by-n matrix (##m## rows and ##n## columns); ##I_{m\times m}## is the m-by-m identity matrix.
 
buupq said:
Yeah, my question is a bit confusing. I should have asked whether there is any way to find ##A_{m\times n}## with ##m > n## so that (approximately) $$A_{m\times n} A^T_{n\times m} \approx I_{m\times m}$$

I can easily find

##
A_{m\times n} A^T_{n\times m} =
\begin{bmatrix}
\mathbf I_n & \mathbf 0\\
\mathbf 0 & \mathbf 0
\end{bmatrix}##

Is that a 'good' approximation? As stated, you cannot approximate the rank of ##\mathbf I_m##. What, then, are you trying to approximate? And what criterion are you using to score a given result as a 'good' or 'bad' approximation? (Typically a metric...)
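
For instance, if the metric is the Frobenius norm, that block matrix sits at a fixed distance from ##\mathbf I_m## no matter how you choose ##A##:

$$\left\lVert \begin{bmatrix} \mathbf I_n & \mathbf 0\\ \mathbf 0 & \mathbf 0 \end{bmatrix} - \mathbf I_m \right\rVert_F = \left\lVert \begin{bmatrix} \mathbf 0 & \mathbf 0\\ \mathbf 0 & -\mathbf I_{m-n} \end{bmatrix} \right\rVert_F = \sqrt{m-n}$$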

In general coming up with good approximations takes considerable sophistication. It feels like you may need to go back to the drawing board on this.
 
The approximation you gave is the output that I got from the singular value decomposition. What I am trying to do is make the product ##A_{m\times n} A^T_{n\times m}## as close as possible to the identity matrix ##I_m##, but with no columns/rows containing only zero values. The criterion can be the Frobenius norm of the error (whose square is the trace of ##E^T E##), as in the procedure for finding the nearest orthonormal matrix: http://people.csail.mit.edu/bkph/articles/Nearest_Orthonormal_Matrix.pdf.
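
In code, that criterion might look like the following (a sketch; `frobenius_gap` is just an illustrative name, not something from the paper):

```python
import numpy as np

def frobenius_gap(A):
    """Frobenius-norm distance of A A^T from the identity (illustrative helper)."""
    m = A.shape[0]
    E = A @ A.T - np.eye(m)
    # ||E||_F^2 = trace(E^T E); np.linalg.norm(E, 'fro') computes the same value
    return np.sqrt(np.trace(E.T @ E))
```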
 
buupq said:
The approximation you gave is the output that I got from the singular value decomposition. What I am trying to do is make the product ##A_{m\times n} A^T_{n\times m}## as close as possible to the identity matrix ##I_m##, but with no columns/rows containing only zero values. The criterion can be the Frobenius norm of the error (whose square is the trace of ##E^T E##), as in the procedure for finding the nearest orthonormal matrix: http://people.csail.mit.edu/bkph/articles/Nearest_Orthonormal_Matrix.pdf.

This is a huge amount of mission creep. It's a bit like telling someone you have a Linear Program to solve and then much later mentioning that you actually need everything to be integer valued -- a very different problem.

- - - -
I have a hunch that you want something like this

##A_{m\times n} A^T_{n\times m} =
\begin{bmatrix}
\mathbf I_{n-1} & \mathbf 0\\
\mathbf 0 & \mathbf 0
\end{bmatrix} + \epsilon \mathbf {11}^T
##

for sufficiently small ##\epsilon##. Of course, you can always choose a smaller ##\epsilon## mathematically, though if doing this on a computer you don't want it to be too small (for underflow reasons). Supposing you have a tolerance for a 'good enough' solution, this should fit.

The idea is that you'd have ##n-1## mutually orthonormal vectors, each of which is orthogonal to the ones vector. Then add ##\sqrt{\frac{\epsilon}{n}}\,\mathbf 1## to each of those vectors.
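
Numerically, the target matrix and its distance from ##\mathbf I_m## can be checked directly (a NumPy sketch with placeholder values for ##m##, ##n##, ##\epsilon##):

```python
import numpy as np

m, n, eps = 6, 4, 1e-3          # placeholder sizes and tolerance

# diag(I_{n-1}, 0) + eps * ones ones^T, the proposed target for A A^T
target = np.zeros((m, m))
target[:n-1, :n-1] = np.eye(n - 1)
target += eps

# Frobenius distance from I_m, to compare against a 'good enough' tolerance
print(np.linalg.norm(target - np.eye(m), 'fro'))
```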
 
And for the avoidance of doubt:

## \mathbf 1 = \begin{bmatrix}
1\\
1\\
\vdots\\
1
\end{bmatrix}##
 
Thanks very much, StoneTemplePython! I'll try this.
 
Now the question remains -- how would you get this?

I'd suggest generating a random ##n \times n## matrix (technically there are workarounds, but it's fast, and there is a nice interpretation of linear independence, determinants, and geometry underlying this). Replace the first column of this matrix with the ones vector. Call this matrix ##\mathbf B##. Now run a QR factorization: ##\mathbf {QR} = \mathbf B##. Collect the ##n-1## rightmost columns of ##\mathbf Q## -- i.e., all columns except the leftmost one, which will be a rescaled version of the ones vector -- the other columns are mutually orthonormal and, of course, orthogonal to this first vector, the ones vector. Place the collection together with ##m-n+1## more ##\mathbf 0## vectors in a matrix -- call it ##\mathbf A^T##.
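
A NumPy sketch of those steps (the sizes are placeholders, and np.linalg.qr plays the role of the QR factorization):

```python
import numpy as np

m, n, eps = 6, 4, 1e-3

# 1) random n x n matrix, with the ones vector forced into the first column
B = np.random.rand(n, n)
B[:, 0] = 1.0

# 2) QR = B, so Q's first column is a rescaled ones vector
Q, _ = np.linalg.qr(B)

# 3) keep the other n-1 columns: orthonormal and orthogonal to the ones vector
V = Q[:, 1:]                                  # n x (n-1)

# 4) stack as rows, pad with m-n+1 zero rows: this is A (m x n)
A = np.vstack([V.T, np.zeros((m - n + 1, n))])
print(np.allclose(A @ A.T,
                  np.diag(np.r_[np.ones(n - 1), np.zeros(m - n + 1)])))  # True

# 5) add sqrt(eps/n) * ones to every row
A_eps = A + np.sqrt(eps / n)
print(np.round(A_eps @ A_eps.T, 4))           # diag(I_{n-1}, 0) + eps * ones
```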

There's probably some fine-tuning needed with indexing or scaling, but the big idea is that you have something like

##
A_{m\times n}\big(A_{m\times n}\big)^T =
\begin{bmatrix}
\mathbf I_{n-1} & \mathbf 0\\
\mathbf 0 & \mathbf 0
\end{bmatrix} ##

and

##A_{m\times n} \mathbf 1_n\mathbf 1_m^T = \big(A_{m\times n} \mathbf 1_n\big)\mathbf 1_m^T = \mathbf 0_m\mathbf 1_m^T= \mathbf 0_m \mathbf 0_m^T##

ditto for the transpose

so

$$\begin{aligned}
\big(A_{m\times n} + \mathbf 1_m\mathbf 1_n^T\big)\big(A_{m\times n} + \mathbf 1_m\mathbf 1_n^T\big)^T
&= A_{m\times n}A_{m\times n}^T + \big(A_{m\times n}\mathbf 1_n\mathbf 1_m^T\big)^T + A_{m\times n}\mathbf 1_n\mathbf 1_m^T + \big(\mathbf 1_m\mathbf 1_n^T\big)\big(\mathbf 1_m\mathbf 1_n^T\big)^T \\
&= A_{m\times n}A_{m\times n}^T + \mathbf 0_m \mathbf 0_m^T + \mathbf 0_m \mathbf 0_m^T + \big(\mathbf 1_m\mathbf 1_n^T\big)\big(\mathbf 1_m\mathbf 1_n^T\big)^T \\
&= A_{m\times n}A_{m\times n}^T + n\, \mathbf 1_m\mathbf 1_m^T
\end{aligned}$$

Now rescale the ones vector appropriately to fine-tune the resulting matrix.

- - - -
Note: to the extent you are OK with using complex numbers (they use twice as many bits, I know, and there could be numeric type issues; plus you'd need to be sure to use the conjugate transpose rather than just the transpose)...

then you actually already have this work done for you in the ##n \times n## Discrete Fourier transform matrix: you'd select all columns in it that aren't the (scaled) ones vector.
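
A sketch of that approach (hand-rolling the unitary DFT matrix; a library routine like scipy.linalg.dft would serve the same purpose):

```python
import numpy as np

m, n = 6, 4

# unitary n x n DFT matrix: F[j, k] = exp(-2*pi*i*j*k/n) / sqrt(n)
jk = np.outer(np.arange(n), np.arange(n))
F = np.exp(-2j * np.pi * jk / n) / np.sqrt(n)

# drop column 0 (the scaled ones vector); the rest are orthonormal
# and each is orthogonal to the ones vector
V = F[:, 1:]

# conjugate-transpose into rows, then pad with zero rows up to m
A = np.vstack([V.conj().T, np.zeros((m - n + 1, n))])

# rows are orthonormal under the conjugate transpose: A A^H = diag(I_{n-1}, 0)
print(np.allclose(A @ A.conj().T,
                  np.diag(np.r_[np.ones(n - 1), np.zeros(m - n + 1)])))
```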
 
