Definition of Orthogonal Matrix: Case 1 or 2?

  • Context: Undergrad
  • Thread starter: sjeddie
  • Tags: Matrix, Orthogonal

Discussion Overview

The discussion centers on the definition of an orthogonal matrix, specifically whether it is defined as having all rows and columns orthonormal (case 1) or if it suffices for either rows or columns to be orthonormal (case 2). Participants explore implications of these definitions in the context of diagonalizable matrices and the properties of eigenvectors.

Discussion Character

  • Debate/contested
  • Technical explanation
  • Conceptual clarification

Main Points Raised

  • One participant questions whether an orthogonal matrix requires both rows and columns to be orthonormal or if it is sufficient for just one of them to be orthonormal.
  • Another participant asserts that the rows of a square matrix are orthonormal if and only if the columns are orthonormal, providing a mathematical argument involving the condition A^T A = I.
  • A further explanation is provided that if A^T A = I, then A is injective, leading to the conclusion that A A^T = I, which implies the rows are also orthonormal.
  • One participant mentions an alternative algebraic argument for injectivity, suggesting that different approaches can be used to understand the properties of orthogonal matrices.
  • A later reply expresses appreciation for the clarification regarding the relationship between A^T A = I and the properties of orthogonal matrices.

Areas of Agreement / Disagreement

Participants initially consider the two definitions as genuinely different, but the argument that, for a square matrix, A^T A = I implies A A^T = I shows the two conditions are equivalent: if either the rows or the columns of a square matrix are orthonormal, so is the other set. The original poster accepts this resolution, so for square matrices the apparent choice between the definitions dissolves.

Contextual Notes

The discussion includes assumptions about finite-dimensional spaces, with a note that the behavior may differ in infinite-dimensional contexts, which is acknowledged but not resolved.

sjeddie
Is the definition of an orthogonal matrix:

1. a matrix where all rows are orthonormal AND all columns are orthonormal

OR

2. a matrix where all rows are orthonormal OR all columns are orthonormal?

My textbook says it is AND (case 1), but if that is true, there's a problem:
Say we have a square matrix A, and we find that its eigenvalues are all distinct, so A is diagonalizable. We put the normalized eigenvectors of A as the columns of a matrix P, and (our prof told us) P becomes orthogonal and P^-1 = P^T. My question is: how did P become orthogonal straight away? By only normalizing its columns, how did we guarantee that its rows are also orthonormal?
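The professor's claim can be checked numerically. A minimal NumPy sketch, assuming (as the construction implicitly requires) that A is symmetric, so that its eigenvectors can be chosen orthonormal; the matrix A below is a made-up example:

```python
import numpy as np

# A symmetric example matrix; symmetry is what guarantees that the
# eigenvectors can be chosen orthonormal in the first place.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

# eigh diagonalizes a symmetric matrix and returns the orthonormal
# eigenvectors as the columns of P.
eigvals, P = np.linalg.eigh(A)

# The columns of P are orthonormal by construction (P^T P = I) ...
assert np.allclose(P.T @ P, np.eye(2))
# ... and the rows turn out orthonormal too (P P^T = I), so P is orthogonal
assert np.allclose(P @ P.T, np.eye(2))
# ... which is exactly the statement P^-1 = P^T.
assert np.allclose(P.T, np.linalg.inv(P))
```

The check that the rows come out orthonormal "for free" is precisely the question being asked; the replies below explain why it always happens for a square matrix.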
 
It turns out that the rows of a square matrix are orthonormal if and only if the columns are orthonormal. Another way to express the condition that all columns are orthonormal is [tex]A^T A = I[/tex] (think about why this is). Then we see that for [tex]x \in \mathbb{R}^n[/tex], [tex]\parallel x \parallel^2 = x^T x = x^T ( A^T A ) x = ( A x )^T ( A x ) = \parallel A x \parallel^2[/tex], and therefore A is injective. Since we are working with finite-dimensional spaces, A must also be surjective, so for [tex]v \in \mathbb{R}^n[/tex] there exists [tex]w \in \mathbb{R}^n[/tex] with v = Aw, and therefore [tex]A A^T v = A A^T A w = A w = v[/tex], so [tex]A A^T = I[/tex] as well. You can check that this implies the rows of A are orthonormal. The proof of the converse is similar.
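The equivalence can be sanity-checked numerically. A small NumPy sketch using a QR factorization, which produces a square matrix Q whose columns are orthonormal by construction:

```python
import numpy as np

# Build a square matrix with orthonormal columns by QR-factoring a
# random matrix; Q's columns are orthonormal by construction.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((4, 4)))

# Columns orthonormal: Q^T Q = I ...
assert np.allclose(Q.T @ Q, np.eye(4))
# ... and, because Q is square, the rows come for free: Q Q^T = I.
assert np.allclose(Q @ Q.T, np.eye(4))
```

This is only a numerical illustration of the argument above, not a proof; the proof is the injectivity/surjectivity reasoning just given.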

Note that this argument relies on the finite-dimensionality of our vector space. If you move up to infinite-dimensional spaces, there may be operators T with [tex]T^*T = I[/tex] but [tex]T T^* \neq I[/tex]. This type of behavior is what makes functional analysis and operator algebras fun! :smile:
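A square matrix can never exhibit this failure, but a tall rectangular matrix gives a finite "shadow" of the infinite-dimensional phenomenon (the matrix below is a made-up illustration, analogous to a truncated unilateral shift):

```python
import numpy as np

# A tall matrix with orthonormal columns: it embeds R^2 into R^3
# via (x, y) -> (x, y, 0).
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])

# A^T A = I (columns orthonormal), so A preserves lengths and is injective...
assert np.allclose(A.T @ A, np.eye(2))
# ...but A A^T is only the projection onto the first two coordinates,
# not the identity: the rows are NOT orthonormal.
assert not np.allclose(A @ A.T, np.eye(3))
```

The surjectivity step in the square-matrix argument is exactly what fails here: A is injective but not onto, just like the shift operator in infinite dimensions.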
 
There's actually an easier way to see that [tex]A^T A = I[/tex] implies A is injective; I just tend to think in terms of isometries, as I wrote above. If v is such that Av = 0, then [tex]0 = A^T 0 = A^T A v = v[/tex], so A is injective. Some may prefer this purely algebraic argument.
 
Ah I see, thank you rochfor1, the (A^T)(A) = I thing makes a lot of sense :)
 