MHB Can we simplify the relations of condition number for orthogonal matrices?

AI Thread Summary
The discussion focuses on proving relationships involving the condition number of orthogonal matrices and their properties. Key points include the equality of condition numbers when scaling a matrix by a non-zero scalar, the fact that the condition number of an orthogonal matrix is 1, and the preservation of condition numbers under multiplication by orthogonal matrices. Participants clarify the need for submultiplicativity of matrix norms and explore the inequalities relating different types of condition numbers, such as the 2-norm, Frobenius norm, and infinity norm. The conversation emphasizes the properties of orthogonal matrices and their implications for condition numbers, leading to further inquiries about proving specific inequalities.
mathmari
Gold Member
MHB
Messages
4,984
Reaction score
7
Hey! :o

I want to prove the following relations of condition number:
  1. $\operatorname{cond}(\alpha A)=\operatorname{cond}(A)$. The matrixnorm is submultiplicativ.
  2. $\operatorname{cond}_2(U)=1$ if $U$ is an orthogonal matrix.
  3. $\operatorname{cond}_2(UA)=\operatorname{cond}_2(A)$, $U$ is orthogonal.
  4. $\operatorname{cond}_2(A)\leq \operatorname{cond}_F(A)\leq n\operatorname{cond}_{\infty}(A)$.
I have done the following :

  1. I have proven this equality, but I wanted to ask why it is given in this case that the matrix norm is submultiplicativ, we don't need this here, do we?
  2. Since $U$ is an orthogonal matrix, we have that $U^{-1}=U^T$.

    So, we get $\operatorname{cond}_2( U)=\|U\|_2\,\| U^{-1} \|_2=\| U\|_2\,\| U^T \|_2$. How could we continue?
  3. We have that:

    $\operatorname{cond}_2(UA)=\|UA\|\|(UA)^{-1}\|=\|UA\|\|A^{-1}U^{-1}\|=\|UA\|\|A^{-1}U^T\|$

    We have to use here that $\operatorname{cond}_2(U)=1$, right? But how exactly?
  4. Could you give me a hint how we could show these inequalities?

(Wondering)
 
Mathematics news on Phys.org
mathmari said:
Hey! :o[*] $\operatorname{cond}(\alpha A)=\operatorname{cond}(A)$. The matrixnorm is submultiplicativ.

I have proven this equality, but I wanted to ask why it is given in this case that the matrix norm is submultiplicativ, we don't need this here, do we?

Hey mathmari! (Smile)

I agree that we do not need the norm to be submultiplicative.
Btw, we do need that $\alpha\ne 0$ don't we?

mathmari said:
[*] $\operatorname{cond}_2(U)=1$ if $U$ is an orthogonal matrix.

Since $U$ is an orthogonal matrix, we have that $U^{-1}=U^T$.

So, we get $\operatorname{cond}_2( U)=\|U\|_2\,\| U^{-1} \|_2=\| U\|_2\,\| U^T \|_2$. How could we continue?

This is specifically about the matrix 2-norm, which is defined as $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$ where $\|x\|_2$ is the standard Euclidean norm.
What does that mean for the norm if the matrix is orthogonal? (Wondering)

mathmari said:
[*] $\operatorname{cond}_2(UA)=\operatorname{cond}_2(A)$, $U$ is orthogonal.

We have that:

$\operatorname{cond}_2(UA)=\|UA\|\|(UA)^{-1}\|=\|UA\|\|A^{-1}U^{-1}\|=\|UA\|\|A^{-1}U^T\|$

We have to use here that $\operatorname{cond}_2(U)=1$, right? But how exactly?

Same thing. I believe we have to use the definition of the matrix 2-norm and a property of an orthogonal matrix. (Thinking)

mathmari said:
[*] $\operatorname{cond}_2(A)\leq \operatorname{cond}_F(A)\leq n\operatorname{cond}_{\infty}(A)$.

Could you give me a hint how we could show these inequalities?

How about starting with the definitions of those norms and their properties? (Wondering)
 
I like Serena said:
I agree that we do not need the norm to be submulitplicative.
Btw, we do need that $\alpha\ne 0$ don't we?

Yes, this is given that $\alpha\in \mathbb{K}\setminus\{0\}$. (Yes)
I like Serena said:
This is specifically about the matrix 2-norm, which is defined as $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$ where $\|x\|_2$ is the standard Euclidean norm.
What does that mean for the norm if the matrix is orthogonal? (Wondering)

We have that $\|Ax\|_2=\sqrt{(Ax,Ax)}=(Ax)^T(Ax)=A^Tx^TAx=A^TAx^Tx=A^{-1}Ax^Tx=x^Tx=(x,x)=\|x\|_2$.

So, we get that $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|x\|_2}{\|x\|_2}=\sup_{x\ne 0}1=1$.

We also have that $\|A^Tx\|_2=\sqrt{(A^Tx,A^Tx)}=(A^T)^T(A^Tx)=Ax^TA^Tx=AA^Tx^Tx=AA^{-1}x^Tx=x^Tx=(x,x)=\|x\|_2$.

So, we get that $\|A^T\|_2=\sup_{x\ne 0}\frac{\|A^Tx\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|x\|_2}{\|x\|_2}=\sup_{x\ne 0}1=1$. So, we get $\operatorname{cond}_2( U)=\|U\|_2\,\| U^{-1} \|_2=\| U\|_2\,\| U^T \|_2=1$.
Is evertything correct? (Wondering)
I like Serena said:
Same thing. I believe we have to use the definition of the matrix 2-norm and a property of an orthogonal matrix. (Thinking)

We have that $\|UA\|_2=\sup_{x\ne 0}\frac{\|(UA)x\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|U(Ax)\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}=\|A\|_2$ and $\|A^{-1}U^{T}\|_2=\sup_{x\ne 0}\frac{\|(A^{-1}U^T)x\|_2}{\|x\|_2}$

Does it hold that $A^{-1}U^T=U^TA^{-1}$ ? (Wondering)
I like Serena said:
How about starting with the definitions of those norms and their properties? (Wondering)

We have that $\operatorname{cond}_2(A)=\|A\|_2\,\|A^{-1}\|_2$, $\operatorname{cond}_F(A)=\|A\|_F\,\|A^{-1}\|_F$, $ \operatorname{cond}_{\infty}(A)=\|A\|_{\infty}\,\|A^{-1}\|_{\infty}$.

We have that $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$, $\|A\|_F=\sup_{x\ne 0}\frac{\|Ax\|_F}{\|x\|_F}$, $\|A\|_{\infty}=\sup_{x\ne 0}\frac{\|Ax\|_{\infty}}{\|x\|_{\infty}}$.

Since $\displaystyle{(Ax)_i=\sum_{j=1}^na_{i,j}x_j}$, we get $\displaystyle{\|Ax\|_2^2=\sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}x_j|\right )^2=\sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2}$
By the Cauchy-Schwarz inequality we get $\displaystyle{\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2\leq \left (\sum_{j=1}^n{|a_{i,j}|^2}\right )\left (\sum_{j=1}^n|x_j|^2\right )=\left (\sum_{j=1}^n{|a_{i,j}|^2}\right )\|x\|_2^2}$
So, we get \begin{equation*}\|Ax\|_2^2=\sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2\leq \sum_{i=1}^n\sum_{j=1}^n|a_{i,j}|^2\|x\|_2^2=\|A\|_F^2\,\|x\|_2^2 \end{equation*}

So, we have that $\|Ax\|_2\leq \|A\|_F\,\|x\|_2\Rightarrow \frac{\|Ax\|_2}{\|x\|_2}\leq \|A\|_F\Rightarrow \sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}\leq \|A\|_F$, or not? (Wondering)

Since $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$ it folows that $\|A\|_2\leq \|A\|_F$.

Equivalently, we get that $\|A^{-1}\|_2\leq \|A^{-1}\|_F$

So we get $\operatorname{cond}_2(A)=\|A\|_2\,\|A^{-1}\|_2\leq \|A\|_F\,\|A^{-1}\|_F=\operatorname{cond}_F(A)$. Is everything correct? (Wondering)

How do we get at the inequality $\operatorname{cond}_F(A)\leq n\operatorname{cond}_{\infty}(A)$ the term $n$ ? (Wondering)
 
mathmari said:
We have that $\|Ax\|_2=\sqrt{(Ax,Ax)}=(Ax)^T(Ax)=A^Tx^TAx=A^TAx^Tx=A^{-1}Ax^Tx=x^Tx=(x,x)=\|x\|_2$.

I'm afraid that $(Ax)^T$ is not equal to $A^Tx^T$, nor can we swap $x^TA$ around. (Worried)

Instead we have $(Ax)^T=x^TA^T$ after which we don't have to swap any more. (Happy)

mathmari said:
So, we get that $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|x\|_2}{\|x\|_2}=\sup_{x\ne 0}1=1$.

We also have that $\|A^Tx\|_2=\sqrt{(A^Tx,A^Tx)}=(A^T)^T(A^Tx)=Ax^TA^Tx=AA^Tx^Tx=AA^{-1}x^Tx=x^Tx=(x,x)=\|x\|_2$.

So, we get that $\|A^T\|_2=\sup_{x\ne 0}\frac{\|A^Tx\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|x\|_2}{\|x\|_2}=\sup_{x\ne 0}1=1$. So, we get $\operatorname{cond}_2( U)=\|U\|_2\,\| U^{-1} \|_2=\| U\|_2\,\| U^T \|_2=1$.

Is evertything correct? (Wondering)

Well... there's a typo in the word 'evertything'... (Worried)
And there's another transpose-swap problem for $\|A^Tx\|_2$.
Other than that it's correct!

mathmari said:
We have that $\|UA\|_2=\sup_{x\ne 0}\frac{\|(UA)x\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|U(Ax)\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}=\|A\|_2$ and $\|A^{-1}U^{T}\|_2=\sup_{x\ne 0}\frac{\|(A^{-1}U^T)x\|_2}{\|x\|_2}$

Does it hold that $A^{-1}U^T=U^TA^{-1}$ ?

Nope.
Can't we write instead that $\|A^{-1}U^{-1}\|_2=\sup_{x\ne 0}\frac{\|(A^{-1}U^{-1})x\|_2}{\|x\|_2}
=\sup_{x\ne 0}\frac{\|A^{-1}(U^{-1}x)\|_2}{\|U^{-1}x\|_2}
=\sup_{y\ne 0}\frac{\|A^{-1}y\|_2}{\|y\|_2}
$? (Wondering)

mathmari said:
We have that $\operatorname{cond}_2(A)=\|A\|_2\,\|A^{-1}\|_2$, $\operatorname{cond}_F(A)=\|A\|_F\,\|A^{-1}\|_F$, $ \operatorname{cond}_{\infty}(A)=\|A\|_{\infty}\,\|A^{-1}\|_{\infty}$.

We have that $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$, $\|A\|_F=\sup_{x\ne 0}\frac{\|Ax\|_F}{\|x\|_F}$, $\|A\|_{\infty}=\sup_{x\ne 0}\frac{\|Ax\|_{\infty}}{\|x\|_{\infty}}$.

Since $\displaystyle{(Ax)_i=\sum_{j=1}^na_{i,j}x_j}$, we get $\displaystyle{\|Ax\|_2^2=\sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}x_j|\right )^2=\sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2}$
By the Cauchy-Schwarz inequality we get $\displaystyle{\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2\leq \left (\sum_{j=1}^n{|a_{i,j}|^2}\right )\left (\sum_{j=1}^n|x_j|^2\right )=\left (\sum_{j=1}^n{|a_{i,j}|^2}\right )\|x\|_2^2}$
So, we get \begin{equation*}\|Ax\|_2^2=\sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2\leq \sum_{i=1}^n\sum_{j=1}^n|a_{i,j}|^2\|x\|_2^2=\|A\|_F^2\,\|x\|_2^2 \end{equation*}

So, we have that $\|Ax\|_2\leq \|A\|_F\,\|x\|_2\Rightarrow \frac{\|Ax\|_2}{\|x\|_2}\leq \|A\|_F\Rightarrow \sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}\leq \|A\|_F$, or not? (Wondering)

Since $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$ it folows that $\|A\|_2\leq \|A\|_F$.

Equivalently, we get that $\|A^{-1}\|_2\leq \|A^{-1}\|_F$

So we get $\operatorname{cond}_2(A)=\|A\|_2\,\|A^{-1}\|_2\leq \|A\|_F\,\|A^{-1}\|_F=\operatorname{cond}_F(A)$. Is everything correct?

Yep. Looks good.

mathmari said:
How do we get at the inequality $\operatorname{cond}_F(A)\leq n\operatorname{cond}_{\infty}(A)$ the term $n$ ?

I'm not sure yet how to get the inequality, but that $n$ should be due to the fact that the Frobenius norm is the squared sum of all elements in all rows, while the maximum norm is represented by only 1 row. (Thinking)
 
I like Serena said:
I'm afraid that $(Ax)^T$ is not equal to $A^Tx^T$, nor can we swap $x^TA$ around. (Worried)

Instead we have $(Ax)^T=x^TA^T$ after which we don't have to swap any more. (Happy)

So, we have the following:
\begin{equation*}\|Ax\|_2=\sqrt{(Ax,Ax)}=(Ax)^T(Ax)=x^TA^TAx=x^TA^{-1}Ax=x^Tx=(x,x)=\|x\|_2\end{equation*}

(Nerd)
I like Serena said:
And there's another transpose-swap problem for $\|A^Tx\|_2$.
Other than that it's correct!

We have that \begin{equation*}\|A^Tx\|_2=\sqrt{(A^Tx,A^Tx)}=(A^Tx)^T(A^Tx)=x^TAA^Tx=x^TAA^{-1}x=x^Tx=(x,x)=\|x\|_2\end{equation*}

(Nerd)
I like Serena said:
Nope.
Can't we write instead that $\|A^{-1}U^{-1}\|_2=\sup_{x\ne 0}\frac{\|(A^{-1}U^{-1})x\|_2}{\|x\|_2}
=\sup_{x\ne 0}\frac{\|A^{-1}(U^{-1}x)\|_2}{\|U^{-1}x\|_2}
=\sup_{y\ne 0}\frac{\|A^{-1}y\|_2}{\|y\|_2}
$? (Wondering)

How do we get the equality $\sup_{x\ne 0}\frac{\|(A^{-1}U^{-1})x\|_2}{\|x\|_2}
=\sup_{x\ne 0}\frac{\|A^{-1}(U^{-1}x)\|_2}{\|U^{-1}x\|_2}$ ? (Wondering)
I like Serena said:
I'm not sure yet how to get the inequality, but that $n$ should be due to the fact that the Frobenius norm is the squared sum of all elements in all rows, while the maximum norm is represented by only 1 row. (Thinking)

Does it maybe hold the following?
\begin{equation*}\|A\|_F=\sqrt{\sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}|^2\right )}\leq \sqrt{n\cdot \max_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}|^2\right )}=\sqrt{n\cdot \|A\|_{\infty}}\end{equation*}

(Wondering)
 
mathmari said:
So, we have the following:
\begin{equation*}\|Ax\|_2=\sqrt{(Ax,Ax)}=(Ax)^T(Ax)=x^TA^TAx=x^TA^{-1}Ax=x^Tx=(x,x)=\|x\|_2\end{equation*}

(Nerd)

We have that \begin{equation*}\|A^Tx\|_2=\sqrt{(A^Tx,A^Tx)}=(A^Tx)^T(A^Tx)=x^TAA^Tx=x^TAA^{-1}x=x^Tx=(x,x)=\|x\|_2\end{equation*}

(Nerd)

Yep. (Nod)

mathmari said:
How do we get the equality $\sup_{x\ne 0}\frac{\|(A^{-1}U^{-1})x\|_2}{\|x\|_2}
=\sup_{x\ne 0}\frac{\|A^{-1}(U^{-1}x)\|_2}{\|U^{-1}x\|_2}$ ?

Since $U$ is an orthogonal matrix, $U^{-1}$ is also orthogonal isn't it?
And didn't we just find that $\|Ux\|=\|x\|$? (Wondering)

mathmari said:
Does it maybe hold the following?
\begin{equation*}\|A\|_F=\sqrt{\sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}|^2\right )}\leq \sqrt{n\cdot \max_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}|^2\right )}=\sqrt{n\cdot \|A\|_{\infty}}\end{equation*}

Let $B=(b_{ij})=A^{-1}$.
Then shouldn't that be:
$$\DeclareMathOperator\cond{cond}
\cond_F A \overset ?\le n\cond_\infty A
\iff \|A\|_F\|A^{-1}\|_F \overset ?\le n\|A\|_\infty \|A^{-1}\|_\infty
\iff \sqrt{\sum_{i,j} |a_{ij}|^2}\sqrt{\sum_{i,j} |b_{ij}|^2} \overset ?\le n \max_i\sum_j|a_{ij}| \max_i\sum_j|b_{ij}|
$$
(Wondering)
 
Last edited:
I like Serena said:
Since $U$ is an orthogonal matrix, $U^{-1}$ is also orthogonal isn't it?
And didn't we just find that $\|Ux\|=\|x\|$? (Wondering)

Ahh I see! (Yes)
I like Serena said:
Let $B=(b_{ij})=A^{-1}$.
Then shouldn't that be:
$$\DeclareMathOperator\cond{cond}
\cond_F A \overset ?\ge n\cond_\infty A
\iff \|A\|_F\|A^{-1}\|_F \overset ?\ge n\|A\|_\infty \|A^{-1}\|_\infty
\iff \sqrt{\sum_{i,j} |a_{ij}|^2}\sqrt{\sum_{i,j} |b_{ij}|^2} \overset ?\ge n \max_i\sum_j|a_{ij}| \max_i\sum_j|b_{ij}|
$$
(Wondering)

Do we not have to show the inequality with $\leq$ instead of $\geq$ ? I got stuck right now. (Wondering)
 
mathmari said:
Do we not have to show the inequality with $\leq$ instead of $\geq$ ? I got stuck right now. (Wondering)

Yes, you are correct.
I've fixed the inequalities in my previous post.

So if we could prove that $\sqrt{\sum_{ij} |a_{ij}|^2} \le \sqrt n \max_i \sum_j |a_{ij}|
\iff \sum_{ij} |a_{ij}|^2 \le n \left(\!\max_i \sum_j |a_{ij}|\!\right)^{\!2}
$ for any matrix A, we would be done, wouldn't we? (Thinking)
 
I like Serena said:
So if we could prove that $\sqrt{\sum_{ij} |a_{ij}|^2} \le \sqrt n \max_i \sum_j |a_{ij}|
\iff \sum_{ij} |a_{ij}|^2 \le n \left(\!\max_i \sum_j |a_{ij}|\!\right)^{\!2}
$ for any matrix A, we would be done, wouldn't we? (Thinking)

Do we have the following?
\begin{equation*}\sum_{i=1}^n\sum_{j=1}^n |a_{ij}|^2\overset{ (\star) }{ \leq } \sum_{i=1}^n\left (\sum_{j=1}^n |a_{ij}|\right )^2\overset{ (\star\star) }{ \leq } n\cdot \max_{i=1}^n\left (\sum_{j=1}^n |a_{ij}|\right )^2\end{equation*}

$(\star)$: It holds that $(a+b)^2=a^2+2ab+b^2\geq a^2+b^2$. By induction, we can show that $\displaystyle{\sum_{j=1}^n |a_{ij}|^2\leq \left (\sum_{j=1}^n |a_{ij}|\right )^2}$.

$(\star\star)$: The sum of $n$ terms is less than $n$ times the largest term.
 
  • #10
Yep. Looks good (for non-negative a and b)! (Nod)
 
  • #11
So, we have that
\begin{equation*}\|A\|_F^2=\sum_{i=1}^n\sum_{j=1}^n |a_{ij}|^2\leq n\cdot \max_{i=1}^n\left (\sum_{j=1}^n |a_{ij}|\right )^2\end{equation*}

We also have that $\displaystyle{\|A\|_{\infty}=\max_{i=1}^n \sum_{j=1}^n|a_{i,j}| }$.

Then $\displaystyle{\|A\|_{\infty}^2=\left (\max_{i=1}^n \sum_{j=1}^n|a_{i,j}| \right )^2}$. Does it hold that $\displaystyle{\left (\max_{i=1}^n \sum_{j=1}^n|a_{i,j}| \right )^2=\max_{i=1}^n\left (\sum_{j=1}^n |a_{ij}|\right )^2}$ ?
 
  • #12
Is $\max(a,b)^2 \overset ?= \max(a^2,b^2)$? (Wondering)
 
  • #13
I like Serena said:
Is $\max(a,b)^2 \overset ?= \max(a^2,b^2)$? (Wondering)

Let $a<b$ then $a^2<b^2$, i.e. $\max (a,b)=b$ and $\max (a^2, b^2)=b^2$.
That means that $\max(a,b)^2=b^2=\max(a^2,b^2)$. Thank you so much! (Happy)
 

Similar threads

Back
Top