MHB Can we simplify the relations of condition number for orthogonal matrices?

mathmari
Hey! :o

I want to prove the following relations for the condition number:
  1. $\operatorname{cond}(\alpha A)=\operatorname{cond}(A)$. The matrix norm is submultiplicative.
  2. $\operatorname{cond}_2(U)=1$ if $U$ is an orthogonal matrix.
  3. $\operatorname{cond}_2(UA)=\operatorname{cond}_2(A)$, $U$ is orthogonal.
  4. $\operatorname{cond}_2(A)\leq \operatorname{cond}_F(A)\leq n\operatorname{cond}_{\infty}(A)$.
I have done the following:

  1. I have proven this equality, but I wanted to ask why it is given in this case that the matrix norm is submultiplicative; we don't need this here, do we?
  2. Since $U$ is an orthogonal matrix, we have that $U^{-1}=U^T$.

    So, we get $\operatorname{cond}_2( U)=\|U\|_2\,\| U^{-1} \|_2=\| U\|_2\,\| U^T \|_2$. How could we continue?
  3. We have that:

    $\operatorname{cond}_2(UA)=\|UA\|\|(UA)^{-1}\|=\|UA\|\|A^{-1}U^{-1}\|=\|UA\|\|A^{-1}U^T\|$

    We have to use here that $\operatorname{cond}_2(U)=1$, right? But how exactly?
  4. Could you give me a hint how we could show these inequalities?

(Wondering)
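
A minimal NumPy sketch (an editorial aside, not part of the question; the helper `cond`, the seed, and the random test matrices are illustrative assumptions) to sanity-check the four claims numerically before proving them:

[CODE=python]
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))                   # generic invertible matrix
U, _ = np.linalg.qr(rng.standard_normal((n, n)))  # random orthogonal matrix

def cond(M, p):
    # condition number w.r.t. the p-norm: ||M||_p * ||M^{-1}||_p
    return np.linalg.norm(M, p) * np.linalg.norm(np.linalg.inv(M), p)

print(np.isclose(cond(3.7 * A, 2), cond(A, 2)))             # 1. cond(aA) = cond(A)
print(np.isclose(cond(U, 2), 1.0))                          # 2. cond_2(U) = 1
print(np.isclose(cond(U @ A, 2), cond(A, 2)))               # 3. cond_2(UA) = cond_2(A)
print(cond(A, 2) <= cond(A, 'fro') <= n * cond(A, np.inf))  # 4. the norm chain
[/CODE]

All four checks print True (up to floating-point tolerance); the rest of the thread works out why.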
 
mathmari said:
Hey! :o

1. $\operatorname{cond}(\alpha A)=\operatorname{cond}(A)$. The matrix norm is submultiplicative.

I have proven this equality, but I wanted to ask why it is given in this case that the matrix norm is submultiplicative; we don't need this here, do we?

Hey mathmari! (Smile)

I agree that we do not need the norm to be submultiplicative.
Btw, we do need that $\alpha\ne 0$, don't we?

mathmari said:
2. $\operatorname{cond}_2(U)=1$ if $U$ is an orthogonal matrix.

Since $U$ is an orthogonal matrix, we have that $U^{-1}=U^T$.

So, we get $\operatorname{cond}_2( U)=\|U\|_2\,\| U^{-1} \|_2=\| U\|_2\,\| U^T \|_2$. How could we continue?

This is specifically about the matrix 2-norm, which is defined as $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$ where $\|x\|_2$ is the standard Euclidean norm.
What does that mean for the norm if the matrix is orthogonal? (Wondering)
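
To make the hint concrete, here is a two-line check (my own sketch, assuming NumPy) of the property it points at, namely that an orthogonal matrix preserves Euclidean length:

[CODE=python]
import numpy as np

rng = np.random.default_rng(1)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))  # random orthogonal matrix
x = rng.standard_normal(4)
# ||Ux||_2 = ||x||_2, so every ratio ||Ux||_2 / ||x||_2 is 1 and hence ||U||_2 = 1.
print(np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x)))
[/CODE]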

mathmari said:
3. $\operatorname{cond}_2(UA)=\operatorname{cond}_2(A)$, $U$ is orthogonal.

We have that:

$\operatorname{cond}_2(UA)=\|UA\|\|(UA)^{-1}\|=\|UA\|\|A^{-1}U^{-1}\|=\|UA\|\|A^{-1}U^T\|$

We have to use here that $\operatorname{cond}_2(U)=1$, right? But how exactly?

Same thing. I believe we have to use the definition of the matrix 2-norm and a property of an orthogonal matrix. (Thinking)

mathmari said:
4. $\operatorname{cond}_2(A)\leq \operatorname{cond}_F(A)\leq n\operatorname{cond}_{\infty}(A)$.

Could you give me a hint how we could show these inequalities?

How about starting with the definitions of those norms and their properties? (Wondering)
 
I like Serena said:
I agree that we do not need the norm to be submultiplicative.
Btw, we do need that $\alpha\ne 0$, don't we?

Yes, it is given that $\alpha\in \mathbb{K}\setminus\{0\}$. (Yes)
I like Serena said:
This is specifically about the matrix 2-norm, which is defined as $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$ where $\|x\|_2$ is the standard Euclidean norm.
What does that mean for the norm if the matrix is orthogonal? (Wondering)

We have that $\|Ax\|_2=\sqrt{(Ax,Ax)}=(Ax)^T(Ax)=A^Tx^TAx=A^TAx^Tx=A^{-1}Ax^Tx=x^Tx=(x,x)=\|x\|_2$.

So, we get that $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|x\|_2}{\|x\|_2}=\sup_{x\ne 0}1=1$.

We also have that $\|A^Tx\|_2=\sqrt{(A^Tx,A^Tx)}=(A^T)^T(A^Tx)=Ax^TA^Tx=AA^Tx^Tx=AA^{-1}x^Tx=x^Tx=(x,x)=\|x\|_2$.

So, we get that $\|A^T\|_2=\sup_{x\ne 0}\frac{\|A^Tx\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|x\|_2}{\|x\|_2}=\sup_{x\ne 0}1=1$. So, we get $\operatorname{cond}_2( U)=\|U\|_2\,\| U^{-1} \|_2=\| U\|_2\,\| U^T \|_2=1$.
Is evertything correct? (Wondering)
I like Serena said:
Same thing. I believe we have to use the definition of the matrix 2-norm and a property of an orthogonal matrix. (Thinking)

We have that $\|UA\|_2=\sup_{x\ne 0}\frac{\|(UA)x\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|U(Ax)\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}=\|A\|_2$ and $\|A^{-1}U^{T}\|_2=\sup_{x\ne 0}\frac{\|(A^{-1}U^T)x\|_2}{\|x\|_2}$

Does it hold that $A^{-1}U^T=U^TA^{-1}$ ? (Wondering)
I like Serena said:
How about starting with the definitions of those norms and their properties? (Wondering)

We have that $\operatorname{cond}_2(A)=\|A\|_2\,\|A^{-1}\|_2$, $\operatorname{cond}_F(A)=\|A\|_F\,\|A^{-1}\|_F$, $ \operatorname{cond}_{\infty}(A)=\|A\|_{\infty}\,\|A^{-1}\|_{\infty}$.

We have that $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$, $\|A\|_F=\sqrt{\sum_{i=1}^n\sum_{j=1}^n|a_{i,j}|^2}$, $\|A\|_{\infty}=\sup_{x\ne 0}\frac{\|Ax\|_{\infty}}{\|x\|_{\infty}}$.

Since $\displaystyle{(Ax)_i=\sum_{j=1}^na_{i,j}x_j}$, we get $\displaystyle{\|Ax\|_2^2=\sum_{i=1}^n\left (\sum_{j=1}^na_{i,j}x_j\right )^2\leq \sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2}$
By the Cauchy-Schwarz inequality we get $\displaystyle{\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2\leq \left (\sum_{j=1}^n{|a_{i,j}|^2}\right )\left (\sum_{j=1}^n|x_j|^2\right )=\left (\sum_{j=1}^n{|a_{i,j}|^2}\right )\|x\|_2^2}$
So, we get \begin{equation*}\|Ax\|_2^2\leq \sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2\leq \sum_{i=1}^n\sum_{j=1}^n|a_{i,j}|^2\,\|x\|_2^2=\|A\|_F^2\,\|x\|_2^2 \end{equation*}

So, we have that $\|Ax\|_2\leq \|A\|_F\,\|x\|_2\Rightarrow \frac{\|Ax\|_2}{\|x\|_2}\leq \|A\|_F\Rightarrow \sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}\leq \|A\|_F$, or not? (Wondering)

Since $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$ it follows that $\|A\|_2\leq \|A\|_F$.

Similarly, we get that $\|A^{-1}\|_2\leq \|A^{-1}\|_F$.

So we get $\operatorname{cond}_2(A)=\|A\|_2\,\|A^{-1}\|_2\leq \|A\|_F\,\|A^{-1}\|_F=\operatorname{cond}_F(A)$. Is everything correct? (Wondering)
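
As a quick numerical cross-check of this step (a sketch under my own assumptions: NumPy, an arbitrary seed, a random test matrix), $\|A\|_2\leq \|A\|_F$ holds for $A$ and $A^{-1}$ alike, so the condition numbers compare the same way:

[CODE=python]
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 5))
Ainv = np.linalg.inv(A)
# ||A||_2 <= ||A||_F for the matrix and for its inverse ...
print(np.linalg.norm(A, 2) <= np.linalg.norm(A, 'fro'))
print(np.linalg.norm(Ainv, 2) <= np.linalg.norm(Ainv, 'fro'))
# ... hence cond_2(A) <= cond_F(A).
print(np.linalg.norm(A, 2) * np.linalg.norm(Ainv, 2)
      <= np.linalg.norm(A, 'fro') * np.linalg.norm(Ainv, 'fro'))
[/CODE]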

Where does the factor $n$ come from in the inequality $\operatorname{cond}_F(A)\leq n\operatorname{cond}_{\infty}(A)$? (Wondering)
 
mathmari said:
We have that $\|Ax\|_2=\sqrt{(Ax,Ax)}=(Ax)^T(Ax)=A^Tx^TAx=A^TAx^Tx=A^{-1}Ax^Tx=x^Tx=(x,x)=\|x\|_2$.

I'm afraid that $(Ax)^T$ is not equal to $A^Tx^T$, nor can we swap $x^TA$ around. (Worried)

Instead we have $(Ax)^T=x^TA^T$ after which we don't have to swap any more. (Happy)

mathmari said:
So, we get that $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|x\|_2}{\|x\|_2}=\sup_{x\ne 0}1=1$.

We also have that $\|A^Tx\|_2=\sqrt{(A^Tx,A^Tx)}=(A^T)^T(A^Tx)=Ax^TA^Tx=AA^Tx^Tx=AA^{-1}x^Tx=x^Tx=(x,x)=\|x\|_2$.

So, we get that $\|A^T\|_2=\sup_{x\ne 0}\frac{\|A^Tx\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|x\|_2}{\|x\|_2}=\sup_{x\ne 0}1=1$. So, we get $\operatorname{cond}_2( U)=\|U\|_2\,\| U^{-1} \|_2=\| U\|_2\,\| U^T \|_2=1$.

Is evertything correct? (Wondering)

Well... there's a typo in the word 'evertything'... (Worried)
And there's another transpose-swap problem for $\|A^Tx\|_2$.
Other than that it's correct!

mathmari said:
We have that $\|UA\|_2=\sup_{x\ne 0}\frac{\|(UA)x\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|U(Ax)\|_2}{\|x\|_2}=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}=\|A\|_2$ and $\|A^{-1}U^{T}\|_2=\sup_{x\ne 0}\frac{\|(A^{-1}U^T)x\|_2}{\|x\|_2}$

Does it hold that $A^{-1}U^T=U^TA^{-1}$ ?

Nope.
Can't we write instead that $\|A^{-1}U^{-1}\|_2=\sup_{x\ne 0}\frac{\|(A^{-1}U^{-1})x\|_2}{\|x\|_2}
=\sup_{x\ne 0}\frac{\|A^{-1}(U^{-1}x)\|_2}{\|U^{-1}x\|_2}
=\sup_{y\ne 0}\frac{\|A^{-1}y\|_2}{\|y\|_2}
$? (Wondering)
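
A numerical check of this substitution trick (my own sketch, assuming NumPy): with $y=U^{-1}x$ the two suprema coincide, so $\|A^{-1}U^{-1}\|_2=\|A^{-1}\|_2$.

[CODE=python]
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
# y = U^{-1}x ranges over all nonzero vectors exactly as x does,
# and ||x||_2 = ||U^{-1}x||_2, so the two suprema agree:
lhs = np.linalg.norm(np.linalg.inv(A) @ U.T, 2)  # ||A^{-1}U^{-1}||_2, using U^{-1} = U^T
rhs = np.linalg.norm(np.linalg.inv(A), 2)        # ||A^{-1}||_2
print(np.isclose(lhs, rhs))
[/CODE]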

mathmari said:
We have that $\operatorname{cond}_2(A)=\|A\|_2\,\|A^{-1}\|_2$, $\operatorname{cond}_F(A)=\|A\|_F\,\|A^{-1}\|_F$, $ \operatorname{cond}_{\infty}(A)=\|A\|_{\infty}\,\|A^{-1}\|_{\infty}$.

We have that $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$, $\|A\|_F=\sqrt{\sum_{i=1}^n\sum_{j=1}^n|a_{i,j}|^2}$, $\|A\|_{\infty}=\sup_{x\ne 0}\frac{\|Ax\|_{\infty}}{\|x\|_{\infty}}$.

Since $\displaystyle{(Ax)_i=\sum_{j=1}^na_{i,j}x_j}$, we get $\displaystyle{\|Ax\|_2^2=\sum_{i=1}^n\left (\sum_{j=1}^na_{i,j}x_j\right )^2\leq \sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2}$
By the Cauchy-Schwarz inequality we get $\displaystyle{\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2\leq \left (\sum_{j=1}^n{|a_{i,j}|^2}\right )\left (\sum_{j=1}^n|x_j|^2\right )=\left (\sum_{j=1}^n{|a_{i,j}|^2}\right )\|x\|_2^2}$
So, we get \begin{equation*}\|Ax\|_2^2\leq \sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}||x_j|\right )^2\leq \sum_{i=1}^n\sum_{j=1}^n|a_{i,j}|^2\,\|x\|_2^2=\|A\|_F^2\,\|x\|_2^2 \end{equation*}

So, we have that $\|Ax\|_2\leq \|A\|_F\,\|x\|_2\Rightarrow \frac{\|Ax\|_2}{\|x\|_2}\leq \|A\|_F\Rightarrow \sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}\leq \|A\|_F$, or not? (Wondering)

Since $\|A\|_2=\sup_{x\ne 0}\frac{\|Ax\|_2}{\|x\|_2}$ it follows that $\|A\|_2\leq \|A\|_F$.

Similarly, we get that $\|A^{-1}\|_2\leq \|A^{-1}\|_F$.

So we get $\operatorname{cond}_2(A)=\|A\|_2\,\|A^{-1}\|_2\leq \|A\|_F\,\|A^{-1}\|_F=\operatorname{cond}_F(A)$. Is everything correct?

Yep. Looks good.

mathmari said:
Where does the factor $n$ come from in the inequality $\operatorname{cond}_F(A)\leq n\operatorname{cond}_{\infty}(A)$?

I'm not sure yet how to get the inequality, but that $n$ should be due to the fact that the Frobenius norm sums the squares of all entries in all rows, while the maximum norm only involves a single row. (Thinking)
 
I like Serena said:
I'm afraid that $(Ax)^T$ is not equal to $A^Tx^T$, nor can we swap $x^TA$ around. (Worried)

Instead we have $(Ax)^T=x^TA^T$ after which we don't have to swap any more. (Happy)

So, we have the following:
\begin{equation*}\|Ax\|_2=\sqrt{(Ax,Ax)}=\sqrt{(Ax)^T(Ax)}=\sqrt{x^TA^TAx}=\sqrt{x^TA^{-1}Ax}=\sqrt{x^Tx}=\sqrt{(x,x)}=\|x\|_2\end{equation*}

(Nerd)
I like Serena said:
And there's another transpose-swap problem for $\|A^Tx\|_2$.
Other than that it's correct!

We have that \begin{equation*}\|A^Tx\|_2=\sqrt{(A^Tx,A^Tx)}=\sqrt{(A^Tx)^T(A^Tx)}=\sqrt{x^TAA^Tx}=\sqrt{x^TAA^{-1}x}=\sqrt{x^Tx}=\|x\|_2\end{equation*}

(Nerd)
I like Serena said:
Nope.
Can't we write instead that $\|A^{-1}U^{-1}\|_2=\sup_{x\ne 0}\frac{\|(A^{-1}U^{-1})x\|_2}{\|x\|_2}
=\sup_{x\ne 0}\frac{\|A^{-1}(U^{-1}x)\|_2}{\|U^{-1}x\|_2}
=\sup_{y\ne 0}\frac{\|A^{-1}y\|_2}{\|y\|_2}
$? (Wondering)

How do we get the equality $\sup_{x\ne 0}\frac{\|(A^{-1}U^{-1})x\|_2}{\|x\|_2}
=\sup_{x\ne 0}\frac{\|A^{-1}(U^{-1}x)\|_2}{\|U^{-1}x\|_2}$ ? (Wondering)
I like Serena said:
I'm not sure yet how to get the inequality, but that $n$ should be due to the fact that the Frobenius norm sums the squares of all entries in all rows, while the maximum norm only involves a single row. (Thinking)

Does the following maybe hold?
\begin{equation*}\|A\|_F=\sqrt{\sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}|^2\right )}\leq \sqrt{n\cdot \max_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}|^2\right )}=\sqrt{n\cdot \|A\|_{\infty}}\end{equation*}

(Wondering)
 
mathmari said:
So, we have the following:
\begin{equation*}\|Ax\|_2=\sqrt{(Ax,Ax)}=\sqrt{(Ax)^T(Ax)}=\sqrt{x^TA^TAx}=\sqrt{x^TA^{-1}Ax}=\sqrt{x^Tx}=\sqrt{(x,x)}=\|x\|_2\end{equation*}

(Nerd)

We have that \begin{equation*}\|A^Tx\|_2=\sqrt{(A^Tx,A^Tx)}=\sqrt{(A^Tx)^T(A^Tx)}=\sqrt{x^TAA^Tx}=\sqrt{x^TAA^{-1}x}=\sqrt{x^Tx}=\|x\|_2\end{equation*}

(Nerd)

Yep. (Nod)

mathmari said:
How do we get the equality $\sup_{x\ne 0}\frac{\|(A^{-1}U^{-1})x\|_2}{\|x\|_2}
=\sup_{x\ne 0}\frac{\|A^{-1}(U^{-1}x)\|_2}{\|U^{-1}x\|_2}$ ?

Since $U$ is an orthogonal matrix, $U^{-1}$ is also orthogonal isn't it?
And didn't we just find that $\|Ux\|=\|x\|$? (Wondering)
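
If it helps, a tiny check (my own, assuming NumPy) that the inverse of an orthogonal matrix is again orthogonal:

[CODE=python]
import numpy as np

rng = np.random.default_rng(6)
U, _ = np.linalg.qr(rng.standard_normal((4, 4)))
Uinv = U.T  # U^{-1} = U^T for orthogonal U
# U^{-1} is itself orthogonal: (U^{-1})^T (U^{-1}) = I
print(np.allclose(Uinv.T @ Uinv, np.eye(4)))
[/CODE]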

mathmari said:
Does the following maybe hold?
\begin{equation*}\|A\|_F=\sqrt{\sum_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}|^2\right )}\leq \sqrt{n\cdot \max_{i=1}^n\left (\sum_{j=1}^n|a_{i,j}|^2\right )}=\sqrt{n\cdot \|A\|_{\infty}}\end{equation*}

Let $B=(b_{ij})=A^{-1}$.
Then shouldn't that be:
$$\DeclareMathOperator\cond{cond}
\cond_F A \overset ?\le n\cond_\infty A
\iff \|A\|_F\|A^{-1}\|_F \overset ?\le n\|A\|_\infty \|A^{-1}\|_\infty
\iff \sqrt{\sum_{i,j} |a_{ij}|^2}\sqrt{\sum_{i,j} |b_{ij}|^2} \overset ?\le n \max_i\sum_j|a_{ij}| \max_i\sum_j|b_{ij}|
$$
(Wondering)
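
Before hunting for a proof, a quick numerical sanity check of this target inequality (a sketch under my own assumptions: NumPy, a random invertible matrix):

[CODE=python]
import numpy as np

rng = np.random.default_rng(7)
n = 6
A = rng.standard_normal((n, n))
B = np.linalg.inv(A)
lhs = np.sqrt((A**2).sum()) * np.sqrt((B**2).sum())                  # cond_F(A)
rhs = n * np.abs(A).sum(axis=1).max() * np.abs(B).sum(axis=1).max()  # n * cond_inf(A)
print(lhs <= rhs)
[/CODE]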
 
I like Serena said:
Since $U$ is an orthogonal matrix, $U^{-1}$ is also orthogonal isn't it?
And didn't we just find that $\|Ux\|=\|x\|$? (Wondering)

Ahh I see! (Yes)
I like Serena said:
Let $B=(b_{ij})=A^{-1}$.
Then shouldn't that be:
$$\DeclareMathOperator\cond{cond}
\cond_F A \overset ?\ge n\cond_\infty A
\iff \|A\|_F\|A^{-1}\|_F \overset ?\ge n\|A\|_\infty \|A^{-1}\|_\infty
\iff \sqrt{\sum_{i,j} |a_{ij}|^2}\sqrt{\sum_{i,j} |b_{ij}|^2} \overset ?\ge n \max_i\sum_j|a_{ij}| \max_i\sum_j|b_{ij}|
$$
(Wondering)

Don't we have to show the inequality with $\leq$ instead of $\geq$? I'm stuck right now. (Wondering)
 
mathmari said:
Don't we have to show the inequality with $\leq$ instead of $\geq$? I'm stuck right now. (Wondering)

Yes, you are correct.
I've fixed the inequalities in my previous post.

So if we could prove that $\sqrt{\sum_{ij} |a_{ij}|^2} \le \sqrt n \max_i \sum_j |a_{ij}|
\iff \sum_{ij} |a_{ij}|^2 \le n \left(\!\max_i \sum_j |a_{ij}|\!\right)^{\!2}
$ for any matrix A, we would be done, wouldn't we? (Thinking)
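
A one-matrix numerical check of this sufficient lemma (my own sketch, assuming NumPy; applying the lemma to $A$ and to $A^{-1}$ separately then gives $\operatorname{cond}_F(A)\leq n\operatorname{cond}_{\infty}(A)$):

[CODE=python]
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((6, 6))
n = A.shape[0]
# Lemma to prove: ||A||_F <= sqrt(n) * ||A||_inf for any square matrix A
print(np.linalg.norm(A, 'fro') <= np.sqrt(n) * np.linalg.norm(A, np.inf))
[/CODE]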
 
I like Serena said:
So if we could prove that $\sqrt{\sum_{ij} |a_{ij}|^2} \le \sqrt n \max_i \sum_j |a_{ij}|
\iff \sum_{ij} |a_{ij}|^2 \le n \left(\!\max_i \sum_j |a_{ij}|\!\right)^{\!2}
$ for any matrix A, we would be done, wouldn't we? (Thinking)

Do we have the following?
\begin{equation*}\sum_{i=1}^n\sum_{j=1}^n |a_{ij}|^2\overset{ (\star) }{ \leq } \sum_{i=1}^n\left (\sum_{j=1}^n |a_{ij}|\right )^2\overset{ (\star\star) }{ \leq } n\cdot \max_{i=1}^n\left (\sum_{j=1}^n |a_{ij}|\right )^2\end{equation*}

$(\star)$: It holds that $(a+b)^2=a^2+2ab+b^2\geq a^2+b^2$. By induction, we can show that $\displaystyle{\sum_{j=1}^n |a_{ij}|^2\leq \left (\sum_{j=1}^n |a_{ij}|\right )^2}$.

$(\star\star)$: The sum of $n$ terms is at most $n$ times the largest term.
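
Both steps can also be checked row by row numerically (my own sketch, assuming NumPy; the entries are taken non-negative, as the $|a_{ij}|$ are):

[CODE=python]
import numpy as np

rng = np.random.default_rng(5)
A = np.abs(rng.standard_normal((5, 5)))  # plays the role of the |a_ij|
row_sums = A.sum(axis=1)
# (*): sum of squares <= square of the sum, row by row (non-negative entries)
print(np.all((A**2).sum(axis=1) <= row_sums**2))
# (**): a sum of n terms is at most n times its largest term
print((row_sums**2).sum() <= A.shape[0] * (row_sums**2).max())
[/CODE]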
 
Yep. Looks good (for non-negative a and b)! (Nod)
 
So, we have that
\begin{equation*}\|A\|_F^2=\sum_{i=1}^n\sum_{j=1}^n |a_{ij}|^2\leq n\cdot \max_{i=1}^n\left (\sum_{j=1}^n |a_{ij}|\right )^2\end{equation*}

We also have that $\displaystyle{\|A\|_{\infty}=\max_{i=1}^n \sum_{j=1}^n|a_{i,j}| }$.

Then $\displaystyle{\|A\|_{\infty}^2=\left (\max_{i=1}^n \sum_{j=1}^n|a_{i,j}| \right )^2}$. Does it hold that $\displaystyle{\left (\max_{i=1}^n \sum_{j=1}^n|a_{i,j}| \right )^2=\max_{i=1}^n\left (\sum_{j=1}^n |a_{ij}|\right )^2}$ ?
 
Is $\max(a,b)^2 \overset ?= \max(a^2,b^2)$? (Wondering)
 
I like Serena said:
Is $\max(a,b)^2 \overset ?= \max(a^2,b^2)$? (Wondering)

Let $0\leq a<b$; then $a^2<b^2$, i.e. $\max (a,b)=b$ and $\max (a^2, b^2)=b^2$.
That means that $\max(a,b)^2=b^2=\max(a^2,b^2)$. Thank you so much! (Happy)
 
