Math Challenge - January 2021

  • #51
Fred Wright said:
Problem 4.
I solve the PDE$$
xu_x+yu_y+(x^2 +y^2)u_z=0
$$
by the "method of characteristics". Dividing by ##x## the PDE becomes
$$

u_x + \frac{y}{x}u_y + \frac{(x^2 + y^2)}{x}u_z=0
$$
The characteristic ODEs are,
$$
\frac{dx}{ds}=1
$$
$$
\frac{dy}{ds}=\frac{y}{x}
$$
$$
\frac{dz}{ds}=\frac{(x^2 + y^2)}{x}
$$
where ##s## is the characteristic parameter. Solving these ODEs we find,
$$
x=s
$$
$$
y=y_0x
$$
$$
z=z_0 + \frac{1}{2}(x^2 + y^2)
$$
The general solution is an arbitrary function of the characteristic constants,
$$
u(x,y,z)=F(y_0,z_0)
$$
Solving for ##y_0## and ##z_0## I find,
$$u(x,y,z)=F(\frac{y}{x},z-\frac{1}{2}(x^2 + y^2))
$$
In anticipation of the initial conditions I displace the ##z## coordinate by ##\frac{\pi}{2}## and guess the solution.
$$
u(x,y,z)=\frac{y}{x}\left ( z-\frac{\pi}{2}-\frac{1}{2}(x^2 + y^2)\right )
$$
Taking the partial derivatives I find,
$$
u_x=-\frac{y}{x^2}\left ( z-\frac{\pi}{2}-\frac{1}{2}(x^2 + y^2)\right )-y
$$
$$
u_y=\frac{1}{x}\left ( z-\frac{\pi}{2}-\frac{1}{2}(x^2 + y^2)\right )-\frac{y^2}{x}
$$
$$
u_z=\frac{y}{x}
$$
This function satisfies the initial conditions and the original PDE.
Well, ##F## should be differentiable, not just any function. We have infinitely many possible solutions and the initial values are useless. The initial values given here correspond to ##F(a,b)=a^2\sin(b),## but this is not a unique solution, as your solution shows.
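For completeness, a quick symbolic check (just a sketch using sympy) that the quoted ##u## indeed satisfies the original PDE:
Python:
import sympy as sp

x, y, z = sp.symbols('x y z')
u = (y/x) * (z - sp.pi/2 - (x**2 + y**2)/2)   # the guessed solution from above

# left-hand side of the PDE  x*u_x + y*u_y + (x^2+y^2)*u_z
lhs = x*sp.diff(u, x) + y*sp.diff(u, y) + (x**2 + y**2)*sp.diff(u, z)
print(sp.simplify(lhs))   # prints 0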
 
  • #52
My contribution to problem 7. Lots of simple inequalities and few indices, which is more my style. The little lemmas also give some other valuable information about symmetric matrices, via very simple means.

Let ##A## be a real symmetric matrix in all cases. And ##\mathbf e_k## denotes the relevant kth standard basis vector.

lemma 1:
a ##2\times 2## real symmetric ##A## with quadratic form
##f: \mathbb R_{\big \Vert \mathbf x\big \Vert_2=1}^2 \longrightarrow \mathbb R## given by ##f\big(\mathbf x\big)=\mathbf x^T A \mathbf x## (notice the domain is the unit circle), attains a maximum, ##m_x##

proof:
##f## is continuous, applied to a compact set, so its image is compact i.e. in this case closed and bounded in ##\mathbb R##, so it attains a maximum, ##m_x##.

lemma 2:
If ##A## is a ##2\times 2##, non-diagonal matrix with (at least) one zero on the diagonal, then ##m_x \gt 0##.

proof:
WLOG suppose ##a_{1,1}=0## (if not, re-label and run the argument on ##A':= P^TAP=PAP## where ##P## is the (non-identity) type 2 elementary matrix)

##f\big(\mathbf e_1\big) = 0##
now ##m_x \leq 0\implies f\big(\mathbf x_{m_x}\big) \leq 0 \implies \sigma^2 \cdot f\big( \mathbf x_{m_x}\big) =f\big(\sigma \cdot \mathbf x_{m_x}\big) \leq 0##
thus for notational ease we relax the length one constraint on ##\mathbf x## for this lemma only (i.e. let the domain be all of ##\mathbb R^2## for this lemma only)

so consider ##\mathbf x:=\mathbf e_1 + \beta \mathbf e_2##
##f\big(\mathbf x\big) = \mathbf x^TA\mathbf x = 2\cdot a_{1,2}\cdot x_{1}\cdot x_{2} + a_{2,2}\cdot x_{2}^2 =2\cdot \beta \cdot a_{1,2}+ \beta^2 a_{2,2} = \beta\big(\beta a_{2,2} + 2a_{1,2}\big) ##
i.) if ##a_{1,2} \gt 0## then the RHS is necessarily positive for small enough modulus ##\beta \gt 0##
(i.e. if ##a_{2,2}=0## then any positive ##\beta## works, otherwise we have
##\beta a_{2,2} + 2a_{1,2}\geq 2a_{1,2}-\beta \big \vert a_{2,2}\big \vert\geq \frac{3}{2}a_{1,2}\gt 0##
by selecting ##\beta \in \big(0, \frac{a_{1,2}}{2 \vert a_{2,2}\vert }\big)##)
ii.) similarly if ##a_{1,2} \lt 0## then the RHS is necessarily positive for small enough modulus ##\beta \lt 0##
iii.) ##a_{1,2} \neq 0## because ##A## is symmetric but not diagonal

this completes the proof of the lemma, i.e. ##m_x \gt 0##
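A quick numerical illustration of the lemma (just a sketch; the particular matrix is an arbitrary symmetric example with ##a_{1,1}=0##):
Python:
import numpy as np

A = np.array([[0.0, 1.5],
              [1.5, -2.0]])                     # symmetric, non-diagonal, zero at a_{1,1}
theta = np.linspace(0.0, 2.0*np.pi, 100001)
X = np.vstack([np.cos(theta), np.sin(theta)])   # unit vectors on the circle
vals = (X * (A @ X)).sum(axis=0)                # f(x) = x^T A x for each column
print(vals.max())                               # positive, as the lemma claims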

lemma 3:
For ##2\times 2## real symmetric ##A##, ##m_x \geq \max\big(a_{1,1},a_{2,2}\big)## with equality *iff* ##A## is diagonal.

For the first direction: if ##A## is diagonal, then
##m_x=f\big(\mathbf x_{m_x}\big) = a_{1,1}\cdot x_{1}^2 + a_{2,2}\cdot x_{2}^2\leq \max\big(a_{1,1},a_{2,2}\big)\cdot \big(x_1^2 + x_2^2)= \max\big(a_{1,1},a_{2,2}\big)##
and ##\max\Big(f\big(\mathbf e_{1}\big), f\big(\mathbf e_{2}\big)\Big) =\max\big(a_{1,1},a_{2,2}\big)\leq m_x ## so this is met with equality.

For the other direction, suppose ##A## is not diagonal. Then ##B:=A-\max\big(a_{1,1},a_{2,2}\big)\cdot I##
is symmetric ##2\times 2## but not diagonal, with a zero on the diagonal. Applying the prior lemma to ##B## tells us that the maximal quadratic form is positive. i.e. there must exist (length one) ##\mathbf x## where

##0 \lt \mathbf x^T \Big(A-\max\big(a_{1,1},a_{2,2}\big)\cdot I\Big)\mathbf x = f\big(\mathbf x\big) - \max\big(a_{1,1},a_{2,2}\big)\mathbf x^T \mathbf x = f\big(\mathbf x\big)-\max\big(a_{1,1},a_{2,2}\big) \leq m_x -\max\big(a_{1,1},a_{2,2}\big)##
##\implies \max\big(a_{1,1},a_{2,2}\big) \lt m_x##
(where ##m_x## refers to the maximum value of the quadratic form with respect to ##A##, not ##B##, i.e. ##\mathbf x^T A \mathbf x##)

**now assume** ##A## is ##n\times n##
##f\big(\mathbf x\big)=\mathbf x^T A \mathbf x## (again with the constraint ##\big \Vert \mathbf x\big \Vert_2 = 1##) attains a maximum ##m_x##, again for reasons of applying a continuous function to a compact domain, with the real line as the codomain. Define orthogonal ##Q_1## to have ##\mathbf x_{m_x}## as its first column vector (i.e. some unit length vector such that ##f\big(\mathbf x_{m_x}\big) = m_x##) and extend to an orthonormal basis.

##B:= Q_1^T A Q_1 = \begin{bmatrix}
m_x & *^T\\
* & A_{n-1\times n-1}\\
\end{bmatrix}##
note: ##B## is symmetric, so is ##A_{n-1\times n-1}##, and we claim ##*=\mathbf 0_{n-1}##. Suppose this is not the case and there is ##b_{i,1}=\gamma \neq 0## for some ##i\in \big\{2,3,...,n\big\}##. Let ##P## be the elementary type 2 matrix that swaps element ##i## with element 2 (which is a symmetric permutation matrix).

Then ##P^TBP## has a leading ##2\times 2## principal submatrix given by
##\begin{bmatrix}
m_x & \gamma\\
\gamma & b_{i,i}\\ \end{bmatrix} ##

By lemma 3, since this ##2 \times 2## symmetric matrix is not diagonal, there is some maximizing ##\mathbf x## on the unit circle such that ##\eta =\mathbf x^T\begin{bmatrix}
m_x & \gamma\\
\gamma & b_{i,i}\\\end{bmatrix} \mathbf x \gt \max\big(b_{1,1},b_{i,i}\big)= \max\big(m_x,b_{i,i}\big)\geq m_x##

Finally, define
##\mathbf v :=Q_1P\begin{bmatrix}
\mathbf x \\
\mathbf 0_{n-2}\\\end{bmatrix} ##, noting that ##\Big \Vert \mathbf v\Big \Vert_2 = 1## because ##Q_1## and ##P## are orthogonal

thus ##\mathbf v^T A\mathbf v = \eta \gt m_x## which is a contradiction.

Thus we've proven
##B= Q_1^T A Q_1 = \begin{bmatrix}
m_x & \mathbf 0_{n-1}^T\\
\mathbf 0_{n-1} & A_{n-1\times n-1}\\
\end{bmatrix}## and by inductive hypothesis (with the trivial base case of ##A_{n-1\times n-1}## being ##1\times 1##) there is an orthogonal matrix ##Q_{n-1\times n-1}## such that ##Q_{n-1\times n-1}^T A_{n-1\times n-1} Q_{n-1\times n-1} = D_{n-1}## so
##Q_2:= \begin{bmatrix}
1 & \mathbf 0_{n-1}^T\\
\mathbf 0_{n-1} & Q_{n-1\times n-1}\\
\end{bmatrix}##

##Q:= Q_1Q_2## and ##Q^TAQ = \begin{bmatrix}
m_x & \mathbf 0_{n-1}^T\\
\mathbf 0_{n-1} & D_{n-1\times n-1}\\
\end{bmatrix}=D## as desired
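As a small numerical sketch of the key claim in the inductive step (numpy's eigensolver is used here only to produce a maximizer ##\mathbf x_{m_x}## for the demonstration; the proof itself avoids eigenvectors):
Python:
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                       # a real symmetric test matrix

w, V = np.linalg.eigh(A)
x_max = V[:, -1]                        # unit vector attaining m_x = w[-1]

# extend x_max to an orthonormal basis Q1 (via QR on [x_max | random columns])
Q1, _ = np.linalg.qr(np.column_stack([x_max, rng.standard_normal((n, n - 1))]))

B = Q1.T @ A @ Q1
print(np.round(B, 8))                   # first row and column are (m_x, 0, ..., 0)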
 
  • Informative
  • Like
Likes nuuskur and fresh_42
  • #53
Just one question about
StoneTemplePython said:
lemma 2:
If ##A## is a ##2\times 2##, non-diagonal matrix with (at least) one zero on the diagonal, then ##m_x \gt 0##.

proof:
WLOG suppose ##a_{1,1}=0## (if not, re-label and run the argument on ##A':= P^TAP=PAP## where ##P## is the (non-identity) type 2 elementary matrix)

##f\big(\mathbf e_1\big) = 0##
Why isn't the proof done here? If we have a point with ##f(\mathbf{e_1})=0,## isn't the maximum ##f(\mathbf{x}_{m_x})## automatically non-negative? The case ##f(\mathbf{x}_{m_x})=0## is easy to exclude.
 
  • #54
fresh_42 said:
Just one question about

Why isn't the proof done here? If we have a point with ##f(\mathbf{e_1})=0,## isn't the maximum ##f(\mathbf{x}_{m_x})## automatically non-negative? The case ##f(\mathbf{x}_{m_x})=0## is easy to exclude.
Agreed that non-negativity is immediate. I wanted to exclude ##f(\mathbf{x}_{m_x})=0## in a methodical way (without using eigenvectors, bilinear forms, etc.). How would you do it?
 
  • #55
StoneTemplePython said:
Agreed that non-negativity is immediate. I wanted to exclude ##f(\mathbf{x}_{m_x})=0## in a methodical way (without using eigenvectors, bilinear forms, etc.). How would you do it?
If ##a_{22}\neq 0## then ##\mathbf{x}=(0,1)## is positive, and if ##a_{22}=0 \wedge a_{12}\neq 0## then ##\mathbf{x}=(a_{12}^{-1},1)## does it, and if ##a_{22}=0 \wedge a_{12}= 0## then ##A=0##, in which case there is nothing to do. But you do not need to make the cases; it can already be seen from the equation for ##f(\mathbf{x})##, even without extending the domain.

Would Liouville work? O.k., it is shooting sparrows with cannons, but ...
 
  • #56
fresh_42 said:
If ##a_{22}\neq 0## then ##\mathbf{x}=(0,1)## is positive...
I was thinking of ##a_{2,2} \lt 0## so ##f(\mathbf e_2)\lt 0##...

the fact that
##\beta\big(\beta a_{2,2} + 2a_{1,2}\big) ##
equals 0 at ##\beta =0## and changes sign in some small ball around zero had a nice analytic feel to it, since the expression basically becomes ##2\beta\cdot a_{1,2}## in that neighborhood.

fresh_42 said:
Would Liouville work? O.k., it is shooting sparrows with cannons, but ...

I originally had a mind to lift something from complex analysis since ##\mathbb R^2## is a bit like ##\mathbb C##, though I didn't see holomorphicity. There may be some cannons that can be used, though.
 
  • #57
I'm tired and want to turn in. Here's a go at Problem 3:

We are given:

\begin{align*}
z(t) \leq C + L \left| \int_{t_0}^t z (t_1) dt_1 \right| .
\end{align*}

Note that because ##z(t)## is a non-negative function on ##[a,b]##, and since the above inequality is required to hold for all ##t \in [a,b]##, it is required that ##C \geq 0##.

First assume that ##t \geq t_0##, so we can write:

\begin{align*}
z(t) \leq C + L \int_{t_0}^t z (t_1) dt_1 .
\end{align*}Since ##L \geq 0## we may substitute the inequality into the RHS which gives a first iteration:

\begin{align*}
z(t) & \leq C+ L \int_{t_0}^t \left( C + L \int_{t_0}^{t_1} z (t_2) dt_2 \right) dt_1 \\
& = C + C L \int_{t_0}^t dt_1 + L^2 \int_{t_0}^t \int_{t_0}^{t_1} z (t_2) dt_2 dt_1
\end{align*}

A second iteration gives:

\begin{align*}
z(t) & \leq C + C L \int_{t_0}^t dt_1 + L^2 \int_{t_0}^t \int_{t_0}^{t_1} \left( C + L \int_{t_0}^{t_2} z (t_3) dt_3 \right) dt_2 dt_1 \\
& = C + C L \int_{t_0}^t dt_1 + C L^2 \int_{t_0}^t \int_{t_0}^{t_1} dt_2 dt_1 + L^3 \int_{t_0}^t \int_{t_0}^{t_1} \int_{t_0}^{t_2} z (t_3) dt_3 dt_2 dt_1
\end{align*}

An ##n##th iteration gives

\begin{align*}
z(t) & \leq C + C \sum_{i=1}^n A^{(i)} + L^{n+1} \int_{t_0}^t \int_{t_0}^{t_1} \int_{t_0}^{t_2} \cdots \int_{t_0}^{t_n} z (t_{n+1}) dt_{n+1} dt_n dt_{n-1} \dots dt_1 \quad (*)
\end{align*}

where

\begin{align*}
A^{(i)} = L^i \int_{t_0}^t \int_{t_0}^{t_1} \cdots \int_{t_0}^{t_{i-1}} dt_{i} dt_{i-1} \cdots dt_1
\end{align*}

By construction, we have

\begin{align*}
t_0 \leq t_n \leq t_{n-1} \leq \cdots \leq t_1 \leq t .
\end{align*}As ##z(t)## is a non-negative real-valued continuous function over the closed interval ##[a,b]##, it attains its maximum value, denoted by ##z_\max##. We deal with the last term in (*):

\begin{align*}
& L^{n+1} \int_{t_0}^t \int_{t_0}^{t_1} \int_{t_0}^{t_2} \cdots \int_{t_0}^{t_n} z (t_{n+1}) dt_{n+1} dt_n dt_{n-1} \dots dt_1 \\
& \leq z_\max L^{n+1} \int_{t_0}^t \int_{t_0}^{t_1} \int_{t_0}^{t_2} \cdots \int_{t_0}^{t_n} dt_{n+1} dt_n dt_{n-1} \dots dt_1 \\
& = \frac{z_\max L^{n+1}}{(n+1)!} \int_{t_0}^t \int_{t_0}^t \int_{t_0}^t \cdots \int_{t_0}^t dt_{n+1} dt_n dt_{n-1} \dots dt_1 \\
& = \frac{z_\max L^{n+1}}{(n+1)!} (t - t_0)^{n+1} \\
& \leq \frac{z_\max L^{n+1}}{n!} (b - a)^{n+1}
\end{align*}

This tends to zero as ##n \rightarrow \infty##: with ##x := L(b-a)##, let ##m## be the largest integer less than or equal to ##2x## and let ##n > m##; then

\begin{align*}
\frac{x^n}{n!} & = \frac{x^m}{m!} \frac{x}{m+1} \frac{x}{m+2} \cdots \frac{x}{n} \\
& \leq \frac{x^m}{m!} \left( \frac{1}{2} \right)^{n - m}
\end{align*}

Since

\begin{align*}
\lim_{n \rightarrow \infty} \frac{x^m}{m!} \left( \frac{1}{2} \right)^{n - m} = 0
\end{align*}

we have

\begin{align*}
\lim_{n \rightarrow \infty} \frac{x^n}{n!} = 0 .
\end{align*}

So as ##n \rightarrow \infty##

\begin{align*}
z(t) & \leq C + C \sum_{i=1}^\infty A^{(i)}
\end{align*}

where

\begin{align*}
A^{(i)} & = \frac{1}{i!} L^i \int_{t_0}^t \int_{t_0}^t \cdots \int_{t_0}^t dt_{i} dt_{i-1} \cdots dt_1 \\
& = \frac{1}{i!} L^i (t - t_0)^i
\end{align*}

Finally we have

\begin{align*}
z(t) & \leq C + C \sum_{i=1}^\infty A^{(i)} \\
& = C + C \sum_{i=1}^\infty \frac{1}{i!} L^i (t - t_0)^i \\
& = C e^{L (t - t_0)}
\end{align*}Now assume that ##t_0 > t##, then

\begin{align*}
z(t) \leq C+ L \left| \int_{t_0}^t z (t_1) dt_1 \right| = C + L \int_t^{t_0} z (t_1) dt_1
\end{align*}

and running the same argument again we obtain

\begin{align*}
z(t) \leq C e^{L (t_0 - t)} = C e^{L |t - t_0|} .
\end{align*}
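A tiny numerical sanity check (a sketch with numpy; the concrete ##C##, ##L##, ##t_0## are arbitrary choices): the extremal case ##z(t)=Ce^{L(t-t_0)}## satisfies the integral inequality with equality, and the bound ##Ce^{L|t-t_0|}## is attained.
Python:
import numpy as np

C, L, t0 = 2.0, 0.7, 0.0
t = np.linspace(t0, 3.0, 2001)
z = C * np.exp(L * (t - t0))

# cumulative trapezoid approximation of  int_{t0}^{t} z(t1) dt1
cum = np.concatenate([[0.0], np.cumsum(0.5 * (z[1:] + z[:-1]) * np.diff(t))])
rhs = C + L * cum
bound = C * np.exp(L * np.abs(t - t0))

print(np.max(np.abs(z - rhs)))          # ~0 up to quadrature error
print(bool(np.all(z <= bound + 1e-9)))  # True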
 
Last edited:
  • Like
Likes benorin and fresh_42
  • #58
julian said:
I'm tired and want to turn in. Here's a go at Problem 3:

We are given:

\begin{align*}
z(t) \leq C + L \left| \int_{t_0}^t z (t_1) dt_1 \right| .
\end{align*}

Note that because ##z(t)## is a non-negative function on ##[a,b]##, and since the above inequality is required to hold for all ##t \in [a,b]##, it is required that ##C \geq 0##.

First assume that ##t \geq t_0##, so we can write:

\begin{align*}
z(t) \leq C + L \int_{t_0}^t z (t_1) dt_1 .
\end{align*}Since ##L \geq 0## we may substitute the inequality into the RHS which gives a first iteration:

\begin{align*}
z(t) & \leq C+ L \int_{t_0}^t \left( C + L \int_{t_0}^{t_1} z (t_2) dt_2 \right) dt_1 \\
& = C + C L \int_{t_0}^t dt_1 + L^2 \int_{t_0}^t \int_{t_0}^{t_1} z (t_2) dt_2 dt_1
\end{align*}

A second iteration gives:

\begin{align*}
z(t) & \leq C + C L \int_{t_0}^t dt_1 + L^2 \int_{t_0}^t \int_{t_0}^{t_1} \left( C + L \int_{t_0}^{t_2} z (t_3) dt_3 \right) dt_2 dt_1 \\
& = C + C L \int_{t_0}^t dt_1 + C L^2 \int_{t_0}^t \int_{t_0}^{t_1} dt_2 dt_1 + L^3 \int_{t_0}^t \int_{t_0}^{t_1} \int_{t_0}^{t_2} z (t_3) dt_3 dt_2 dt_1
\end{align*}

An ##n##th iteration gives

\begin{align*}
z(t) & \leq C + C \sum_{i=1}^n A^{(i)} + L^{n+1} \int_{t_0}^t \int_{t_0}^{t_1} \int_{t_0}^{t_2} \cdots \int_{t_0}^{t_n} z (t_{n+1}) dt_{n+1} dt_n dt_{n-1} \dots dt_1 \quad (*)
\end{align*}

where

\begin{align*}
A^{(i)} = L^i \int_{t_0}^t \int_{t_0}^{t_1} \cdots \int_{t_0}^{t_{i-1}} dt_{i} dt_{i-1} \cdots dt_1
\end{align*}

By construction, we have

\begin{align*}
t_0 \leq t_n \leq t_{n-1} \leq \cdots \leq t_1 \leq t .
\end{align*}As ##z(t)## is a real-valued continuous function over the closed interval ##[a,b]##, it attains its maximum value, denoted by ##z_\max##. We deal with the last term in (*):

\begin{align*}
& L^{n+1} \int_{t_0}^t \int_{t_0}^{t_1} \int_{t_0}^{t_2} \cdots \int_{t_0}^{t_n} z (t_{n+1}) dt_{n+1} dt_n dt_{n-1} \dots dt_1 \\
& < z_\max L^{n+1} \int_{t_0}^t \int_{t_0}^{t_1} \int_{t_0}^{t_2} \cdots \int_{t_0}^{t_n} dt_{n+1} dt_n dt_{n-1} \dots dt_1 \\
& = \frac{z_\max L^{n+1}}{n!} \int_{t_0}^t \int_{t_0}^t \int_{t_0}^t \cdots \int_{t_0}^t dt_{n+1} dt_n dt_{n-1} \dots dt_1 \\
& = \frac{z_\max L^{n+1}}{n!} (t - t_0)^n \\
& \leq \frac{z_\max L^{n+1}}{n!} (b - a)^n
\end{align*}
The first ##<## should be ##\leq##.
This tends to zero as ##n \rightarrow \infty##: with ##x := L(b-a)##, let ##m## be the largest integer less than or equal to ##2x## and let ##n > m##; then

\begin{align*}
\frac{x^n}{n!} & = \frac{x^m}{m!} \frac{x}{m+1} \frac{x}{m+2} \cdots \frac{x}{n} \\
& \leq \frac{x^m}{m!} \left( \frac{1}{2} \right)^{n - m}
\end{align*}

Since

\begin{align*}
\lim_{n \rightarrow \infty} \frac{x^m}{m!} \left( \frac{1}{2} \right)^{n - m} = 0
\end{align*}

we have

\begin{align*}
\lim_{n \rightarrow \infty} \frac{x^n}{n!} = 0 .
\end{align*}

So as ##n \rightarrow \infty##

\begin{align*}
z(t) & \leq C + C \sum_{i=1}^\infty A^{(i)}
\end{align*}

where

\begin{align*}
A^{(i)} & = \frac{1}{i!} L^i \int_{t_0}^t \int_{t_0}^t \cdots \int_{t_0}^t dt_{i} dt_{i-1} \cdots dt_1 \\
& = \frac{1}{i!} L^i (t - t_0)^i
\end{align*}
A remark that this follows from the convergence of the Taylor series of the exponential function would have been sufficient. Instead you should have proven the iterated integral formula which you used twice, or at least mentioned that it is the volume formula for simplices.
Finally we have

\begin{align*}
z(t) & \leq C + C \sum_{i=1}^\infty A^{(i)} \\
& = C + C \sum_{i=1}^\infty \frac{1}{i!} L^i (t - t_0)^i \\
& = C e^{L (t - t_0)}
\end{align*}Now assume that ##t_0 > t##, then

\begin{align*}
z(t) \leq C+ L \left| \int_{t_0}^t z (t_1) dt_1 \right| = C + L \int_t^{t_0} z (t_1) dt_1
\end{align*}

and running the same argument again we obtain

\begin{align*}
z(t) \leq C e^{L (t_0 - t)} = C e^{L |t - t_0|} .
\end{align*}
Well done. You could have avoided the entire iteration if you had used a function ##F## and considered ##\dfrac{F'}{F}\leq L##. This is as good as the exponential function and shifts the iteration into the derivative.

The statement is called Grönwall's lemma.
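For reference, here is a sketch of the shortcut alluded to above (assuming ##t\geq t_0## and ##C>0##): set ##F(t):=C+L\int_{t_0}^t z(s)\,ds##, so that ##z\leq F## by hypothesis. Then
$$
F'(t)=Lz(t)\leq LF(t) \;\Longrightarrow\; \big(\log F(t)\big)'\leq L \;\Longrightarrow\; F(t)\leq F(t_0)e^{L(t-t_0)}=Ce^{L(t-t_0)},
$$
and hence ##z(t)\leq Ce^{L(t-t_0)}##; the case ##t<t_0## is handled symmetrically.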
 
Last edited:
  • #59
fresh_42 said:
Imagine a thread with this topic placed in our Topology and Analysis forum rather than in the Linear And Abstract Algebra forum where it perfectly fits without being moved. The first sentences could be:
The orthogonal real matrices build a compact subset of the Euclidean real space, because ... Now we define a continuous function ...

The present formulation of Problem 5 makes it look like the path ##x## is also prescribed, while I believe you want the solver to prove that such a path ##x## exists and is unique for a prescribed ##A## and a prescribed initial condition. This (and the typo that was already corrected) was what I hinted at.
 
  • #60
S.G. Janssens said:
The present formulation of Problem 5 makes it look like the path ##x## is also prescribed, while I believe you want the solver to prove that such a path ##x## exists and is unique for a prescribed ##A## and a prescribed initial condition. This (and the typo that was already corrected) was what I hinted at.
Yes.

I think that quotation and problem reference (#5) do not match here:
https://www.physicsforums.com/threads/math-challenge-january-2021.997970/#post-6438289
which was my mistake.

The problem means: Solve the differential equation system ##\dot x=Ax## and show that ##x(0)=x_0## makes the solution unique. (I added this to the problem, thanks.)
 
  • #61
fresh_42 said:
1. Let ##A\in \mathbb{M}_{m,n}(\mathbb{R})## and ##b\in \mathbb{R}^m##. Then exactly one of the following two statements is true:
  • ##Ax=b\, , \,x\geq 0 \,,## is solvable for a ##x\in \mathbb{R}^n.##
  • ##A^\tau y\leq 0\, , \,b^\tau y> 0\,,## is solvable for some ##y\in \mathbb{R}^m.##
The ordering is meant componentwise.
Lemma 1. ##\mathbf{x\geq 0}\iff (\forall \mathbf{y\geq 0})\,\,\mathbf{x^\top y}\geq 0.##
The latter implies the former since one can choose ##\mathbf{y=e}_i##.
Corollary. ##\; \mathbf{x\leq 0}\iff (\forall \mathbf{y\geq 0})\,\,\mathbf{x^\top y}\leq 0.##
This follows by substituting ##\mathbf{-x}## for ##\mathbf{x}##.

Lemma 2. ##(\forall \mathbf{x}\in\mathbb{R}^n)\,\, A\mathbf{x=b}\iff(\exists\mathbf{w}\in\mathbb{R}^n)\,\,\mathbf{x}=A^+\mathbf{b}+(I-A^+A)\mathbf{w}.##
##A^+## denotes the (possibly nonunique) Moore-Penrose pseudo-inverse of ##A##.

Lemma 3. The map ##\mathbf{v}^\top:V\rightarrow \mathbb{R}## for a nonzero vector ##\mathbf{v}## is surjective.

We handle the case ##\mathbf{b\neq 0}##, as the case for ##\mathbf{b=0}## can be verified trivially.

We have the two statements $$\begin{align} & \quad(\exists\mathbf{x\geq 0}\in\mathbb{R}^n)\,\,A\mathbf{x}=\mathbf{b} \\ & \quad (\exists\mathbf{y}\in\mathbb{R}^m)\,\, A^\top\mathbf{y\leq 0},\,\mathbf{b}^\top\mathbf{y}>0\end{align}$$ To prove that exactly one of these two statements is true, we prove that the negation of ##(2)## is equivalent to ##(1)##. The negation of ##(2)##, which we will call ##\neg(2)##, has the following expression: $$(\forall\mathbf{y}\in\mathbb{R}^m)\,\, A^\top\mathbf{y\leq 0}\Rightarrow \mathbf{b^\top y}\leq 0.$$ We proceed by proving ##\neg(2)\Rightarrow(1)## and ##(1)\Rightarrow\neg(2)##:
First, we will rewrite ##\neg(2)## in terms of preimages of sets: $$(A^\top)^{-1}(\mathbb{R}^n_{\leq 0})\subseteq(\mathbf{b}^\top)^{-1}(\mathbb{R}_{\leq0})$$ By Lemma 2, the preimage of any set under ##A^\top## splits thus: $$(A^\top)^{-1}(\mathbb{R}^n_{>0})=A^\top(\mathbb{R}^n_{>0})\oplus\mathrm{Im}\big(I-(A^+)^\top A^\top\big)$$ Since ##\mathbf{b}^\top## is surjective, it follows that ##(\mathbf{b}^\top\circ(\mathbf{b}^\top)^{-1})(S)=S##. Applying ##\mathbf{b^\top}## to both sides, we get $$(A\mathbf{b})^\top(\mathbb{R}^n_{\leq 0})+((I-AA^+)\mathbf{b})^\top(\mathbb{R})\subseteq\mathbb{R}_{\leq 0}$$ where ##+## here denotes the sumset or Minkowski sum. If ##((I-AA^+)\mathbf{b})^\top## is nonzero, then it is surjective by Lemma 3, which would imply that ##\mathbb{R}\subseteq\mathbb{R}_{\leq 0}##. Hence ##(I-AA^+)\mathbf{b}## is zero, and one has $$(A^+\mathbf{b})^\top(\mathbb{R}^n_{\leq 0})\subseteq\mathbb{R}_{\leq 0}.$$ This is equivalent to $$(A^+\mathbf{b})^\top(-1\cdot\mathbb{R}^n_{\geq 0})\subseteq-1\cdot\mathbb{R}_{\geq 0}$$ and by linearity we have $$(A^+\mathbf{b})^\top(\mathbb{R}^n_{\geq 0})\subseteq\mathbb{R}_{\geq 0}$$ in other words, ##A^+\mathbf{b}\geq\mathbf{0}## (Lemma 1). Moreover, since ##(I-AA^+)\mathbf{b}## is zero, or ##\mathbf{b}=AA^+\mathbf{b}##, it follows that if ##\mathbf{x}=A^+\mathbf{b}##, then ##A\mathbf{x}=AA^+\mathbf{b=b}.## We have proven that ##\neg(2)\Rightarrow(1).##

The proof in the other direction is much simpler. Suppose ##(\exists\mathbf{x}\geq 0\in\mathbb{R}^n)\,\,A\mathbf{x=b}.## By Lemma 1, it follows that if ##A^\top\mathbf{y\leq 0}##, then ##\mathbf{b^\top y}=\mathbf{x}^\top A^\top\mathbf{y}\leq 0.## We have proved that ##(1)\Rightarrow\neg(2)##. ##\blacksquare##
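As an aside, a quick numpy illustration of one direction of Lemma 2 (a sketch: ##A^+## is computed with numpy's pinv, and the right-hand side is constructed to be consistent):
Python:
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 5))
b = A @ rng.standard_normal(5)          # a consistent right-hand side
Ap = np.linalg.pinv(A)                  # Moore-Penrose pseudoinverse

for _ in range(3):
    w = rng.standard_normal(5)
    x = Ap @ b + (np.eye(5) - Ap @ A) @ w
    print(np.allclose(A @ x, b))        # True: every such x solves Ax = b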
 
  • #62
suremarc said:
Lemma 2. ##(\forall \mathbf{x}\in\mathbb{R}^n)\,\, A\mathbf{x=b}\iff(\exists\mathbf{w}\in\mathbb{R}^n)\,\,\mathbf{x}=A^+\mathbf{b}+(I-A^+A)\mathbf{w}.##
##A^+## denotes the (possibly nonunique) Moore-Penrose pseudo-inverse of ##A##.
Do you have a reference for this? I'm not familiar with the Moore-Penrose pseudo-inverse.

And most important: why can you switch from preimages to the pseudo-inverse? And why is ##AA^\top(\mathbf{b})=\mathbf{b}\,?## You basically claim that ##A## is invertible on the subspace we need.
Lemma 3. The map ##\mathbf{v}^\top:V\rightarrow \mathbb{R}## for a nonzero vector ##\mathbf{v}## is surjective.
I assume you mean the map ##\mathbf{v}^\top(\mathbf{u})= \mathbf{v}^\tau\cdot \mathbf{u}##.

Edit: One of my solutions uses a hyperplane (strict separation theorem) and yours looks to me as a complicated version of it.
 
Last edited:
  • #63
fresh_42 said:
Do you have a reference for this? I'm not familiar with the Moore-Penrose pseudo-inverse.
The Wikipedia page, under Applications —> Obtaining all solutions of a linear system. ##A^+## essentially projects onto the image of ##A## and inverts it there.

Actually I think I have committed some hand-waving now that I look at my proof again, specifically in the expression for the preimage ##(A^\top)^{-1}(S)##.
 
  • #64
fresh_42 said:
And why is ##AA^\top(\mathbf{b}=\mathbf{b}##? You basically claim that ##A## is invertible on the subspace we need.
##AA^\top\mathbf{b}\neq\mathbf{b}##, but my claim was that ##AA^+\mathbf{b}=\mathbf{b}##.
 
  • #65
suremarc said:
##AA^\top\mathbf{b}\neq\mathbf{b}##, but my claim was that ##AA^+\mathbf{b}=\mathbf{b}##.
Sorry, that was a typo. Why does it hold? ##A^+## is not a true inverse, so why does it work for ##\mathbf{b}\,?##
My proof goes ##\lnot (1) \Longrightarrow (2)##.
 
  • #66
fresh_42 said:
Sorry, was a typo. Why is it? It is no real inverse, so why is it for ##\mathbf{b}\,?##.
My proof goes ##\lnot (1) \Longrightarrow (2)##.
Well, I claim that ##(A^+\mathbf{b})^\top(\mathbb{R}^n_{\leq 0})+((I-AA^+)\mathbf{b})^\top(\mathbb{R}^m)\subseteq\mathbb{R}_{\leq 0}.## If ##((I-AA^+)\mathbf{b})\neq 0## then the second term is all of ##\mathbb{R}## since (in this case) nonzero linear functionals are surjective. Thus the sum of that set and the first term would also be all of ##\mathbb{R}##, but my original claim was that this is a subset of ##\mathbb{R}_{\leq 0}##, so this leads to a contradiction. Ergo, ##((I-AA^+)\mathbf{b})## must be zero.

edit: made a typo, it’s ##((I-AA^+)\mathbf{b})^\top(\mathbb{R}^m)##, not ##((I-AA^+)\mathbf{b})^\top(\mathbb{R})##
 
  • #67
suremarc said:
edit: made a typo, it’s ##((I-AA^+)\mathbf{b})^\top(\mathbb{R}^m)##, not ##((I-AA^+)\mathbf{b})^\top(\mathbb{R})##
And I think there is another one in the second equation, where ##>## should be ##\leq##.
 
  • #68
fresh_42 said:
And I think another one in the second equation where ##>## should be ##\leq##, I think.
Ah, yeah, you’re right.
I'm going to redo that section of the proof anyway; I don't think the switch from preimage to pseudoinverse was justified, now that I'm trying to work it out rigorously. D'oh.
 
  • #69
I think it is too complicated. A simple hyperplane would do (##\mathbb{R}_{\geq 0}^m## is a cone!), or the strong duality theorem. Your proof mimics the separation by a hyperplane, I think.
 
  • Like
Likes suremarc
  • #70
I applied the hyperplane separation theorem, and things turned out to be surprisingly simple.

The negation of ##(1)## is ##(\forall \mathbf{x}\geq 0\in\mathbb{R}^n)\,\,A\mathbf{x\neq b},## or in terms of sets, $$\mathbf{b}\notin A(\mathbb{R}^n_{\geq 0}).$$ ##A(\mathbb{R}^n_{\geq 0})## inherits the form of a cone from ##\mathbb{R}^n_{\geq 0}## due to the invariance of the property ##(\forall\mathbf{u,v}\in C)(\forall\alpha,\beta\in\mathbb{R}_{\geq 0}) \,\,\mathbf{\alpha u+\beta v}\in C## under linear maps; moreover it is closed, being a finitely generated cone. We apply the hyperplane separation theorem to the convex sets ##\{\mathbf{b}\}## and ##A(\mathbb{R}^n_{\geq 0})##: $$(\exists\mathbf{y}\in\mathbb{R}^m-\{\mathbf{0}\})\,\,\mathbf{y^\top b}\notin\mathbf{y}^\top A(\mathbb{R}^n_{\geq 0})$$
Because ##\mathbb{R}^n_{\geq 0}## is a cone, ##\mathbf{y}^\top A(\mathbb{R}^n_{\geq 0})## is also a cone, which means that it is either ##\mathbb{R}_{\geq 0}##, ##\mathbb{R}_{\leq 0}##, or the zero cone. If it is ##\mathbb{R}_{\geq 0}##, replace ##\mathbf{y’=-y}## to get some ##\mathbf{y’}## for which ##\mathbf{y’^\top} A(\mathbb{R}^n_{\geq 0})\subseteq\mathbb{R}_{\leq 0}##. This implies that ##\mathbf{y’^\top b}\notin\mathbb{R}_{\leq 0}##: in other words, ##\mathbf{b^\top y’}>0## (one half of ##(2)##). Finally, note that ##\mathbf{y’}^\top A(\mathbb{R}^n_{\geq 0})=(A^\top\mathbf{y’})^\top(\mathbb{R}^n_{\geq 0})\subseteq\mathbb{R}_{\leq 0}##, which implies that ##A^\top\mathbf{y’\leq 0}.## We have proven ##(2)## from ##\neg(1).##

To prove the other direction, observe that ##(\exists\mathbf{y}\in\mathbb{R}^m)\,\,A^\top\mathbf{y}\leq 0\land\mathbf{b^\top y}>0## implies the following: $$(\exists\mathbf{y}\in\mathbb{R}^m)(\forall\mathbf{x}\in\mathbb{R}^n:\,A\mathbf{x=b})\,\,\mathbf{y}^\top A\mathbf{x}=\mathbf{b^\top y}=\mathbf{x}^\top(A^\top\mathbf{y})>0,$$ implying that ##\mathbf{x}\not\geq 0## since ##A^\top\mathbf{y}\leq 0## but ##x^\top(A^\top\mathbf{y})>0##.
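A numerical spot-check of the alternative (just a sketch using scipy's linprog; restricting ##\mathbf y## to the box ##|y_i|\leq 1## in the second test is harmless because the feasible set of ##(2)## is a cone):
Python:
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)

def farkas_alternatives(A, b):
    m, n = A.shape
    # (1): is  Ax = b, x >= 0  feasible?
    r1 = linprog(c=np.zeros(n), A_eq=A, b_eq=b, bounds=[(0, None)] * n)
    alt1 = (r1.status == 0)
    # (2): is  A^T y <= 0, b^T y > 0  feasible?  (maximize b^T y on a box)
    r2 = linprog(c=-b, A_ub=A.T, b_ub=np.zeros(n), bounds=[(-1, 1)] * m)
    alt2 = (r2.status == 0) and (-r2.fun > 1e-9)
    return alt1, alt2

for _ in range(5):
    A = rng.standard_normal((3, 4))
    b = rng.standard_normal(3)
    a1, a2 = farkas_alternatives(A, b)
    print(a1, a2, a1 != a2)             # exactly one alternative should hold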
 
  • Like
Likes fresh_42
  • #71
fresh_42 said:
5. Let ##A\in \mathbb{M}(n,\mathbb{R})## be a real square matrix. Show that there is a parameterized path ##x\, : \,\mathbb{R}\longrightarrow \mathbb{R}^n## as solution of the differential equation ##\dot x(t)=Ax(t)## which is unique for any given initial condition ##x(t_0)=x_0.##
I didn't notice there was one more problem, so here's my go at it.

Claim. ##\mathbf{x}(t)## is analytic on ##\mathbb{R}##.
Proof. From ##\dot{\mathbf{x}}(t)=A\mathbf{x}(t)##, it immediately follows that ##\mathbf{x}^{(k)}(t)=A^k\,\mathbf{x}(t)##. Introduce the operator norm ##\|A\|=\mathrm{sup}_{\mathbf{x}\in\mathbb{R}^n}\frac{\|A\mathbf{x}\|}{\|\mathbf{x}\|}##, which is a subadditive and submultiplicative norm. It follows that $$e^{(t-t_0)A}\,\mathbf{x}(t_0)=\sum_{k=0}^\infty \frac{\mathbf{x}^{(k)}(t_0)}{k!}(t-t_0)^k$$ exists for all ##t,t_0\in\mathbb{R}##, with convergence of ##e^{(t-t_0)A}## provided by the operator norm, and convergence of the RHS provided by the standard norm on ##\mathbb{R}^n##, respectively (note that ##\|\mathbf{x}^{(k)}(t_0)\|\leq\|A\|^k\|\mathbf{x}(t_0)\|##). One can check that this is a solution of the differential equation ##\dot{\mathbf{x}}(t)=A\,\mathbf{x}(t)##.

We now prove that ##\mathbf{x}(t)=e^{(t-t_0)A}\,\mathbf{x}(t_0)##: to be more precise, $$(\forall t,t_0\in\mathbb{R})\lim_{N\rightarrow\infty}\Bigg(\mathbf{x}(t)-\sum_{k=0}^N\frac{\mathbf{x}^{(k)}(t_0)}{k!}(t-t_0)^k\Bigg)=0.$$ By Taylor's theorem, one has that ##\mathbf{x}(t)=\Big(\sum_{k=0}^N\frac{\mathbf{x}^{(k)}(t_0)}{k!}(t-t_0)^k\Big)+\mathbf{r}_N(t)##, where $$(\exists\xi\in\mathbb{R})\,\,\mathbf{r}_N(t)=\frac{\mathbf{x}^{(N+1)}(\xi)}{(N+1)!}(t-t_0)^{N+1}$$ for some ##\xi## between ##t_0## and ##t##. Since ##\mathbf{x}^{(N+1)}(\xi)=A^{N+1}\,\mathbf{x}(\xi)##, we may take the operator norm to get $$\|\mathbf{r}_N(t)\|\leq\frac{\|A\|^{(N+1)}\|\mathbf{x}(\xi)\|}{(N+1)!}|t-t_0|^{N+1}$$ and from this one clearly sees that ##\lim_{N\rightarrow\infty}\mathbf{r}_N(t)=0##. Thus the series for ##e^{(t-t_0)A}\,\mathbf{x}(t_0)## converges pointwise to ##\mathbf{x}(t)##.

I wonder if Taylor's theorem applies in the infinite-dimensional setting with a bounded operator. If so, I think this result should generalize.

Edit: I just realized you have to modify the formula for the remainder term slightly. To be precise, you may need a different ##\xi## for each component of ##\mathbf{r}_N(t)##, although the theorem still holds.
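A short numerical cross-check (a sketch; scipy's expm and solve_ivp are used just to compare the closed form against a direct numerical integration of ##\dot{\mathbf x}=A\mathbf x##):
Python:
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
x0, t0 = rng.standard_normal(n), 0.0

ts = np.linspace(t0, 2.0, 9)
closed_form = np.array([expm((t - t0) * A) @ x0 for t in ts])

sol = solve_ivp(lambda t, x: A @ x, (t0, ts[-1]), x0, t_eval=ts,
                rtol=1e-10, atol=1e-12)
print(np.max(np.abs(closed_form - sol.y.T)))   # small: both describe the same flow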
 
  • #72
suremarc said:
I didn't notice there was one more problem, so here's my go at it.

Claim. ##\mathbf{x}(t)## is analytic on ##\mathbb{R}##.
Proof. From ##\dot{\mathbf{x}}(t)=A\mathbf{x}(t)##, it immediately follows that ##\mathbf{x}^{(k)}(t)=A^k\,\mathbf{x}(t)##. Introduce the operator norm ##\|A\|=\mathrm{sup}_{\mathbf{x}\in\mathbb{R}^n}\frac{\|A\mathbf{x}\|}{\|\mathbf{x}\|}##, which is a subadditive and submultiplicative norm. It follows that $$e^{(t-t_0)A}\,\mathbf{x}(t_0)=\sum_{k=0}^\infty \frac{\mathbf{x}^{(k)}(t_0)}{k!}(t-t_0)^k$$ exists for all ##t,t_0\in\mathbb{R}##, with convergence of ##e^{(t-t_0)A}## provided by the operator norm, and convergence of the RHS provided by the standard norm on ##\mathbb{R}^n##, respectively (note that ##\|\mathbf{x}^{(k)}(t_0)\|\leq\|A\|^k\|\mathbf{x}(t_0)\|##). One can check that this is a solution of the differential equation ##\dot{\mathbf{x}}(t)=A\,\mathbf{x}(t)##.

We now prove that ##\mathbf{x}(t)=e^{(t-t_0)A}\,\mathbf{x}(t_0)##: to be more precise, $$(\forall t,t_0\in\mathbb{R})\lim_{N\rightarrow\infty}\Bigg(\mathbf{x}(t)-\sum_{k=0}^N\frac{\mathbf{x}^{(k)}(t_0)}{k!}(t-t_0)^k\Bigg)=0.$$ By Taylor's theorem, one has that ##\mathbf{x}(t)=\Big(\sum_{k=0}^N\frac{\mathbf{x}^{(k)}(t_0)}{k!}(t-t_0)^k\Big)+\mathbf{r}_N(t)##, where $$(\exists\xi\in\mathbb{R})\,\,\mathbf{r}_N(t)=\frac{\mathbf{x}^{(N+1)}(\xi)}{(N+1)!}(t-t_0)^{N+1}$$ for some ##\xi## between ##t_0## and ##t##. Since ##\mathbf{x}^{(N+1)}(\xi)=A^{N+1}\,\mathbf{x}(\xi)##, we may take the operator norm to get $$\|\mathbf{r}_N(t)\|\leq\frac{\|A\|^{(N+1)}\|\mathbf{x}(\xi)\|}{(N+1)!}|t-t_0|^{N+1}$$ and from this one clearly sees that ##\lim_{N\rightarrow\infty}\mathbf{r}_N(t)=0##. Thus the series for ##e^{(t-t_0)A}\,\mathbf{x}(t_0)## converges pointwise to ##\mathbf{x}(t)##.

I wonder if Taylor's theorem applies in the infinite-dimensional setting with a bounded operator. If so, I think this result should generalize.

Edit: I just realized you have to modify the formula for the remainder term slightly. To be precise, you may need a different ##\xi## for each component of ##\mathbf{r}_N(t)##, although the theorem still holds.
I would have been satisfied by a simple differentiation to prove existence and a check of ##x(t_0)=x_0##, which is trivial but should have been mentioned, because the core statement is that for any ##x_0## there is a unique flow. However, it's nice to have an argument for convergence.

But what about uniqueness? You cannot conclude from the convergence of your solution that there cannot be another solution. The vector field might have branches. Uniqueness is crucial, especially for physicists!
 
  • #73
fresh_42 said:
I would have been satisfied by a simple differentiation to prove existence and check ##x(t_0)=x_0##, which is trivial, but should have been mentioned, because for any ##x_0## there is a unique flow is the core statement. However, it's nice to have an argument for convergence.

But what about uniqueness? You cannot conclude from the unique convergence of your solution, that there cannot be another solution. The vector field might have branches. Uniqueness is crucial, especially for physicists!
Doesn’t ##\lim_{N\rightarrow\infty}\mathbf{r}_N(t)=0## imply that any solution to ##\dot{\mathbf{x}}(t)=A\,\mathbf{x}(t)## with ##\mathbf{x}(t_0)=\mathbf{x}_0## is equal to the series? ##\mathbf{r}_N(t)## is just the difference ##\mathbf{x}(t)-\sum_{k=1}^N\frac{\mathbf{x}^{(k)}(t_0)}{k!}(t-t_0)^k.##
 
  • #74
suremarc said:
Doesn’t ##\lim_{N\rightarrow\infty}\mathbf{r}_N(t)=0## imply that any solution to ##\dot{\mathbf{x}}(t)=A\,\mathbf{x}(t)## with ##\mathbf{x}(t_0)=\mathbf{x}_0## is equal to the series? ##\mathbf{r}_N(t)## is just the difference ##\mathbf{x}(t)-\sum_{k=1}^N\frac{\mathbf{x}^{(k)}(t_0)}{k!}(t-t_0)^k.##
No. You could have a completely different solution. Differentiation loses constant terms, so maybe this constant term creates a different flow through the vector field. Let ##y(t)## be another solution of the differential equation, i.e. ##\dot y(t)=Ay(t)\, , \,y(t_0)=x_0.## Now why is ##x(t)=y(t)## for ##t>t_0\,?## Something like a repeller might happen at ##x_0##. I guess you would have noticed if you had used ##x_0## as required, and not simply assumed ##x(t_0)=?## in your proof. It is in fact the major part of this theorem, that the initial condition already determines the entire flow!
 
  • #75
fresh_42 said:
No. You could have a completely different solution. Differentiation loses constant terms, so maybe this constant term creates a different flow through the vector field. Let ##y(t)## be another solution of the differential equation, i.e. ##\dot y(t)=Ay(t)\, , \,y(t_0)=x_0.## Now why is ##x(t)=y(t)## for ##t>t_0\,?## Something like a repeller might happen at ##x_0##. I guess you would have noticed if you had used ##x_0## as required, and not simply assumed ##x(t_0)=?## in your proof. It is in fact the major part of this theorem, that the initial condition already determines the entire flow!
Sorry, but I don’t quite understand. If ##\mathbf{x}(t_0)=\mathbf{y}(t_0)=\mathbf{x}_0##, then ##\mathbf{x}^{(k)}(t_0)=\mathbf{y}^{(k)}(t_0)=A^k\,\mathbf{x}_0##, which gives rise to the Taylor series, which only depends on ##\mathbf{x}_0, t_0,## and ##t##. I used Taylor’s theorem to prove that any solution to ##\dot{\mathbf{x}}(t)=A\,\mathbf{x}(t)## with ##\mathbf{x}(t_0)=\mathbf{x}_0## converges pointwise to ##e^{(t-t_0)A}\,\mathbf{x}_0##, which includes both ##\mathbf{x}(t)## and ##\mathbf{y}(t)##. Unless I am missing something, this proves uniqueness of the solution I gave, because the difference between any arbitrary solution and my solution converges pointwise to 0.

Edit: corrected a typo; wrote ##\mathbf{x}_0## instead of ##A^k\,\mathbf{x}_0##.
 
  • #76
suremarc said:
Sorry, but I don’t quite understand. If ##\mathbf{x}(t_0)=\mathbf{y}(t_0)=\mathbf{x}_0##, then ##\mathbf{x}^{(k)}(t_0)=\mathbf{y}^{(k)}(t_0)=A^k\,\mathbf{x}_0##, which gives rise to the Taylor series, which only depends on ##\mathbf{x}_0, t_0,## and ##t##. I used Taylor’s theorem to prove that any solution to ##\dot{\mathbf{x}}(t)=A\,\mathbf{x}(t)## with ##\mathbf{x}(t_0)=\mathbf{x}_0## converges pointwise to ##e^{(t-t_0)A}\,\mathbf{x}_0##, which includes both ##\mathbf{x}(t)## and ##\mathbf{y}(t)##. Unless I am missing something, this proves uniqueness of the solution I gave, because the difference between any arbitrary solution and my solution converges pointwise to 0.

Edit: corrected a typo; wrote ##\mathbf{x}_0## instead of ##A^k\,\mathbf{x}_0##.
You did not prove that this series is the only possibility. You have proven: the Taylor series provides one solution, not all.
 
  • #77
fresh_42 said:
You did not prove that this series is the only possibility. You have proven: the Taylor series supplies 1 solution, not all.
The entire second half of the proof is dedicated to the proposition that any solution ##\mathbf{x}(t)## is equal to its Taylor series at ##t_0##, which only depends on ##\mathbf{x}_0, t_0,## and ##t##.
 
  • #78
suremarc said:
Sorry, but I don’t quite understand. If ##\mathbf{x}(t_0)=\mathbf{y}(t_0)=\mathbf{x}_0##, then ##\mathbf{x}^{(k)}(t_0)=\mathbf{y}^{(k)}(t_0)=A^k\,\mathbf{x}_0##, which gives rise to the Taylor series, which only depends on ##\mathbf{x}_0, t_0,## and ##t##. I used Taylor’s theorem to prove that any solution to ##\dot{\mathbf{x}}(t)=A\,\mathbf{x}(t)## with ##\mathbf{x}(t_0)=\mathbf{x}_0## converges pointwise to ##e^{(t-t_0)A}\,\mathbf{x}_0##, which includes both ##\mathbf{x}(t)## and ##\mathbf{y}(t)##. Unless I am missing something, this proves uniqueness of the solution I gave, because the difference between any arbitrary solution and my solution converges pointwise to 0.

Edit: corrected a typo; wrote ##\mathbf{x}_0## instead of ##A^k\,\mathbf{x}_0##.
My confusion comes from this part:
suremarc said:
We now prove that ##\mathbf{x}(t)=e^{(t-t_0)A}\,\mathbf{x}(t_0)##: to be more precise, $$(\forall t,t_0\in\mathbb{R})\lim_{N\rightarrow\infty}\Bigg(\mathbf{x}(t)-\sum_{k=0}^N\frac{\mathbf{x}^{(k)}(t_0)}{k!}(t-t_0)^k\Bigg)=0.$$
You prove that ##\mathbf{x}(t)## has a Taylor expansion by quoting Taylor's theorem? I didn't realize that you switched the roles of ##\mathbf{x}(t)##. In the first part you basically defined ##\mathbf{x}(t)## by the Taylor series of ##x_0e^{(t-t_0)A}##, and in the second part it is not this specific solution anymore but any solution. That ##\mathbf{x}(t)## is a solution is buried in the remainder term. You showed that the remainder term of any solution converges to the remainder term of the constructed solution, and all are named ##\mathbf{x}(t)##.

Besides my confusion - which I think has cleared up now - there is another question I have: the Taylor expansion is a local theorem. How can we know that the solution is a global one, since you assumed the existence of a Taylor expansion?
 
  • Like
Likes suremarc
  • #79
fresh_42 said:
Beside my confusion - which I think has cleared up now - there is another question I have: The Taylor expansion is a local theorem. How can we know that the solution is a global one, since you assumed the existence of a Taylor expansion?
Well, Taylor’s theorem says that for any ##t, t_0## there exists a ##\xi## between ##t## and ##t_0## such that ##\mathbf{r}_k(t)=\frac{\mathbf{x}^{(k+1)}(\xi)}{(k+1)!}(t-t_0)^{k+1}##. Actually, this only applies to the one-dimensional case, so I erred there, but you can just find the ##\xi## for each coordinate of ##\mathbf{x}##. Anyway, this means that convergence of ##\mathbf{r}_N## is only pointwise, not uniform, since it depends on ##t##. So obviously for large values of ##t-t_0## the convergence is slower, but it still converges.

Edit: corrected a typo where I put ##t_0## instead of ##\xi##
 
  • #80
suremarc said:
Well, Taylor’s theorem says that for any ##t, t_0## there exists a ##\xi## between ##t## and ##t_0## such that ##\mathbf{r}_k(t)=\frac{\mathbf{x}^{(k+1)}(\xi)}{(k+1)!}(t-t_0)^{k+1}##. Actually, this only applies to the one-dimensional case, so I erred there, but you can just find the ##\xi## for each coordinate of ##\mathbf{x}##. Anyway, this means that convergence of ##\mathbf{r}_N## is only pointwise, not uniform, since it depends on ##t##. So obviously for large values of ##t-t_0## the convergence is slower, but it still converges.

Edit: corrected a typo where I put ##t_0## instead of ##\xi##
My solution used the inverse group element. Anyway, this tiny statement is very important:

The matrix exponential function ##t\longmapsto e^{At}## is the unique solution of the matrix differential equation
$$
\dot X(t)=AX(t)\, , \,X(0)=1_{\mathbb{M}(n,\mathbb{R})}\; , \;X\, : \,\mathbb{R}\longrightarrow \mathbb{R}^{n\times n}
$$

Let no one say that nothing can be learned here. :biggrin:

Edit: The other matrix problem you solved (linear programming) is called Farkas' lemma.
 
  • Like
Likes suremarc
  • #81
Since all the uni-level questions have been solved, let me set a problem:

Let ##r/s## be any positive or negative rational number, prove that ##e^{r/s}## is an irrational number. Hint: make use of the polynomial:

\begin{align*}
P_n (x) = \frac{x^n (1-x)^n}{n!} .
\end{align*}
 
Last edited:
  • #82
I am attempting this problem, but I'd like some sort of hint, or indication I'm going down the right track.

Every ##5^n## (for ##n \geq 2##) will end in 25, hence every ##2^m## that satisfies the equation must end in 32. This occurs when ##m = 5 + 20k##, where ##k## is a non-negative integer.

I checked this for a bunch of values and only found 32 and 25. I reckon these are the only two, but I don't know how to prove it.

I need to prove that ##2^{5+20k} - 7## only has 5 as a prime factor. Does it suffice to try a bunch of divisibility tests on them and see where that takes me?
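For reference, the kind of brute-force check described above might look like the sketch below (assuming the equation under discussion is ##2^m - 7 = 5^n## with ##n \geq 1##, which matches the 32/25 example):
Python:
# search for m with 2^m - 7 equal to a pure power of 5
for m in range(3, 200):
    val = 2**m - 7
    n = 0
    while val % 5 == 0:
        val //= 5
        n += 1
    if val == 1 and n >= 1:
        print(m, n)          # only (5, 2) should appear in this range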
 
  • #83
lekh2003 said:
I am attempting this problem, but I'd like some sort of hint, or indication I'm going down the right track.

Every 5^n will end in 25, hence every 2^m that satisfies the equation must end in 32. This occurs when m = 5 + 20k, where k is a positive integer.

I checked this for a bunch of values and only found 32 and 25. I reckon these are the only two, but I don't know how to prove it.

I need to prove that 2^{5+20k} - 7 only has prime factor 5. Does it suffice to try a bunch of divisibility tests on them and see where that takes me?
The basic idea to examine remainders is correct. But it's not so easy.

You should check the solutions up to ##m=5## manually. Then consider the equation modulo 64 and conclude what ##n## must look like. Then take another modulus (17) and so on.
 
  • #84
julian said:
Let ##r/s## be any positive or negative rational number, prove that ##e^{r/s}## is an irrational number. Hint: make use of the polynomial:

\begin{align*}
P_n (x) = \frac{x^n (1-x)^n}{n!} .
\end{align*}

Further hint: There is an immediate simplification that can be made to the problem. You then move on to making use of the polynomial ##P_n (x) = x^n(1-x)^n / n!## and its derivatives. In the proof you will need to introduce another function; what function springs to mind?
 
Last edited:
  • #85
Partial answer for question 11

11.1. There is a number which occurs at least five times among the differences between two numbers of ##A##.

Yes, there must be a number which occurs at least 5 times among the differences.

Proof: If we consider the pair ##(x, y)## for difference between 2 numbers, we need not consider the reverse pair ##(y, x)## since the 2 pairs have the same absolute value for the difference. In other words, it is sufficient to consider pairs ##(x, y)## such that ##y - x > 0##. Since there are 32 distinct numbers in ##A##, there will be ##_{32}C_{2} = 496## such pairs. Let ##D = (d_1, d_2, ..., d_{496})## denote the list of differences of those pairs.

All numbers in ##A## are integers in the range ##[1, 112)##, so the differences must be integers in the range ##[1, 110]##, with the maximum difference of 110 achieved for the pair ##(1, 111)##. In other words, there are at most 110 distinct values possible for the elements of ##D##. If each of these 110 values occurs at most 4 times in ##D##, at most 440 elements of ##D## get assigned valid values, leaving ##56 > 0## elements with no value assigned (with a smaller limit on the number of repeated values, even more elements of ##D## would be left without values assigned to them). It must therefore be the case that at least one value repeats more than 4 times (i.e. repeats at least 5 times) in ##D##.
 
  • #86
Q11.2. There is a number that occurs at least six times among the differences between two numbers of ##A##.

Yes, there must be a number which occurs at least 6 times among the differences. We can prove this by showing that if no number can occur more than 5 times among differences, then a contradiction would arise.

Let ##x_1 < x_2 < \ldots < x_{31} < x_{32}## be the ordered list of 32 numbers belonging to set ##A##. The ##x_{i}##'s can be viewed as 32 distinct points on the X-axis. There are 31 pairs of adjacent points, namely ##(x_{i}, x_{i+1})##, ##1 \leq i \leq 31##. The differences corresponding to these pairs are defined by ##d_{i} = x_{i+1} - x_{i}##. By definition of ##A##, all ##x_{i}##'s are integers in the range ##[1, 111]## and therefore the ##d_{i}##'s must be integers in the range ##[1, 110]##.

Consider the sum of the differences of these 31 pairs, ##S = \sum_{i=1}^{31} d_{i}##. ##S## gets its minimum value ##S_{\text{min}}## when the values of ##d_{i}##'s are minimized subject to constraints imposed by the problem statement. If differences can take only integer values in ##[1, 110]## and no number can occur more than 5 times among differences, then ##S_{\text{min}} = 5 \times (1 + 2 + 3 + 4 + 5 + 6) + 7 = 112##. Here we use the fact that the 31 differences need to have at least 7 distinct values given that each value can occur at most 5 times and that the 6 smallest allowed values, namely 1 to 6, are the ones that should repeat the most for the sum to be minimized.

Now ##S = \sum_{i=1}^{31} d_{i} \geq 112 \Rightarrow \sum_{i=1}^{31} (x_{i+1} - x_{i}) \geq 112 \Rightarrow (x_{32}-x_{1}) \geq 112##. But this is a contradiction, because since ##1 \leq x_{1} < x_{32} \leq 111##, we must have ##1 \leq (x_{32}-x_{1}) \leq 110##. The contradiction arises because by limiting the number of times the same value can occur among the differences, we are forcing the sum of differences (and thereby the differences corresponding to non-adjacent points) to take larger values than what would be possible if smaller values were allowed to repeat more often among the differences. And by limiting to 5, we are violating the constraint on the range of the ##x_{i}##'s. Hence, it must be the case that at least one value occurs more than 5 times, i.e. at least 6 times, among the differences.
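A quick randomized spot-check of both claims (a sketch only; it samples random 32-element subsets of ##\{1,\dots,111\}## and does not replace the proof):
Python:
import random
from collections import Counter
from itertools import combinations

rng = random.Random(0)
for _ in range(1000):
    A = rng.sample(range(1, 112), 32)
    diffs = Counter(y - x for x, y in combinations(sorted(A), 2))
    assert max(diffs.values()) >= 6     # hence also >= 5
print("no counterexample in 1000 random samples")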
 
  • Like
Likes fresh_42
  • #87
Observation 1: ##2\sqrt{n} > \sqrt{n-1} + \sqrt{n+1} \ \, \forall{n \geq 1} ##
Proof: ##(\sqrt{n-1} + \sqrt{n+1})^2 = (n-1) + (n+1) + 2\sqrt{n^2 - 1} = 2n + 2\sqrt{n^2 - 1} < 2n + 2n##
Applying square root on both sides, we see that ##{\sqrt{n-1} + \sqrt{n+1}} < \sqrt{4n} \equiv 2 \sqrt{n}##

It is easy to see from the definition of ##\lfloor{a}\rfloor## that if ##x < y##, then ##\lfloor{x}\rfloor \leq \lfloor{y}\rfloor## (since the largest integer less than or equal to ##x## is also less than or equal to ##y##). Combining this property with observation 1 leads to:
##\lfloor{\sqrt{n-1} + \sqrt{n+1}}\rfloor \leq \lfloor 2\sqrt{n} \rfloor \ \forall{n \geq 1} ##

Observation 2: ##\sqrt{n-1} + \sqrt{n+1} > 2 \sqrt{n} - 1 \ \, \forall{n \geq 1} ##
Proof: ##(\sqrt{n-1} + \sqrt{n+1})^2 = 2n + 2\sqrt{(n - 1)(n+1)} \geq 2n + 2(n-1)## since ##(n-1)(n+1) \geq (n-1)^2 \ \forall n \geq 1##. This implies ##(\sqrt{n-1} + \sqrt{n+1})^2 \geq 4n - 2 > (2\sqrt{n} - 1)^2##, since ##(2\sqrt{n} - 1)^2 = 4n - (4\sqrt{n} - 1)## and ##(4\sqrt{n} - 1) > 2## when ##n \geq 1##. Applying the square root on both sides, we get ##\sqrt{n-1} + \sqrt{n+1} > 2\sqrt{n} - 1 \ \, \forall{n \geq 1}##

Combining observations (1) and (2), we get ##(2 \sqrt{n} - 1) < (\sqrt{n-1} + \sqrt{n+1}) < 2\sqrt{n}## (Eq. 1)

Suppose ##\lfloor \sqrt{n} \rfloor = m## for some ##m \in \mathbb{N}##. Then ##m^2## must be the largest perfect square less than or equal to ##n## and ##n = m^2 + a## for some integer ##a## such that ##0 \leq a \leq 2m##. We get the upper bound of ##a## by noting that if ##a## is higher even by 1, we get ##n = m^2 + 2m +1 = (m+1)^2##, making ##(m+1)^2## the largest perfect square less than or equal to ##n##, contradicting the earlier assumption.

Equivalently, we must have ##\sqrt{n} = m + \epsilon## where ##0 \leq \epsilon < 1##, with ##\epsilon = 0## corresponding to ##a = 0## and ##\epsilon## increasing as ##a## increases.

We now consider 3 different cases that cover all possible values of ##n## such that ##\lfloor \sqrt{n} \rfloor = m## and calculate ##\lfloor 2\sqrt{n} \rfloor - \lfloor{\sqrt{n-1} + \sqrt{n+1}}\rfloor## in each.

Define ##Y = \sqrt{n-1} + \sqrt{n+1}##.

Case 1: ##\epsilon = 0 \equiv a = 0##
Here, ##\sqrt{n} = m \Rightarrow 2\sqrt{n} = 2m \Rightarrow \lfloor 2\sqrt{n} \rfloor = 2m##.
Using (Eq. 1) we get ##2m - 1 < Y < 2m \Rightarrow \lfloor Y \rfloor = 2m - 1 \Rightarrow \lfloor 2\sqrt{n} \rfloor - \lfloor Y \rfloor = 2m - (2m - 1) = 1##.

Case 2: ##0 < \epsilon < 0.5##
This corresponds to ##1 \leq a \leq m##, since if ##a \geq (m+1)##, we get ##n \geq (m^2 + m + 1) > (m + 0.5)^2 \Rightarrow \sqrt{n} > m + 0.5##.

##2m < 2\sqrt{n} = (2m + 2\epsilon) < (2m+1) \Rightarrow \lfloor 2\sqrt{n} \rfloor = 2m##
Since ##a \geq 1## in this case, we get ##Y \geq \sqrt{(m^2+1-1)} + \sqrt{m^2+1+1} > 2m##. Combining with (Eq. 1), or alternatively by using the fact that ##\epsilon < 0.5##, we get ##2m < Y < 2\sqrt{n} < 2m+1 \Rightarrow \lfloor Y \rfloor = 2m##.

Hence, ##\lfloor 2\sqrt{n} \rfloor - \lfloor Y \rfloor = 2m - 2m = 0## in this case.

Case 3: ##0.5 < \epsilon < 1##
This corresponds to ##(m+1) \leq a \leq 2m##, since if ##a \geq (2m+1)##, we get ##n \geq (m^2 + 2m + 1) \Rightarrow \sqrt{n} \geq (m + 1)## which contradicts ##\lfloor \sqrt{n} \rfloor = m##. Also note that we consider ##\epsilon > 0.5## rather than ##\epsilon \geq 0.5## since ##\epsilon = 0.5## does not correspond to an integer value of ##n##.

In this case, ##2(m+0.5) < 2\sqrt{n} = 2(m + \epsilon) < 2(m+1) \Rightarrow \lfloor 2\sqrt{n} \rfloor = 2m + 1##.
Since ##a \geq (m+1)## in this case, we get ##Y \geq \sqrt{(m^2+m+1-1)} + \sqrt{m^2+m+1+1}##.

##Y^2 \geq \left(m^2 + m + m^2 + m + 2 + 2\sqrt{(m^2 + m)(m^2 + m + 2)}\right) > (2m^2 + 2m + 2 + 2(m^2 + m))##, where we use the fact that ##(m^2 + m)(m^2 + m + 2) > (m^2 + m)^2## for ##m > 0##. This yields ##Y^2 > (4m^2 + 4m + 2) > (2m + 1)^2 \Rightarrow Y > (2m+1)##.

Combining this inequality with (Eq. 1), we get ##(2m+1) < Y < 2\sqrt{n} < 2m+2 \Rightarrow \lfloor Y \rfloor = 2m+1##.

Hence, ##\lfloor 2\sqrt{n} \rfloor - \lfloor Y \rfloor = (2m +1) - (2m +1)= 0## in this case too.

Answer: Combining the observations from the 3 cases, we see that ##f(n) = 1## if ##n## is a perfect square and ##f(n) = 0## otherwise (i.e. if ##n## is natural number that is not a perfect square).
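A brute-force confirmation of this answer (a sketch, assuming ##f(n)=\lfloor 2\sqrt{n}\rfloor-\lfloor\sqrt{n-1}+\sqrt{n+1}\rfloor## as in the problem; plain floating point is accurate enough at this scale):
Python:
import math

def f(n):
    # floor(2*sqrt(n)) computed exactly via isqrt; the second term in floats
    return math.isqrt(4 * n) - math.floor(math.sqrt(n - 1) + math.sqrt(n + 1))

for n in range(1, 10**6):
    expected = 1 if math.isqrt(n)**2 == n else 0
    assert f(n) == expected, n
print("checked n up to 10^6")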
 
