Is there a typo in this theorem in Apostol or not?

zenterix · Feb 3, 2025

I mean, maybe there isn't a typo.

If ##B## is ##n\times n## then don't we have ##n## initial value problems?

##F(t)## is said to be ##n\times n##. Is not each column of ##F## a solution to one of the ##n## IVPs?

Just a little later in the book there is the following theorem

Now, ##B## is ##n\times 1## but the problem is also called an "initial value problem".

So, at a minimum, it seems the language used in the theorems is confusing. Is it also incorrect? How can ##B## have dimensions of either ##n\times n## or ##n\times 1##, and either way the problem ##Y'=AY## with ##Y(0)=B## be considered "an initial value problem"?

fresh_42 · Feb 3, 2025

It is no typo. ##A,B## are ##n\times n##-matrices. The function ##F(t)## is the parameterization of a curve through a space of these matrices.

zenterix · Feb 3, 2025

I never really understood why there are these two separate theorems in the book.

fresh_42 · Feb 3, 2025

It's a bit unfortunate to use the same letter ##B## for a matrix and a vector, but the principles are the same, only that ##F(t)## is a curve in a matrix space and ##Y(t)## is a curve in a vector space.

zenterix · Feb 3, 2025

Okay, let me try to go through why there are the two theorems in steps here.

Ok, so ##A## is definitely ##n\times n##. Is ##F## ##n\times 1##?

Right after the above, there is a proof that for any ##n\times n## matrix ##A## and any scalar ##t## we have ##e^{tA}e^{-tA}=I##, proving that ##e^{tA}## is nonsingular.
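A quick numerical sanity check of ##e^{tA}e^{-tA}=I##, as a sketch only: the matrix ##A## and the value of ##t## below are arbitrary choices, the helper names are made up for this illustration, and the exponential is approximated by a truncated power series rather than taken from the book or a library.

```python
def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def mat_exp(A, t, terms=40):
    """Approximate e^{tA} by the truncated series sum_k (tA)^k / k!."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]                                # (tA)^0 / 0!
    for k in range(1, terms):
        # term_k = term_{k-1} * A * t / k
        term = [[t * v / k for v in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


A = [[1.0, 2.0], [0.0, -1.0]]  # arbitrary example matrix (an assumption)
t = 0.7                        # arbitrary scalar (an assumption)

# Should be the 2x2 identity up to rounding error.
product = mat_mul(mat_exp(A, t), mat_exp(A, -t))
print(product)
```

A check like this proves nothing, but it shows concretely that ##e^{-tA}## acts as the inverse of ##e^{tA}##.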

Then we have the theorem

which, as far as I understand it, is like having ##n## initial value problems.

Then there is the following result

and

So here they have a vector ##Y##. Was ##F## a vector or a matrix, previously, in ##F'=AF##?

Finally there is the theorem that I showed in the OP

fresh_42 · Feb 3, 2025

Here is an article about it:
https://www.physicsforums.com/insig...onship-between-integration-and-eulers-number/

It does not replace your book since it contains no proofs, but maybe it can give you some insights.

fresh_42 · Feb 3, 2025

##F## is no vector, it's a curve. E.g. ##F\, : \,[0,1]\longrightarrow \mathbb{M}(2,\mathbb{R})## defined by
$$
F(t)=\begin{pmatrix}e^t&ct\\0&e^{-t}\end{pmatrix}
$$
would be such a function, with ##F(0)=B=I.##

zenterix · Feb 3, 2025

Okay, but why are there the two theorems? What makes them so different?

fresh_42 · Feb 3, 2025

zenterix said:

Okay, but why are there the two theorems? What makes them so different?

I have no idea, and I don't have the book. I assume the proofs are very similar. Or you use the matrix theorem and simply apply it with a vector in the equation.

FactChecker · Feb 3, 2025

zenterix said:

Okay, but why are there the two theorems? What makes them so different?

The proof of the second theorem should give you some idea of why it is or is not a consequence of the first theorem.

mathwonk · Feb 3, 2025

Here's my take on it, without having the book at hand. (I sometimes wish I hadn't given all these great books away.)

An initial value problem is any problem where the value of your desired solution is specified at t=0, the initial time. If your function is matrix valued, the initial value is a matrix; if it is vector valued, the initial value is a vector; if it is real valued, the initial value is a number. If you think everything should be a number, then to you a vector valued initial value problem is n initial values, and a matrix valued one is n^2 initial values. One reason for introducing vectors and matrices is to be able to think of n numbers, or a square array of n^2 numbers, as one object. When I was a young student, it was very challenging to me to have my book just write f(x), which meant sometimes a number, sometimes a vector, and sometimes a matrix. Your book is using B purposely for a matrix and for a vector because a vector is the special case of a column matrix. In Theorem 7.7 he wants you to remember that previously, in 7.5, the same letter B was a matrix, and you should realize this later result is a specialization of the earlier one.

I.e. Theorem 7.7 is just a special case of Theorem 7.5. It is stated separately because it is in this version that many books state the result. This shows you that the usual vector version is a special case of the general matrix version. They could also be stated in the other order, and one could then observe that the more common vector version has a generalization, with essentially the same proof, to a theorem about matrices. Other books only give the result for real valued functions, and then these are two further generalizations of that more elementary result, all with essentially the same proof. There is no doubt an infinite dimensional Banach space version, with the letters A and B representing linear transformations or elements of a Banach space. For this see Lang or Dieudonné.

mathwonk · Feb 4, 2025

The original theorems 7.5 and 7.7 are almost trivial, as the proof in Apostol shows. I.e. existence, or the fact that e^tA solves u' = Au, is just a matter of knowing how to differentiate the exponential function. The uniqueness is the same argument used in beginning calculus to show that the only functions with derivative zero are constants. I.e. assuming u' = au, look at h = u/e^at. By the quotient (or product) rule we get h' = [u'e^at - uae^at]/e^2at = [aue^at - uae^at]/e^2at = 0. Hence h = constant c, so u = c.e^at, and c is determined by the initial condition.
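The same computation can be sketched for the matrix case. The only extra ingredient assumed here is that ##A## commutes with ##e^{-tA}##, which holds because ##e^{-tA}## is a power series in ##A##. Suppose ##u'=Au## with ##u(0)=B##, and set ##h(t)=e^{-tA}u(t)##. Then
$$
h'(t) = -Ae^{-tA}u(t) + e^{-tA}u'(t) = -e^{-tA}Au(t) + e^{-tA}Au(t) = 0 .
$$
So ##h## is constant, equal to ##h(0)=B##, and multiplying by ##e^{tA}## gives ##u(t)=e^{tA}B##. Division by ##e^{at}## is replaced by multiplication by ##e^{-tA}##, which is where ##e^{tA}e^{-tA}=I## is used. Nothing in this sketch depends on whether ##u## and ##B## are ##n\times n## or ##n\times 1##.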

But the question was why they are both stated separately, for which I stick with pedagogy, i.e. learning is facilitated by repetition, in particular it is useful to see how to specialize general statements. The OP is thus to be commended for seeing that the statements are in theory needlessly repetitious.

If you are interested in more insight into the proofs in general, the usual argument for existence is by a sequence of approximations, which can also be used to see uniqueness. I.e. given a (possibly time dependent) vector field f(t,x) (where x is an arbitrary element of some vector space E), and hence a differential equation u'(t) = f(t,u(t)) with initial condition u(0) = b, where u is a function from the reals to the vector space E, then u solves the differential equation if and only if it solves the integral equation u(t) = b + integral from 0 to t, of f(s,u(s))ds.

Hence u is a solution if and only if u is a "fixed point" of the operator H taking any function u to the function H(u) = b + integral from 0 to t, of f(s,u(s))ds.

Moreover, with appropriate conditions on the domain and on f, this operator H is a "contraction" (of the space of functions u from some t-interval to E), hence does have a unique fixed point. I.e. the contraction H gradually squeezes the whole space down to that one point, which itself is left fixed, and this fixed point is the solution to both the integral and differential equation. Thus one can begin from any point u0 at all in the function space, and repeated application of the operator H will give a sequence u0, H(u0) = u1, H(u1) = u2,... converging to the fixed point.

In the simple case of f(t,u(t)) = A.u(t), and u(0) = B, we get the operator H taking a function u to the function H(u) = [B + integral from 0 to t, of A.u(s)ds].

Beginning from the constant function u0 = B, and applying the operator H repeatedly, we get the sequence of approximations:
u0 = B,
u1 = H(u0) = H(B) = B + ABt,
u2 = H(u1) = H(B+ABt) = B + ABt + A^2B.t^2/2,
u3 = H(u2) = H(B+ABt+A^2B.t^2/2) = B + ABt + A^2B.t^2/2 + A^3B.t^3/3!,
.......
This is the famous exponential sequence, converging to (e^At).B = (e^tA).B.
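The iteration can be tried out numerically. The sketch below stores each approximation as a list of coefficient matrices (the coefficient of t^j at index j), applies H term by term, and compares the result with a case where e^tA is known in closed form. The choice A = [[0,1],[-1,0]], B = (1,0) is an assumption made for the illustration (there e^tA.B = (cos t, -sin t)), and the function names are made up.

```python
import math


def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def picard_step(A, B, coeffs):
    """Apply H(u)(t) = B + integral_0^t A u(s) ds to a polynomial u.

    coeffs[j] is the matrix coefficient of t^j in u(t).
    """
    new = [B]
    for j, C in enumerate(coeffs):
        AC = mat_mul(A, C)
        # integral of A*C*s^j from 0 to t is A*C*t^(j+1)/(j+1)
        new.append([[v / (j + 1) for v in row] for row in AC])
    return new


def evaluate(coeffs, t):
    """Evaluate the matrix polynomial sum_k coeffs[k] * t^k at time t."""
    rows, cols = len(coeffs[0]), len(coeffs[0][0])
    return [[sum(C[i][j] * t ** k for k, C in enumerate(coeffs))
             for j in range(cols)] for i in range(rows)]


A = [[0.0, 1.0], [-1.0, 0.0]]  # example matrix with A^2 = -I (an assumption)
B = [[1.0], [0.0]]             # n x 1 initial value (an assumption)

u = [B]                        # u0 = B, the constant function
for _ in range(25):            # u1, u2, ..., u25
    u = picard_step(A, B, u)

t = 1.0
approx = evaluate(u, t)
exact = [[math.cos(t)], [-math.sin(t)]]  # e^{tA} B for this particular A, B
print(approx, exact)
```

Replacing B by a 2 x 2 matrix runs unchanged, which is the point of the thread: the iteration does not care whether the initial value is a vector or a matrix.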

Here we can take B in any Banach space E, i.e. E any vector space of any dimension, even infinite, which has a notion of "length" or "norm" for which convergence of Cauchy sequences occurs, and A any continuous (i.e. "bounded") linear transformation on E.

Technical note: An interesting subtlety arises here that I got wrong in an earlier comment (since deleted). Namely, in order to guarantee convergence of the sequence {H(uj)} we must let H act on a space of merely continuous functions u from a t-interval to a closed bounded ball in E. I.e. a uniform limit of smooth functions may not be smooth in general. But in this case, even if u is merely assumed continuous, but also a fixed point of H, we have u = H(u). Then since H(u) is an integral of a continuous integrand, H(u) must be smooth by the fundamental theorem of calculus, hence u = H(u) forces the fixed point u to also be smooth.
[I have not defined a contraction, but it is an operator H which in particular is continuous, so it follows that if u is the limit of {uj}, then H(u) is the limit of {H(uj)}, which is the same sequence shifted by one term, hence u = H(u). The definition of contraction moreover forces the sequence {H(uj)} to be Cauchy, hence convergent under our assumptions.

Defn: H is a contraction of the metric space S (with distance function d) iff H:S-->S and there is a constant c, with 0 < c < 1, such that for every pair of points x,y in S we have d(H(x),H(y)) ≤ c.d(x,y). E.g. this holds if the distance between any pair of points is cut in half by applying H.]
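A toy instance of the definition, as a sketch: on the real line with d(x,y) = |x - y|, the map H(x) = x/2 + 1 cuts every distance exactly in half (so c = 1/2) and has the single fixed point 2. This map and the starting points are assumptions chosen for illustration; they are not the operator H on function spaces discussed above.

```python
def H(x):
    """A contraction of the real line with constant c = 1/2; H(2) = 2."""
    return 0.5 * x + 1.0


def iterate(f, x0, steps):
    """Return [x0, f(x0), f(f(x0)), ...] with `steps` applications of f."""
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs


xs = iterate(H, 10.0, 40)   # start far to the right of the fixed point
ys = iterate(H, -5.0, 40)   # a different starting point

# Distances between successive iterates shrink by the factor c = 1/2,
# which is what makes the sequence Cauchy.
gaps = [abs(b - a) for a, b in zip(xs, xs[1:])]
print(xs[-1], ys[-1], gaps[:4])
```

Both starting points are driven to the same limit, mirroring the remark that one can begin from any u0 at all.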

Office_Shredder · Feb 11, 2025

The way I would think about this is that 7.7 in some way implies 7.5, in that 7.7 is the normal initial value problem you are thinking about, and you can cast the 7.5 version as:
Given ##n\times n## matrices ##A## (fixed) and ##x## (arbitrary), the map ##x\to Ax## is linear. So there exists some ##n^2\times n^2## matrix ##A'## such that if you treat ##x## as a vector of length ##n^2##, you can rewrite ##Ax## as ##A'x##. Then you can solve the initial value problem ##y'=A'y##, ##y(0)=B## with ##y=e^{A't}B##.

That's great, but the exponential we just computed is an ##n^2\times n^2## matrix. It's not obvious that ##e^{A't}## acting on vectors of length ##n^2## is the same as ##e^{At}## acting on ##n\times n## matrices. It turns out it is, I guess, but you have to think about how to show it.
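One way to see it, as a sketch: if the columns of ##x## are stacked on top of each other (an assumption about how ##x## is turned into a vector), then ##A'## is block diagonal with ##n## copies of ##A##, so every power of ##A'##, and hence ##e^{A't}##, is block diagonal with the corresponding blocks built from ##A##. The Python below checks this numerically for one arbitrary ##2\times 2## example. The helper names are made up for the illustration, and the exponential is a truncated power series rather than a library routine.

```python
def mat_mul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def mat_exp(A, t, terms=40):
    """Approximate e^{tA} by the truncated series sum_k (tA)^k / k!."""
    n = len(A)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[t * v / k for v in row] for row in mat_mul(term, A)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def vec(X):
    """Stack the columns of X into one flat list (column 0 first)."""
    return [X[i][j] for j in range(len(X[0])) for i in range(len(X))]


def block_diag_copies(A, m):
    """Return the block-diagonal matrix with m copies of A on the diagonal."""
    n = len(A)
    big = [[0.0] * (n * m) for _ in range(n * m)]
    for b in range(m):
        for i in range(n):
            for j in range(n):
                big[b * n + i][b * n + j] = A[i][j]
    return big


A = [[1.0, 2.0], [0.0, -1.0]]   # arbitrary example (an assumption)
B = [[1.0, 0.5], [2.0, -1.0]]   # arbitrary n x n initial value (an assumption)
t = 0.3

A_big = block_diag_copies(A, len(A))   # plays the role of A'

# e^{A't} applied to the stacked vector of B ...
lhs = [row[0] for row in mat_mul(mat_exp(A_big, t), [[v] for v in vec(B)])]
# ... versus stacking the columns of e^{At} B.
rhs = vec(mat_mul(mat_exp(A, t), B))
print(lhs, rhs)
```

With a different stacking convention (rows instead of columns) ##A'## would no longer be block diagonal in this simple form, so the sketch depends on that choice.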

pasmith · Feb 11, 2025

It's even easier than that. Think about how matrix multiplication is defined.

For ##\dot x = Ax##, ##x## can be any size of matrix for which ##Ax## is defined and ##Ax## is the same size as ##x##, i.e. ##x## must be ##n \times m##. The ##k##th column of ##Ax## depends on each row of ##A## but only on the ##k##th column of ##x##, so we are in effect solving ##m## independent initial value problems for an ##n \times 1## matrix.

For ##\dot x = xA##, ##x## must be ##m \times n##. This time, the ##k##th row of ##xA## depends only on the ##k##th row of ##x##, so we are solving ##m## independent initial value problems for a ##1 \times n## matrix.
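Written out for the first case, as a sketch in notation that is not taken from the book: let ##x_1,\dots,x_m## denote the columns of ##x## and ##b_1,\dots,b_m## the columns of the initial value ##B##. Then
$$
Ax = A\begin{pmatrix}x_1 & \cdots & x_m\end{pmatrix} = \begin{pmatrix}Ax_1 & \cdots & Ax_m\end{pmatrix},
$$
so ##\dot x = Ax##, ##x(0)=B## is the same as ##\dot x_k = Ax_k##, ##x_k(0)=b_k## for ##k=1,\dots,m##. Each of these has the solution ##x_k(t)=e^{tA}b_k##, which reassembles to ##x(t)=e^{tA}B##. Taking ##m=n## gives the matrix form of the theorem and ##m=1## the vector form.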
