# Intermediate Math Challenge - June 2018

member 587159
Could you do that?

I edited my post. The solution is now given implicitly. I hope that I didn't make any mistakes, as I have to admit I was too lazy to check my steps.

I also hope an implicit solution is sufficient, and that my solution is simplified enough.

I like Serena
Homework Helper
Correct. It also works for other rings of the form ##\mathbb{Z}[\sqrt{-p}]##. But it is not the example in c).

I should have written ##\mathbb Q[\sqrt{-3}]##, which are the Eisenstein Integers, instead of ##\mathbb Z[\sqrt{-3}]##.
Then it is the example of (c) isn't it?

Correct ... in a way, but you could have been a bit less sloppy here.

True. I made a mistake there - I'm not used to a and b being divided by 2 and wrote as if they weren't.

I know that it is no big deal and implicitly covered by what you actually wrote, but I think we should be less sloppy here, since sloppiness must be earned.

Consider me on the road of trying to earn my sloppiness.

fresh_42
Mentor
I should have written ##\mathbb Q[\sqrt{-3}]##, which are the Eisenstein Integers, instead of ##\mathbb Z[\sqrt{-3}]##.
Then it is the example of (c) isn't it?
No. The example in c) is all numbers ##\frac{1}{2}(a+b\sqrt{-3})## with an even sum ##a+b## which can also be odd+odd, and the only denominator is ##2##. O.k. you caught me being sloppy, as I didn't mention ##a,b \in \mathbb{Z}##. I thought this was clear by demanding ##a+b## being even and the entire thing being a ring.
##\mathbb{Q}[\sqrt{-3}]## is a field and all elements apart from zero are units.
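A small sanity check of the construction described above, as a sketch only: elements ##\frac{1}{2}(a+b\sqrt{-3})## are stored as integer pairs ##(a,b)## with ##a+b## even, and the code tests closure under multiplication on a finite sample. The helper names `in_ring` and `multiply` are made up for this illustration, and the unit count relies on the standard fact (not stated in the thread) that the units are exactly the elements of norm ##\frac{1}{4}(a^2+3b^2)=1##.

```python
from itertools import product

# An element (a + b*sqrt(-3))/2 is stored as the integer pair (a, b).
def in_ring(a, b):
    """Membership test from the thread: a, b integers with a + b even."""
    return (a + b) % 2 == 0

def multiply(x, y):
    """Product of (a + b*sqrt(-3))/2 and (c + d*sqrt(-3))/2, as a pair."""
    (a, b), (c, d) = x, y
    num_a = a * c - 3 * b * d      # numerator of the rational part, over 4
    num_b = a * d + b * c          # numerator of the sqrt(-3) part, over 4
    # Both numerators must be even so that the result again has denominator 2.
    assert num_a % 2 == 0 and num_b % 2 == 0
    return num_a // 2, num_b // 2

# Finite sample of ring elements (hypothetical range, chosen for illustration).
elements = [(a, b) for a, b in product(range(-4, 5), repeat=2) if in_ring(a, b)]

# Closure: every product of two sample elements satisfies the parity condition.
closed = all(in_ring(*multiply(x, y)) for x in elements for y in elements)

# Norm of (a + b*sqrt(-3))/2 is (a^2 + 3 b^2)/4; units are the norm-1 elements.
units = sorted((a, b) for a, b in elements if a * a + 3 * b * b == 4)
```

Only finitely many units show up here, in contrast to the field ##\mathbb{Q}[\sqrt{-3}]##, where every nonzero element is a unit.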
True. I made a mistake there - I'm not used to a and b being divided by 2 and wrote as if they weren't.
Yes, it's a bit of an unusual construction, which I found charming because of it.
Consider me on the road of trying to earn my sloppiness.
You're welcome. Reading your proofs, I got the impression that many things seem obvious to you where others need to think a bit. That's good and promising, but don't forget the less gifted among us who are still learning. Meanwhile I have the impression that algebraic structures alone are considered difficult on PF even if they are not. Rings, groups and algebras are apparently not ranked highly on physicists' schedule.

Gold Member
I edited my post. The solution is now given implicitly. I hope that I didn't make any mistakes, as I have to admit I was too lazy to check my steps.

I also hope an implicit solution is sufficient, and that my solution is simplified enough.

Your solution is basically correct but there are a few missing things. So, I'll give my solution too, in order to demonstrate these points.

We can do the transformation ##2x - 4y + 6 = u##, ##x + y - 3 = v##. Differentiating gives
##2dx - 4dy = du##, ##dx + dy = dv##.
From these we have ##dx = \frac{4dv + du}{6}##, ##dy = \frac{2dv - du}{6}##.
So, the initial differential equation which we are asked to solve becomes ##u\frac{4dv + du}{6} + v\frac{2dv - du}{6} = 0## or ##(u - v)du + (2v + 4u)dv = 0##. This last equation is homogeneous and can be written as ##(\frac{u}{v} - 1)du + (4\frac{u}{v} + 2)dv = 0##. Now, we utilize the transformation ##\frac{u}{v} = w## or ##u = wv## so ##du = wdv + vdw##. So, our last equation above becomes (after doing some math),

##(w - 1)vdw = -(w +1)(w + 2)dv## or ##\frac{w - 1}{(w + 1)(w + 2)}dw = - \frac{dv}{v}## (##w \neq -1##, ##w \neq -2##).
So, ##\int_{}^{} \frac{w - 1}{(w + 1)(w + 2)}dw = - \int_{}^{} \frac{dv}{v} + c_1##. We calculate the integrals and we have

##\ln{\lvert \frac{(w + 2)^3}{(w + 1)^2}\rvert} = \ln{\lvert \frac{1}{c_2 v} \rvert}## ##\space\space\space## (##c_1 = -\ln{c_2}##) or
##\lvert c_2 v(w + 2)^3 \rvert = (w + 1)^2##. Substituting back for ##w## we have

##\lvert c_2 v(\frac{u}{v} + 2)^3 \rvert = (\frac{u}{v} + 1)^2## or
##\lvert c_2 (u + 2v)^3 \rvert = (u + v)^2## so
##\lvert \frac{8}{9}c_2(2x - y)^3 \rvert = (x - y + 1)^2 ## ##(1)##
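For readers who want the integration step spelled out, the left-hand integral used above follows from a partial fraction decomposition. A short worked version (the symbols ##A## and ##B## are introduced here only for the decomposition):

```latex
\begin{align*}
\frac{w-1}{(w+1)(w+2)} &= \frac{A}{w+1} + \frac{B}{w+2}, \qquad
A = \left.\frac{w-1}{w+2}\right|_{w=-1} = -2, \quad
B = \left.\frac{w-1}{w+1}\right|_{w=-2} = 3 \\
\int \frac{w-1}{(w+1)(w+2)}\,dw &= -2\ln\lvert w+1\rvert + 3\ln\lvert w+2\rvert
 = \ln\left\lvert \frac{(w+2)^3}{(w+1)^2} \right\rvert
\end{align*}
```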
If ##w = -1## or ##w = -2## we have ##u = -v##, ##u = -2v## respectively so ##x - y + 1 = 0##, ##2x - y = 0## ##(2)##.
Together, ##(1)## and ##(2)## give the set of solutions of the differential equation we were asked to solve.
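A quick way to double-check the result is sketched below, under the assumption that the equation has the form ##u\,dx + v\,dy = 0## with ##u = 2x - 4y + 6## and ##v = x + y - 3##, as in the substitution above. Relation ##(1)## says that ##F(x,y) = (x-y+1)^2/(2x-y)^3## is constant up to sign along a solution, so its gradient should be parallel to ##(u, v)##. The name `F`, the hand-computed gradient and the sample points are assumptions of this sketch, not part of the original post.

```python
from fractions import Fraction

def ode_coeffs(x, y):
    """Coefficients (u, v) of the equation u dx + v dy = 0 from the post."""
    return 2 * x - 4 * y + 6, x + y - 3

def grad_F(x, y):
    """Hand-computed gradient of F = (x - y + 1)^2 / (2x - y)^3."""
    a = x - y + 1
    b = 2 * x - y
    dF_dx = (2 * a * b - 6 * a * a) / b**4
    dF_dy = (-2 * a * b + 3 * a * a) / b**4
    return dF_dx, dF_dy

def gradient_parallel(x, y):
    """True if grad F is parallel to (u, v), i.e. level sets of F solve the ODE."""
    u, v = ode_coeffs(x, y)
    dF_dx, dF_dy = grad_F(x, y)
    return dF_dx * v - dF_dy * u == 0

def solves_along(point, direction):
    """True if moving from `point` in `direction` satisfies u dx + v dy = 0."""
    u, v = ode_coeffs(*point)
    dx, dy = direction
    return u * dx + v * dy == 0

# Arbitrary sample points with 2x - y != 0, in exact rational arithmetic.
points = [(Fraction(1), Fraction(5)),
          (Fraction(-3, 2), Fraction(7, 3)),
          (Fraction(4), Fraction(1, 5))]
general_ok = all(gradient_parallel(x, y) for x, y in points)

# The two straight-line solutions in (2).
ts = [Fraction(k, 3) for k in range(-6, 7)]
line1_ok = all(solves_along((t, t + 1), (1, 1)) for t in ts)   # x - y + 1 = 0
line2_ok = all(solves_along((t, 2 * t), (1, 2)) for t in ts)   # 2x - y = 0
```

Exact fractions are used so that the checks are equalities rather than floating-point comparisons.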

StoneTemplePython
Gold Member
Since the month is up and problem 1 has been solved by @tnich who also inquired about my solution, I've dropped it in below. The main idea is to interpret the matrix in terms of a directed graph and build a basis off an easily interpreted walk. That's it.

It looks a lot longer than that because there's a bit of book-keeping at the end.

solve the easier problem:

##\mathbf X:= \mathbf A + \mathbf I##

We can view this directly as a change of variables for the minimal polynomial

## \mathbf A^{m} + \binom{m-1}{1}\mathbf A^{m-1} + \binom{m-1}{2}\mathbf A^{m-2} + \binom{m-1}{3}\mathbf A^{m-3} +...+ \mathbf A = \Big(\big(\mathbf A + \mathbf I\big) - \mathbf I \Big)\Big(\mathbf A +\mathbf I\Big)^{m-1} = \Big(\mathbf X - \mathbf I \Big)\Big(\mathbf X\Big)^{m-1}##

## = \Big(\mathbf X\Big)^{m-1} \Big(\mathbf X - \mathbf I \Big) ##

The rest of this problem ignores ##\mathbf A## and talks in terms of ##\mathbf X##

So, we focus on

##\mathbf X = \left[\begin{matrix}
1- p_{} & 1 - p_{}& 1 - p_{} & \dots &1 - p_{} &1 - p_{} & 1 - p_{}
\\p_{} & 0&0 &\dots & 0 & 0 & 0
\\0 & p_{}&0 &\dots & 0 & 0 & 0
\\0 & 0& p_{}&\dots & 0 & 0 & 0
\\0 & 0&0 & \ddots & 0&0 &0
\\0 & 0&0 & \dots & p_{}&0 &0
\\0 & 0&0 & \dots &0 & p_{} & p_{}
\end{matrix}\right]##

notice that ##\mathbf X## is the matrix associated with a (column stochastic) Markov chain.
- - - -

Plan of Attack:

Step 1:
Easiest thing first: a quick scan of the diagonal tells us that ##\text{trace}\big(\mathbf X\big) = (1-p) + p = 1##. The second step will confirm that the only possible eigenvalues of ##\mathbf X## are zero and one; hence the trace tells us that the eigenvalue 1 has algebraic multiplicity one and the eigenvalue 0 has algebraic multiplicity ##m -1##.

Step 2:
show that ##\mathbf X## is annihilated by a polynomial whose only roots are zero and one, so those are the only possible eigenvalues of ##\mathbf X##. In particular we are interested in the polynomial

##\mathbf X^m - \mathbf X^{m-1} = \mathbf X^{m-1}\big(\mathbf X - \mathbf I\big) = \mathbf 0##
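Before the proof, the claims so far can be checked on a concrete instance. The following is only a sketch: the helpers `build_X`, `matmul` and `matpow` are made-up names, and ##m = 7##, ##p = 1/3## are arbitrary sample values. Exact fractions avoid rounding issues.

```python
from fractions import Fraction

def build_X(m, p):
    """Return the m x m matrix X from the post as a list of rows (m >= 2)."""
    X = [[Fraction(0)] * m for _ in range(m)]
    for j in range(m):
        X[0][j] = 1 - p          # first row: 1 - p in every column
    for i in range(1, m):
        X[i][i - 1] = p          # subdiagonal entries: p
    X[m - 1][m - 1] = p          # extra p in the bottom-right corner
    return X

def matmul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(A, e):
    """A raised to the non-negative integer power e by repeated multiplication."""
    n = len(A)
    R = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for _ in range(e):
        R = matmul(R, A)
    return R

m, p = 7, Fraction(1, 3)         # sample values, chosen for illustration
X = build_X(m, p)

column_sums = [sum(X[i][j] for i in range(m)) for j in range(m)]
trace = sum(X[i][i] for i in range(m))
annihilated = matpow(X, m) == matpow(X, m - 1)
```

Here `column_sums` confirms that the matrix is column stochastic, `trace` matches Step 1, and `annihilated` is the relation ##\mathbf X^m = \mathbf X^{m-1}## for this one instance; it is of course no substitute for the proof.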

with a slight refinement at the end we show that the above annihilating polynomial must in fact be the minimal polynomial.

- - - - -
Proof of the annihilating polynomial

note that ##\mathbf X## is a (column stochastic) matrix for a Markov chain. The idea here is to make use of the graph structure and a well chosen basis.

We seek to prove here that

##\mathbf X^m - \mathbf X^{m-1} = \mathbf X^{m-1}\big(\mathbf X - \mathbf I\big) = \big(\mathbf X -0\mathbf I\big)^{m-1}\big(\mathbf X - \mathbf I\big) = \mathbf 0##

or equivalently that
##\mathbf X^m = \mathbf X^{m-1}##

We proceed to do this one vector at a time by showing
## \mathbf s_k^T \mathbf X^m = \mathbf s_k^T \mathbf X^{m-1}##
for ##m## well chosen row vectors.

first, consider
##\mathbf s_m := \mathbf 1##

the ones vector is a left eigenvector of ##\mathbf X##, with eigenvalue of 1, which gives us

##\mathbf 1^T= \mathbf 1^T \mathbf X = \mathbf 1^T \mathbf X^2 = \mathbf 1^T \mathbf X^3 = ... = \mathbf 1^T \mathbf X^{m-1} = \mathbf 1^T \mathbf X^{m}##

This is so easy to work with, it suggests that we may be able to build an entire proof by using it as a base case of sorts.

The key insight is to then view everything in terms of the underlying directed graph. Let's consider reversing the transition diagram associated with the Markov chain, and ignore the re-normalization that would typically be done when trying to reverse a Markov chain. (Equivalently, we're just transposing our matrix and no longer calling it a Markov chain.)
- - - -
edit: the right way to state this is to say we are not using our matrix as a transition matrix but instead as an expectations operator (which is naturally given by the transpose).
- - - -

for illustrative purposes, consider the ##m :=7## case. Literally interpreted, our backwards walk has a transition diagram that looks like this (note: for graphics formatting, ##1-p## was not rendering properly, so ##q:= 1-p## has been used in the diagram):

[Figure: transition diagram of the backwards walk for ##m = 7##; image not available]

But there is some special structure about dealing with the ones vector (equivalently, being in a uniform un-normalized distribution). And looking at the above diagram we can see that node 7 has rather different behavior than the others, so let's ignore it for the moment.

With just a small bit of insight, we can re-interpret the interesting parts of the state diagram as

[Figure: re-interpreted state diagram; image not available]

If this is not clear to people, I'm happy to discuss more. If one 'gets' this state diagram, and recognizes that the above states can form a basis, the argument that follows for how long it takes to 'wipe out' the embedded nilpotent matrix becomes stunningly obvious. The below symbol manipulation is needed to make the argument complete, but really the above picture is the argument for the minimal polynomial.

- - - - -
Some bookkeeping to finish it off:

With these ideas in mind, now consider standard basis vectors ##\mathbf e_i##

##\mathbf s_i := \mathbf e_i## for ##i \in \{1, 2, ..., m-1\}##

the end goal is to evaluate each vector in

##\mathbf S^T =
\bigg[\begin{array}{c|c|c|c|c}
\mathbf e_1 & \mathbf e_2 &\cdots & \mathbf e_{m-1} & \mathbf 1
\end{array}\bigg]^T##

(recall: we have already done it for the ones vector)

This ##m \times m## matrix is triangular with ones on the diagonal and hence invertible. After iterating through the following argument, we'll have

##\mathbf S^T \mathbf X^{m} = \mathbf S^T \mathbf X^{m-1}##
or
##\mathbf X^{m} = \big(\mathbf S^T\big)^{-1} \mathbf S^T \mathbf X^{m} = \big(\mathbf S^T\big)^{-1}\mathbf S^T \mathbf X^{m-1} = \mathbf X^{m-1}##

the remaining steps thus are to confirm the relation holds for ##\mathbf e_i##

for ##i=1## we have

## \mathbf e_1^T \mathbf X = (1-p) \mathbf 1^T##

and

## \mathbf e_1^T \mathbf X^2 = (1-p) \mathbf 1^T\mathbf X = (1-p) \mathbf 1^T##

hence

##\mathbf e_1^T \mathbf X = \mathbf e_1^T \mathbf X^2##

multiplying each side by ##\mathbf X^{m-2}## gives the desired result.

To chain onto this result, we follow the graph: for ## 2 \leq r \leq m-1##

##\mathbf e_{r}^T \mathbf X = (p) \mathbf e_{r-1}^T##

hence
##\mathbf e_{r}^T \mathbf X^2 = (p)^2 \mathbf e_{r-2}^T##
##\vdots##
##\mathbf e_{r}^T \mathbf X^{r-1} = (p)^{r-1} \mathbf e_{1}^T##
##\mathbf e_{r}^T \mathbf X^r = (p)^{r-1}(1-p) \mathbf 1^T##

and

##\mathbf e_{r}^T \mathbf X^{r +1} = \big(\mathbf e_{r}^T \mathbf X^r\big) \mathbf X = \big((p)^{r-1}(1-p) \mathbf 1^T\big) \mathbf X = (p)^{r-1}(1-p) \big(\mathbf 1^T \mathbf X\big) = \big((p)^{r-1}(1-p) \mathbf 1^T\big) = \mathbf e_{r}^T \mathbf X^r ##

if ##r = m-1## we have the desired equality.
- - - -
now to consider the ##r \lt m-1## case,
we right multiply each side by ##\mathbf X^{(m-1) -r}## and get

## \mathbf e_{r}^T \mathbf X^{m}= \big(\mathbf e_{r}^T \mathbf X^{r+1}\big)\mathbf X^{(m-1) -r} = \big(\mathbf e_{r}^T \mathbf X^{r}\big)\mathbf X^{(m-1) -r} = \mathbf e_{r}^T \mathbf X^{m-1}##
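The walk along the graph can be traced numerically. The sketch below follows ##\mathbf e_r^T \mathbf X^k## for ##k = 0, \dots, m## and checks the three relations used above: the walk sits on node 1 after ##r-1## steps, becomes a multiple of the ones vector after ##r## steps, and no longer changes after that. Helper names (`build_X`, `row_times`, `walk`) and the sample values ##m = 7##, ##p = 1/3## are assumptions of this illustration.

```python
from fractions import Fraction

def build_X(m, p):
    """Return the m x m matrix X from the post as a list of rows (m >= 2)."""
    X = [[Fraction(0)] * m for _ in range(m)]
    for j in range(m):
        X[0][j] = 1 - p          # first row: 1 - p in every column
    for i in range(1, m):
        X[i][i - 1] = p          # subdiagonal entries: p
    X[m - 1][m - 1] = p          # extra p in the bottom-right corner
    return X

def row_times(v, X):
    """Row vector times matrix: returns v^T X as a list."""
    n = len(X)
    return [sum(v[i] * X[i][j] for i in range(n)) for j in range(n)]

def walk(r, m, p):
    """Return [e_r^T X^k for k = 0..m], with r counted from 1."""
    X = build_X(m, p)
    v = [Fraction(int(i == r - 1)) for i in range(m)]
    steps = [v]
    for _ in range(m):
        v = row_times(v, X)
        steps.append(v)
    return steps

m, p = 7, Fraction(1, 3)         # sample values, chosen for illustration
relations_hold = True
for r in range(1, m):
    steps = walk(r, m, p)
    on_node_1 = [p ** (r - 1) if j == 0 else 0 for j in range(m)]
    uniform = [p ** (r - 1) * (1 - p)] * m
    relations_hold = (relations_hold
                      and steps[r - 1] == on_node_1    # p^(r-1) e_1^T
                      and steps[r] == uniform          # p^(r-1) (1-p) 1^T
                      and steps[m] == steps[m - 1])    # e_r^T X^m = e_r^T X^(m-1)
```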

collecting all these relationships gives us

##\begin{bmatrix}
\mathbf e_1^T \\
\mathbf e_2^T \\
\vdots\\
\mathbf e_{m-1}^T \\
\mathbf 1^T
\end{bmatrix}\mathbf X^m = \begin{bmatrix}
\mathbf e_1^T \\
\mathbf e_2^T \\
\vdots\\
\mathbf e_{m-1}^T \\
\mathbf 1^T
\end{bmatrix}\mathbf X^{m-1}##

which proves the stated annihilating polynomial.

Combined with our knowledge of the trace (and Cayley-Hamilton), we know that the below is the characteristic polynomial of ##\mathbf X##

##p\big(\mathbf X\big) = \mathbf X^m - \mathbf X^{m-1} = \mathbf X^{m-1}\big(\mathbf X - \mathbf I\big) = \mathbf 0##

- - - -
A slight refinement: it is worth remarking here that there are some additional insights to be gained from the ##r = m-1## case. In particular we can see the imprint of the minimal polynomial as this ##r = m-1## case takes longest for the implicit walk on the graph to 'get to' the uniform state.

That is (and again, considering the picture of the graph is highly instructive here), if we consider the case of

##\mathbf e_{r}^T \mathbf X^{r-1} = (p)^{r-1} \mathbf e_{1}^T ##

and set ##r := m-1##, then we have

##\mathbf e_{m-1}^T \mathbf X^{m-2} = (p)^{m-2} \mathbf e_{1}^T \neq (p)^{m-2}(1-p) \mathbf 1^T = \mathbf e_{m-1}^T \mathbf X^{m-1}##

we have thus found
##\mathbf e_{m-1}^T \mathbf X^{m-2}\neq \mathbf e_{m-1}^T \mathbf X^{m-1}##

which means in general

## \mathbf X^{m-2} \neq \mathbf X^{m-1}##

(i.e. if the above were an equality it would have to hold for all vectors including ##\mathbf e_{m-1}^T##)

This also means that ## \mathbf X^{m-2}\big(\mathbf X - \mathbf I\big)= \mathbf X^{m-1} - \mathbf X^{m-2} \neq \mathbf 0## which rules out a minimal polynomial of degree ##m-1##, which also means that an even lower degree polynomial cannot annihilate ##\mathbf X##. This point alone confirms that the degree of the minimal polynomial must match that of the characteristic polynomial, which completes the problem.
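The minimality claim can also be probed numerically, as a sketch under stated assumptions: the code looks for the smallest ##k## with ##\mathbf X^{k+1} = \mathbf X^{k}##, which should be ##m-1## whenever ##p \neq 0## (the inequality above uses ##p^{m-2} \neq 0##). The helper names and the sampled values of ##m## and ##p## are made up for this illustration.

```python
from fractions import Fraction

def build_X(m, p):
    """Return the m x m matrix X from the post as a list of rows (m >= 2)."""
    X = [[Fraction(0)] * m for _ in range(m)]
    for j in range(m):
        X[0][j] = 1 - p          # first row: 1 - p in every column
    for i in range(1, m):
        X[i][i - 1] = p          # subdiagonal entries: p
    X[m - 1][m - 1] = p          # extra p in the bottom-right corner
    return X

def matmul(A, B):
    """Plain matrix product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def first_stable_power(m, p):
    """Smallest k with X^(k+1) == X^k, or None if none is found up to k = m."""
    X = build_X(m, p)
    power = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]  # X^0
    for k in range(m + 1):
        nxt = matmul(power, X)
        if nxt == power:
            return k
        power = nxt
    return None

# Sample grid of sizes and nonzero probabilities (assumed values).
results = {(m, p): first_stable_power(m, p)
           for m in range(2, 8)
           for p in (Fraction(1, 3), Fraction(1, 2), Fraction(1))}
```

If every entry of `results` equals ##m-1##, then ##\mathbf X^{m-1}(\mathbf X - \mathbf I)## is the lowest-degree polynomial of that shape annihilating ##\mathbf X## in each sampled case, in line with the argument above.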

fresh_42
Mentor
Solution to #6.

##6.## Consider ##\mathfrak{su}(3)=\operatorname{span}\{\,T_3,Y,T_{\pm},U_{\pm},V_{\pm}\,\}## given by the basis elements
\begin{align*} T_3&=\frac{1}{2}\lambda_3\; , \;Y=\frac{1}{\sqrt{3}}\lambda_8\; ,\\ T_{\pm}&=\frac{1}{2}(\lambda_1\pm i\lambda_2)\; , \;U_{\pm}=\frac{1}{2}(\lambda_6\pm i\lambda_7)\; , \;V_{\pm}=\frac{1}{2}(\lambda_4\pm i\lambda_5) \end{align*}

(cf. https://www.physicsforums.com/insights/representations-precision-important) where the ##\lambda_i## are the Gell-Mann matrices, and its maximal solvable Borel subalgebra ##\mathfrak{B}:=\langle T_3,Y,T_+,U_+,V_+ \rangle##.

Now ##\mathfrak{A(B)}=\{\,\alpha: \mathfrak{B} \to \mathfrak{B}\, : \,[X,\alpha(Y)]=[Y,\alpha(X)]\,\,\forall \,X,Y\in \mathfrak{B}\,\}## is the one-dimensional Lie algebra spanned by ##\operatorname{ad}(V_+)## because ##\mathbb{C}V_+## is a one-dimensional ideal in ##\mathfrak{B}## (Proof?). Then ##\mathfrak{g}:=\mathfrak{B}\ltimes \mathfrak{A(B)}## is again a Lie algebra by the multiplication ##[X,\alpha]=[\operatorname{ad}X,\alpha]## for all ##X\in \mathfrak{B}\; , \;\alpha \in \mathfrak{A(B)}##. (For a proof see problem 9 in https://www.physicsforums.com/threads/intermediate-math-challenge-may-2018.946386/ )

a) Determine the center of ##\mathfrak{g}## , and whether ##\mathfrak{g}## is semisimple, solvable, nilpotent or neither.
b) Show that ##(X,Y) \mapsto \alpha([X,Y])## defines another Lie algebra structure on ##\mathfrak{B}## , which one?
c) Show that ##\mathfrak{A(g)}## is at least two-dimensional.

We have the following multiplication table
\begin{align*}
[T_3,Y]&=[T_+,Y]=[T_+,V_+]=[U_+,V_+]=0\\
[T_3,T_+]&=T_+\; , \;[T_3,U_+]=-\frac{1}{2}U_+\; , \;[T_3,V_+]=\frac{1}{2}V_+ \\
[U_+,T_+]&=-V_+\; , \;[Y,U_+]=U_+\; , \;[Y,V_+]=V_+ \\
\end{align*}
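The table can be verified by direct matrix computation. The sketch below assumes the standard Gell-Mann matrices, for which ##T_3 = \operatorname{diag}(\tfrac12,-\tfrac12,0)##, ##Y = \operatorname{diag}(\tfrac13,\tfrac13,-\tfrac23)##, and ##T_+, U_+, V_+## are the matrix units ##E_{12}, E_{23}, E_{13}##. The helper names (`E`, `diag`, `bracket`, `scale`) are made up for this illustration.

```python
from fractions import Fraction

def E(i, j):
    """3x3 matrix unit with a 1 in row i, column j (counted from 1)."""
    return [[Fraction(int((r, c) == (i - 1, j - 1))) for c in range(3)]
            for r in range(3)]

def diag(*vals):
    """3x3 diagonal matrix with the given diagonal entries."""
    return [[Fraction(vals[r]) if r == c else Fraction(0) for c in range(3)]
            for r in range(3)]

def matmul(A, B):
    """Plain 3x3 matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def bracket(A, B):
    """Commutator [A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(3)] for i in range(3)]

def scale(s, A):
    """Scalar multiple s * A."""
    return [[s * A[i][j] for j in range(3)] for i in range(3)]

# Basis of the Borel subalgebra, assuming the standard Gell-Mann matrices.
T3 = diag(Fraction(1, 2), Fraction(-1, 2), 0)
Y = diag(Fraction(1, 3), Fraction(1, 3), Fraction(-2, 3))
Tp, Up, Vp = E(1, 2), E(2, 3), E(1, 3)
zero = diag(0, 0, 0)

table_ok = all([
    bracket(T3, Y) == zero, bracket(Tp, Y) == zero,
    bracket(Tp, Vp) == zero, bracket(Up, Vp) == zero,
    bracket(T3, Tp) == Tp,
    bracket(T3, Up) == scale(Fraction(-1, 2), Up),
    bracket(T3, Vp) == scale(Fraction(1, 2), Vp),
    bracket(Up, Tp) == scale(-1, Vp),       # [U_+, T_+] = -V_+
    bracket(Y, Up) == Up,
    bracket(Y, Vp) == Vp,
])
```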
With ##\mathfrak{A(B)}=\mathbb{C}\cdot \alpha\; , \;\alpha(Z)=\operatorname{ad}V_+(Z)=[V_+,Z]## we get
$$[X,\alpha]=X.\alpha=[\operatorname{ad}X,\operatorname{ad}V_+]=\operatorname{ad}[X,V_+] \sim \operatorname{ad}V_+ \sim \alpha$$
and ##\operatorname{span}\{\,T_+,U_+,V_+\,\} \subseteq \operatorname{ker}\alpha = \operatorname{ker}\operatorname{ad}V_+=\mathfrak{C}_\mathfrak{B}(V_+)##, so
\begin{align*}
\mathfrak{g}^{(0)}&=\mathfrak{g}=\mathfrak{B}\oplus \mathfrak{A(B)}\\
\mathfrak{g}^{(1)}&=[\mathfrak{g},\mathfrak{g}]=[\mathfrak{B},\mathfrak{B}]\oplus \mathfrak{A(B)}= \langle T_+,U_+,V_+\rangle \oplus \mathfrak{A(B)}\\
\mathfrak{g}^{(2)}&=[\mathfrak{g}^{(1)},\mathfrak{g}^{(1)}]=\mathbb{C}V_+ \oplus \{\,0\,\}\\
\mathfrak{g}^{(3)}&=[\mathfrak{g}^{(2)},\mathfrak{g}^{(2)}]= \{\,0\,\}
\end{align*}
Therefore ##\mathfrak{g}=\mathfrak{B}\ltimes \mathfrak{A(B)}## is solvable, and not semisimple. If we take a central element ##Z=aT_3+bY+cT_++dU_++eV_++f\alpha \in \mathfrak{Z(g)}## and solve successively
$$[Z,U_+]=0 \to [Z,V_+]=0 \to [Z,Y]=0$$
then we find that all coefficients have to be zero, i.e. ##\mathfrak{Z(g)}=\{\,0\,\}##, and ##\mathfrak{g}## cannot be nilpotent. It also shows that ##\alpha([X,Y])=0## is an Abelian structure on ##\mathfrak{B}##.
For a one-dimensional ideal ##\mathfrak{I}=\langle V_0 \rangle## of any Lie algebra ##\mathfrak{h}## we have ##[X,V_0]=\mu(X)V_0## for all ##X\in \mathfrak{h}## and some linear form ##\mu \in \mathfrak{h}^*##. With ##\alpha(X):=\operatorname{ad}(V_0)(X)=-\mu(X)V_0## we always get a non-trivial antisymmetric transformation of ##\mathfrak{h}##. Therefore ##\beta_1(B+b\alpha):=-\mu(X)V_+## defines a non-trivial antisymmetric transformation of ##\mathfrak{g}=\mathfrak{B}\ltimes \mathfrak{A(B)}##, since ##\mathfrak{I}=\mathbb{C}\cdot V_+ \triangleleft \mathfrak{g}## is a one-dimensional ideal. However, ##\mathbb{C}\cdot \alpha = \mathfrak{A(B)}## is also a one-dimensional ideal of ##\mathfrak{g}##, so ##\beta_2(B+b\alpha):=\mu(X)\alpha## is antisymmetric, too, and linearly independent of ##\beta_1##. Thus
$$\dim \mathfrak{A(g)}=\dim \mathfrak{A}(\mathfrak{B}\ltimes \mathfrak{A(B)}) \ge 2$$
