# Lorentz invariance of Klein-Gordon eqn & Maxwell Lagrangian

spaghetti3451

## Homework Statement

1. Show directly that if ##\varphi(x)## satisfies the Klein-Gordon equation, then ##\varphi(\Lambda^{-1}x)## also satisfies this equation for any Lorentz transformation ##\Lambda##.

2. Show that ##\mathcal{L}_{Maxwell}=-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}## is invariant under the Lorentz transformation ##x \rightarrow \Lambda x##.

## The Attempt at a Solution

The Klein-Gordon equation is ##\partial^{\mu}\partial_{\mu}\varphi(x) + m^{2}\varphi(x)=0##.

Under a Lorentz transformation ##x \rightarrow \Lambda x##, the Klein-Gordon equation becomes

##{(\Lambda^{-1})_{\rho}}^{\mu}\partial^{\rho}{(\Lambda^{-1})_{\sigma}}^{\mu}\partial^{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

##\implies {(\Lambda^{-1})_{\rho}}^{\mu}{(\Lambda^{-1})_{\sigma}}^{\mu}\partial^{\rho}\partial^{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

##\implies {(\Lambda^{-1})_{\rho}}^{\mu}{\Lambda^{\mu}}_{\sigma}\partial^{\rho}\partial^{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

Am I correct so far?

You've misplaced one of your ##\mu##'s. One must be upstairs, and the other downstairs, as in your original KG eqn.

spaghetti3451
Ok. Let me start again.

The Klein-Gordon equation is ##\partial^{\mu}\partial_{\mu}\varphi(x) + m^{2}\varphi(x)=0##.

Under a Lorentz transformation ##x \rightarrow \Lambda x##, the Klein-Gordon equation becomes

##{(\Lambda^{-1})_{\rho}}^{\mu}\partial^{\rho}{(\Lambda^{-1})^{\sigma}}_{\mu}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

##\implies {(\Lambda^{-1})_{\rho}}^{\mu}{(\Lambda^{-1})^{\sigma}}_{\mu}\partial^{\rho}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

##\implies {(\Lambda^{-1})_{\rho}}^{\mu}{(\Lambda)_{\mu}}^{\sigma}\partial^{\rho}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

##\implies {\eta_{\rho}}^{\sigma}\partial^{\rho}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

##\implies \partial^{\rho}\partial_{\rho}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

Doesn't this mean that the Klein-Gordon equation is Lorentz invariant?

It looks ok to me.
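For anyone who wants a sanity check on top of the index manipulation, here is a minimal numerical sketch. It works in 1+1 dimensions with signature (+,-), builds a solution out of two on-shell plane waves, and estimates the Klein-Gordon operator on ##\varphi(\Lambda^{-1}x)## with central differences. The function names, the mass, the wave numbers and the boost velocity are all illustrative assumptions, not anything from the problem statement.

```python
import math

M = 1.0  # mass in units with hbar = c = 1 (illustrative value)

def phi(t, x):
    """Sum of two on-shell plane waves: a solution of the 1+1 D Klein-Gordon equation."""
    total = 0.0
    for k, amp in ((0.5, 1.0), (-1.2, 0.4)):
        energy = math.sqrt(k * k + M * M)  # on-shell condition E^2 = k^2 + m^2
        total += amp * math.cos(energy * t - k * x)
    return total

def phi_boosted(t, x, v=0.6):
    """phi evaluated at Lambda^{-1} x, for a boost of velocity v along x."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return phi(g * (t + v * x), g * (x + v * t))

def kg_residual(f, t, x, h=1e-3):
    """Central-difference estimate of (d_t^2 - d_x^2 + m^2) f at the point (t, x)."""
    d2t = (f(t + h, x) - 2.0 * f(t, x) + f(t - h, x)) / (h * h)
    d2x = (f(t, x + h) - 2.0 * f(t, x) + f(t, x - h)) / (h * h)
    return d2t - d2x + M * M * f(t, x)

# Both residuals should vanish up to finite-difference error.
print(kg_residual(phi, 0.3, -0.7))
print(kg_residual(phi_boosted, 0.3, -0.7))
```

This is of course not a proof, only a check that the algebra above does what it claims for one particular solution and one particular boost.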

spaghetti3451
In retrospect, I think I should put brackets around ##\partial^{\rho}\partial_{\sigma}\phi## starting from the fourth line of my previous post.

That's because we have ##(\partial^{\rho}\partial_{\sigma}\phi)(\Lambda^{-1}x)##, i.e. ##\rho## and ##\sigma## ought to be indices of the argument ##\Lambda^{-1}x##, not of the argument ##x##.

Indeed. I wasn't sure how pedantic you wanted to get. One could rephrase the problem as: show that the equation $$\eta^{\mu\nu} \frac{\partial^2}{\partial x^\mu \, \partial x^\nu} \, \phi(x) ~+~ m^2 \phi(x) ~=~ 0$$remains form-invariant under the change of coordinates ##x^\mu \to y^\mu = \Lambda^\mu_{~\sigma} x^\sigma## ...

Strictly speaking, you'd also have to show that the domain of ##y## for which the equation makes sense coincides with the original domain of ##x##.

spaghetti3451
Strictly speaking, you'd also have to show that the domain of ##y## for which the equation makes sense coincides with the original domain of ##x##.

How'd you go about proving that the domains of ##x## and ##y## coincide?

Well, first state the domain of ##x##. All of Minkowski space, right? Then prove that the coordinate mappings specified by any Lorentz transformation are bijective. (This is actually rather easy -- don't over-think it. If you need a simpler warmup, consider ordinary 3-space rotations first, then generalize to Lorentz.)

spaghetti3451
Another way to make the original problem notationally simpler is to note that ##m^2 \phi## is already a Lorentz scalar. Then re-express the 1st term like a linear algebra equation. I.e., think of ##\partial## as a 4-component column vector with components ##\partial^\mu##. The wave operator (the Minkowski-space analogue of the Laplacian) can then be expressed as $$\Delta ~=~ \partial^T \eta \partial ~,$$where ##\eta## is the 4x4 Minkowski metric matrix. Also note that since ##\Lambda## is independent of ##x##, it can pass through the derivative operators if necessary. In the transformed coordinate system, the new operator, i.e., ##\Delta'##, is $$\Delta' ~=~ \partial'^T \eta' \partial' ~=~ (\Lambda\partial)^T \eta (\Lambda\partial) ~=~ \partial^T (\Lambda^T \eta \Lambda) \partial ~=~ \partial^T \eta \partial ~=~ \Delta~,$$since ##\eta## is invariant under Lorentz transformations (##\eta' = \eta##), and ##\Lambda^T \eta \Lambda = \eta## is the defining property of a Lorentz matrix. (For pure rotations this reduces to the familiar ##\Lambda^T = \Lambda^{-1}##, but that relation does not hold for boosts.)

[Edit: I just realized I was using the wrong symbol for the Laplacian. Corrected now.]
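A quick way to get a feel for the matrix version is to check it numerically. The sketch below is only an illustration under my own assumptions: signature (+,-,-,-), a boost along x composed with a rotation about z, and helper names (`matmul`, `boost_x`, `rotation_z`, ...) that are made up for this example. It prints how far ##\Lambda^T \eta \Lambda## is from ##\eta##, and for comparison how far ##\Lambda^T \Lambda## is from the identity.

```python
import math

# Minkowski metric, signature (+,-,-,-), matching the KG equation as written in this thread.
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def boost_x(v):
    """Boost along x with velocity v (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0.0, 0.0],
            [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def rotation_z(theta):
    """Spatial rotation about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, -s, 0.0],
            [0.0, s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

lam = matmul(rotation_z(0.4), boost_x(0.6))
metric_defect = max_abs_diff(matmul(transpose(lam), matmul(ETA, lam)), ETA)
orthogonality_defect = max_abs_diff(matmul(transpose(lam), lam), IDENTITY)

print(metric_defect)         # tiny: Lambda^T eta Lambda reproduces eta
print(orthogonality_defect)  # of order one: a boost is not an orthogonal matrix
```

So, at least for this example, the relation that leaves ##\partial^T \eta \partial## unchanged is ##\Lambda^T \eta \Lambda = \eta##; plain orthogonality only holds for the rotation part.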

spaghetti3451
Well, first state the domain of ##x##. All of Minkowski space, right? Then prove that the coordinate mappings specified by any Lorentz transformation are bijective. (This is actually rather easy -- don't over-think it. If you need a simpler warmup, consider ordinary 3-space rotations first, then generalize to Lorentz.)
To show that the coordinate transformation specified by some Lorentz transformation is bijective, we need to prove that the Lorentz transformation is both injective and surjective.

An injective transformation maps distinct elements of its domain to distinct elements of its codomain. Given two distinct spacetime points ##x_{1}## and ##x_{2}## in our domain, a particular Lorentz transformation ##\Lambda## maps them to different spacetime points ##y_{1}## and ##y_{2}## in our codomain.

A surjective transformation is such that every element of the codomain is mapped to from at least one element of the domain. Given a spacetime point ##y## in our codomain, a particular Lorentz transformation ##\Lambda## maps at least one spacetime point ##x## in our domain to the spacetime point ##y## in our codomain.

Not really sure if this constitutes a proof, though.

Another way to make the original problem notationally simpler is to note that ##m^2 \phi## is already a Lorentz scalar. Then re-express the 1st term like a linear algebra equation. I.e., think of ##\partial## as a 4-component column vector with components ##\partial^\mu##. The wave operator (the Minkowski-space analogue of the Laplacian) can then be expressed as $$\Delta ~=~ \partial^T \eta \partial ~,$$where ##\eta## is the 4x4 Minkowski metric matrix. Also note that since ##\Lambda## is independent of ##x##, it can pass through the derivative operators if necessary. In the transformed coordinate system, the new operator, i.e., ##\Delta'##, is $$\Delta' ~=~ \partial'^T \eta' \partial' ~=~ (\Lambda\partial)^T \eta (\Lambda\partial) ~=~ \partial^T (\Lambda^T \eta \Lambda) \partial ~=~ \partial^T \eta \partial ~=~ \Delta~,$$since ##\eta## is invariant under Lorentz transformations (##\eta' = \eta##), and ##\Lambda^T \eta \Lambda = \eta## is the defining property of a Lorentz matrix. (For pure rotations this reduces to the familiar ##\Lambda^T = \Lambda^{-1}##, but that relation does not hold for boosts.)

This is really nice!

To show that the coordinate transformation specified by some Lorentz transformation is bijective, we need to prove that the Lorentz transformation is both injective and surjective. [...]
You can probably take a shortcut by appealing to the fact that Lorentz transformations form a group. Then, having shown that any point in Minkowski space is mapped by an arbitrary Lorentz transformation into another point in Minkowski space, you're done -- because every element of a group has an inverse, and that's only possible here if the transformations are indeed bijective. (If it failed the injectivity requirement, a well-defined inverse transformation wouldn't exist.)
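If it helps to see the inverse concretely: from ##\Lambda^T \eta \Lambda = \eta## and ##\eta^2 = 1## one gets ##\Lambda^{-1} = \eta \Lambda^T \eta##, so the inverse always exists and is written down explicitly. The sketch below (helper names and the sample point are my own illustrative choices, signature (+,-,-,-) assumed) maps a point with a boost and then maps it back.

```python
import math

# Minkowski metric, signature (+,-,-,-); note that ETA is its own inverse.
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def apply(lam, x):
    """Act with a 4x4 matrix on a 4-component point."""
    return [sum(lam[m][n] * x[n] for n in range(4)) for m in range(4)]

def boost_x(v):
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0.0, 0.0],
            [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

lam = boost_x(0.8)
lam_inv = matmul(ETA, matmul(transpose(lam), ETA))  # Lambda^{-1} = eta Lambda^T eta

x = [1.0, 2.0, -3.0, 0.5]   # an arbitrary point of Minkowski space
y = apply(lam, x)           # image under the Lorentz transformation
x_back = apply(lam_inv, y)  # the explicit inverse recovers the original point

print(x_back)
```

An explicit two-sided inverse defined on all of Minkowski space is exactly what injectivity plus surjectivity amounts to, so this is the group argument in matrix form.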

This is really nice!
Yes -- it's good to become "bilingual". I.e., able to work with both index-free and index-full notations.

spaghetti3451
Let me try question ##2##.

First, let me do it the hard way:

##\mathcal{L}_{\text{Maxwell}}=-\frac{1}{4}F_{\mu\nu}F^{\mu\nu}=-\frac{1}{4}(\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu})(\partial^{\mu}A^{\nu}-\partial^{\nu}A^{\mu})##

##\rightarrow -\frac{1}{4}[{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)_{\nu}}^{\sigma}(\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)-{(\Lambda^{-1})^{\rho}}_{\nu}{(\Lambda)_{\mu}}^{\sigma}(\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)][{(\Lambda^{-1})_{\alpha}}^{\mu}{(\Lambda)^{\nu}}_{\beta}(\partial^{\alpha}A^{\beta})(\Lambda^{-1}x)-{(\Lambda^{-1})_{\alpha}}^{\nu}{(\Lambda)^{\mu}}_{\beta}(\partial^{\alpha}A^{\beta})(\Lambda^{-1}x)]##

##= -\frac{1}{4}[{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)_{\nu}}^{\sigma}{(\Lambda^{-1})_{\alpha}}^{\mu}{(\Lambda)^{\nu}}_{\beta}-{(\Lambda^{-1})^{\rho}}_{\nu}{(\Lambda)_{\mu}}^{\sigma}{(\Lambda^{-1})_{\alpha}}^{\mu}{(\Lambda)^{\nu}}_{\beta}-{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)_{\nu}}^{\sigma}{(\Lambda^{-1})_{\alpha}}^{\nu}{(\Lambda)^{\mu}}_{\beta}+{(\Lambda^{-1})^{\rho}}_{\nu}{(\Lambda)_{\mu}}^{\sigma}{(\Lambda^{-1})_{\alpha}}^{\nu}{(\Lambda)^{\mu}}_{\beta}](\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)(\partial^{\alpha}A^{\beta})(\Lambda^{-1}x)##

##= -\frac{1}{4}[{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)^{\mu}}_{\alpha}{(\Lambda^{-1})^{\sigma}}_{\nu}{(\Lambda)^{\nu}}_{\beta}-{(\Lambda^{-1})^{\rho}}_{\nu}{(\Lambda)^{\nu}}_{\beta}{(\Lambda^{-1})_{\alpha}}^{\mu}{(\Lambda)_{\mu}}^{\sigma}-{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)^{\mu}}_{\beta}{(\Lambda^{-1})_{\alpha}}^{\nu}{(\Lambda)_{\nu}}^{\sigma}+{(\Lambda^{-1})^{\rho}}_{\nu}{(\Lambda)^{\nu}}_{\alpha}{(\Lambda^{-1})^{\sigma}}_{\mu}{(\Lambda)^{\mu}}_{\beta}](\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)(\partial^{\alpha}A^{\beta})(\Lambda^{-1}x)##

##= -\frac{1}{4}[{\eta^{\rho}}_{\alpha}{\eta^{\sigma}}_{\beta}-{\eta^{\rho}}_{\beta}{\eta_{\alpha}}^{\sigma}-{\eta^{\rho}}_{\beta}{\eta_{\alpha}}^{\sigma}+{\eta^{\rho}}_{\alpha}{\eta^{\sigma}}_{\beta}](\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)(\partial^{\alpha}A^{\beta})(\Lambda^{-1}x)##

##=-\frac{1}{4}[(\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)(\partial^{\rho}A^{\sigma})(\Lambda^{-1}x)-(\partial_{\rho}A_{\alpha})(\Lambda^{-1}x)(\partial^{\alpha}A^{\rho})(\Lambda^{-1}x)-(\partial_{\rho}A_{\alpha})(\Lambda^{-1}x)(\partial^{\alpha}A^{\rho})(\Lambda^{-1}x)+(\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)(\partial^{\rho}A^{\sigma})(\Lambda^{-1}x)]##

##=-\frac{1}{4}[(\partial_{\rho}A_{\sigma})(\Lambda^{-1}x)-(\partial_{\sigma}A_{\rho})(\Lambda^{-1}x)][(\partial^{\rho}A^{\sigma})(\Lambda^{-1}x)-(\partial^{\sigma}A^{\rho})(\Lambda^{-1}x)]##

##=-\frac{1}{4}F_{\rho\sigma}(\Lambda^{-1}x)F^{\rho\sigma}(\Lambda^{-1}x)##.

Now, the easy way:

##\mathcal{L}_{\text{Maxwell}} = -\frac{1}{4}F_{\mu\nu}(x)F^{\mu\nu}(x)##

##\rightarrow -\frac{1}{4}{(\Lambda)_{\mu}}^{\rho}{(\Lambda)_{\nu}}^{\sigma}{(\Lambda)^{\mu}}_{\alpha}{(\Lambda)^{\nu}}_{\beta}F_{\rho\sigma}(\Lambda^{-1}x)F^{\alpha\beta}(\Lambda^{-1}x)##

##= -\frac{1}{4}{(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)^{\mu}}_{\alpha}{(\Lambda^{-1})^{\sigma}}_{\nu}{(\Lambda)^{\nu}}_{\beta}F_{\rho\sigma}(\Lambda^{-1}x)F^{\alpha\beta}(\Lambda^{-1}x)##

##= -\frac{1}{4} {\eta^{\rho}}_{\alpha} {\eta^{\sigma}}_{\beta} F_{\rho\sigma}(\Lambda^{-1}x)F^{\alpha\beta}(\Lambda^{-1}x)##

##= -\frac{1}{4} F_{\rho\sigma}(\Lambda^{-1}x)F^{\rho\sigma}(\Lambda^{-1}x)##.

What do you think?

Well, I guess it's ok to thrash yourself in this masochistic fashion once in a while, if you feel that your sins warrant it.

Personally, I would have simply shown that ##F_{\mu\nu}## does indeed transform as a 2nd-rank tensor under Lorentz transformations (for which a cut-down edit of your first bit would be enough). Then show that any contraction between upper and lower indices of a vector results in a scalar. Then generalize this to a double contraction over the 2 indices in a 2nd-rank tensor (which is also a scalar, hence Lorentz-invariant).

I.e., build up a little toolkit for yourself so that you can eventually just look at a tensorial expression involving contractions and understand what rank tensor the overall expression represents.
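As a small addition to that toolkit, here is a numerical sketch of the double contraction at a single point. It is only an illustration under stated assumptions: signature (+,-,-,-), one common sign convention for packing ##E## and ##B## into ##F^{\mu\nu}##, and made-up helper names and field values. It ignores the shift of the argument ##x \to \Lambda^{-1}x## and checks only that the index structure contracts to a scalar.

```python
import math

ETA_DIAG = [1.0, -1.0, -1.0, -1.0]  # diagonal of the metric, signature (+,-,-,-)

def boost_x(v):
    """Boost along x with velocity v (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    return [[g, -g * v, 0.0, 0.0],
            [-g * v, g, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

def field_tensor(E, B):
    """Contravariant F^{mu nu} from E and B (one common sign convention, assumed here)."""
    Ex, Ey, Ez = E
    Bx, By, Bz = B
    return [[0.0, -Ex, -Ey, -Ez],
            [Ex, 0.0, -Bz, By],
            [Ey, Bz, 0.0, -Bx],
            [Ez, -By, Bx, 0.0]]

def transform(lam, F):
    """F'^{mu nu} = Lambda^mu_alpha Lambda^nu_beta F^{alpha beta}."""
    return [[sum(lam[m][a] * lam[n][b] * F[a][b]
                 for a in range(4) for b in range(4))
             for n in range(4)] for m in range(4)]

def invariant(F):
    """F_{mu nu} F^{mu nu}; with a diagonal metric, lowering indices only flips signs."""
    return sum(ETA_DIAG[m] * ETA_DIAG[n] * F[m][n] * F[m][n]
               for m in range(4) for n in range(4))

F = field_tensor((1.0, 2.0, 0.0), (0.0, 0.5, -1.0))
F_prime = transform(boost_x(0.6), F)

print(invariant(F))        # equals 2 (B^2 - E^2) for these fields
print(invariant(F_prime))  # same value, up to rounding
```

The individual components of ##F## change under the boost (electric and magnetic parts mix), but the fully contracted combination does not.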

spaghetti3451
I did not notice that ##F_{\mu\nu}F^{\mu\nu}## involved a double contraction. Obviously, a (double) contraction results in a scalar.

But, let me try to explicitly demonstrate the Lorentz invariance of ##\mathcal{L}_{Maxwell}## using your guidelines anyway.

Firstly, I need to show that ##F_{\mu\nu}## transforms as a ##(0,2)## tensor under Lorentz transformations. To demonstrate this fact, I will use the fact that ##F_{\mu\nu}=\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}## and that ##A_{\mu}## is a vector.

##F_{\mu\nu}(x) = (\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu})(x)##
##\rightarrow {(\Lambda^{-1})^{\rho}}_{\mu}{(\Lambda)_{\nu}}^{\sigma}(\partial_{\rho}A_{\sigma}-\partial_{\sigma}A_{\rho})(\Lambda^{-1}x)##
##= {(\Lambda)_{\mu}}^{\rho}{(\Lambda)_{\nu}}^{\sigma}(\partial_{\rho}A_{\sigma}-\partial_{\sigma}A_{\rho})(\Lambda^{-1}x)##
##= {(\Lambda)_{\mu}}^{\rho}{(\Lambda)_{\nu}}^{\sigma}F_{\rho\sigma}(\Lambda^{-1}x)##

so that ##F_{\mu\nu}## transforms as a ##(0,2)## tensor under Lorentz transformations.

Then I need to show that any contraction between the upper index of one vector and the lower index of another vector results in a scalar. And finally, I need to generalize this to a double contraction over the 2 indices in a 2nd-rank tensor.

To show that the contractions lead to a scalar, don't I have to Lorentz transform both ##F^{\mu\nu}(x)## and ##F_{\mu\nu}(x)## and show that the resulting Lorentz transformation tensors cancel out among themselves to give us ##F^{\mu\nu}(\Lambda^{-1}x)## and ##F_{\mu\nu}(\Lambda^{-1}x)##? That's exactly what I've done in the second proof of my previous post without really knowing that I was Lorentz transforming the scalar ##F^{\mu\nu}F_{\mu\nu}##.

So, I guess I can use your guidelines as an informal way to convince myself of the Lorentz invariance of ##\mathcal{L}_{Maxwell}##, and use my second proof of the previous post as a formal way to prove the Lorentz invariance of ##\mathcal{L}_{Maxwell}##.

Firstly, I need to show that ##F_{\mu\nu}## transforms as a ##(0,2)## tensor under Lorentz transformations. To demonstrate this fact, I will use the fact that ##F_{\mu\nu}=\partial_{\mu}A_{\nu}-\partial_{\nu}A_{\mu}## and that ##A_{\mu}## is a vector. [...]
You could shorten this further by just considering ##\partial_{\mu}A_{\nu}## first. Then appeal to the fact that the sum of 2 tensors of the same rank (i.e., with the same sets of free indices) is also a tensor of that rank.

Then I need to show that any contraction between the upper index of one vector and the lower index of another vector results in a scalar. And finally, I need to generalize this to a double contraction over the 2 indices in a 2nd-rank tensor.

To show that the contractions lead to a scalar, don't I have to Lorentz transform both ##F^{\mu\nu}(x)## and ##F_{\mu\nu}(x)## and show that the resulting Lorentz transformation tensors cancel out among themselves to give us ##F^{\mu\nu}(\Lambda^{-1}x)## and ##F_{\mu\nu}(\Lambda^{-1}x)##? That's exactly what I've done in the second proof of my previous post without really knowing that I was Lorentz transforming the scalar ##F^{\mu\nu}F_{\mu\nu}##.

You could shorten this by simply noting that a contraction like ##v^\alpha v_\alpha## can be written similarly to what I did in post #9, i.e., ##v^T \eta v##. For the double contraction on a 2-tensor, you can write it like this: $$F^{\mu\nu} F_{\mu\nu} ~=~ Tr ( F^T \eta F \eta ) ~.$$Then replace all quantities by their dashed versions, substitute in the Lorentz matrices, and perform manipulations that generalize what I did in post #9. But, tbh, this is probably overkill, because... (see below)...

So, I guess I can use your guidelines as an informal way to convince myself of the Lorentz invariance of ##\mathcal{L}_{Maxwell}##,
Well, it's more than just "informal". When one writes ##v^\alpha w_\alpha##, that's an inner product on vectors in Minkowski space. Lorentz transformations are designed to preserve all such inner products. Indeed, that's one way of defining the Lorentz group. So ##v^\alpha w_\alpha## is automatically a Lorentz scalar by definition.

spaghetti3451
You could shorten this by simply noting that a contraction like ##v^\alpha v_\alpha## can be written similarly to what I did in post #9, i.e., ##v^T \eta v##. For the double contraction on a 2-tensor, you can write it like this: $$F^{\mu\nu} F_{\mu\nu} ~=~ Tr ( F^T \eta F \eta ) ~.$$Then replace all quantities by their dashed versions, substitute in the Lorentz matrices, and perform manipulations that generalize what I did in post #9. But, tbh, this is probably overkill, because... (see below)...

Can you please explain why ## F^{\mu\nu} F_{\mu\nu} ~=~ Tr ( F^T \eta F \eta ) ~##? I have never seen a double contraction written in index-free notation before, hence the query.

Well, it's more than just "informal". When one writes ##v^\alpha w_\alpha##, that's an inner product on vectors in Minkowski space. Lorentz transformations are designed to preserve all such inner products. Indeed, that's one way of defining the Lorentz group. So ##v^\alpha w_\alpha## is automatically a Lorentz scalar by definition.

Weren't Lorentz transformations originally designed as a consequence of the postulates of special relativity? I always thought that an inner product is a Lorentz scalar due to the Lorentz transformation rules of a ##4##-vector, not the other way round.

Can you please explain why ## F^{\mu\nu} F_{\mu\nu} ~=~ Tr ( F^T \eta F \eta ) ~##? I have never seen a double contraction written in index-free notation before, hence the query.
$$F^{\mu\nu} F_{\mu\nu} = F^{\mu\nu} F^{\alpha\beta} \eta_{\mu\alpha} \eta_{\nu\beta} = F^{\mu\nu} \eta_{\mu\alpha} F^{\alpha\beta} \eta_{\nu\beta} = (F^T)^{\nu\mu} \eta_{\mu\alpha} F^{\alpha\beta} \eta_{\beta\nu}~,$$where, in the last step I've used the definition of "transpose" and the symmetry of the metric ##\eta##. The "Tr" operation is just there to perform the same task in the index-free version as contracting over the last index ##\nu##.

If you're still having trouble with this, try writing out the ordinary product of two 3x3 matrices ##A,B## and the resultant matrix ##C## (i.e., ##AB=C##) in component notation. And do you understand what ##Tr(C)## means in that context?
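For what it's worth, the component bookkeeping can also be checked by brute force. The sketch below compares the four-index sum ##F^{\mu\nu} F^{\alpha\beta} \eta_{\mu\alpha} \eta_{\nu\beta}## with ##Tr(F^T \eta F \eta)## for an arbitrary antisymmetric matrix. The numbers in `F` and the helper names are purely illustrative, and signature (+,-,-,-) is an assumption.

```python
# Minkowski metric, signature (+,-,-,-).
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, -1.0, 0.0, 0.0],
       [0.0, 0.0, -1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]

def matmul(a, b):
    """(AB)^i_j = sum_k A^i_k B^k_j, i.e. contraction over the adjacent index."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def trace(a):
    """Contract the first index with the last one."""
    return sum(a[i][i] for i in range(4))

# An arbitrary antisymmetric matrix standing in for F^{mu nu}.
F = [[0.0, 1.0, -2.0, 0.5],
     [-1.0, 0.0, 3.0, -1.5],
     [2.0, -3.0, 0.0, 4.0],
     [-0.5, 1.5, -4.0, 0.0]]

# Index form: F^{mu nu} F^{alpha beta} eta_{mu alpha} eta_{nu beta}.
index_form = sum(F[m][n] * F[a][b] * ETA[m][a] * ETA[n][b]
                 for m in range(4) for n in range(4)
                 for a in range(4) for b in range(4))

# Index-free form: Tr(F^T eta F eta).
trace_form = trace(matmul(matmul(transpose(F), ETA), matmul(F, ETA)))

print(index_form, trace_form)
```

Each matrix product contracts one pair of adjacent indices, and the trace closes the chain by contracting the first index with the last, which is the role of ##\nu## in the component expression.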

Weren't Lorentz transformations originally designed as a consequence of the postulates of special relativity?
Lorentz transformations were discovered to be applicable in that context (actually they were discovered earlier than Einstein, but let's not get into that).

I always thought that an inner product is a Lorentz scalar due to the Lorentz transformation rules of a ##4##-vector, not the other way round.
The logic is bidirectional. The modern definition of the Lorentz group is simply the group of matrices that preserve an indefinite inner product of the form ##ds^2 = -dt^2 + dx^2 + dy^2 + dz^2##. Hence the group is often written as ##O(1,3)## (or ##SO(1,3)## for its proper subgroup). One can represent elements of the (abstract) Lorentz group as coordinate transformations in Minkowski space but there are also other possible representations. (A "representation" of a group is a mapping of the elements of the abstract group to operators on some vector space.)
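Read as a definition, this gives a simple membership test: a matrix belongs to the group if and only if it leaves the metric unchanged. Here is a minimal sketch of that test, using the mostly-plus signature of the line element above. The function name `is_lorentz`, the tolerance and the sample matrices are assumptions made for illustration.

```python
import math

# Metric for ds^2 = -dt^2 + dx^2 + dy^2 + dz^2.
ETA = [[-1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(a):
    return [[a[j][i] for j in range(4)] for i in range(4)]

def boost(v, axis):
    """Boost with velocity v along spatial axis 1, 2 or 3 (units with c = 1)."""
    g = 1.0 / math.sqrt(1.0 - v * v)
    lam = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    lam[0][0] = lam[axis][axis] = g
    lam[0][axis] = lam[axis][0] = -g * v
    return lam

def is_lorentz(lam, tol=1e-9):
    """True if Lambda^T eta Lambda equals eta within the given tolerance."""
    lhs = matmul(transpose(lam), matmul(ETA, lam))
    return all(abs(lhs[i][j] - ETA[i][j]) <= tol
               for i in range(4) for j in range(4))

bx, by = boost(0.5, 1), boost(0.3, 2)
dilation = [[2.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

print(is_lorentz(bx))              # a single boost passes
print(is_lorentz(matmul(bx, by)))  # so does a product of two boosts (closure)
print(is_lorentz(dilation))        # a uniform rescaling does not preserve the metric
```

The test does not care about the overall sign of the metric, so the same function works unchanged with the (+,-,-,-) convention used earlier in the thread.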

spaghetti3451
Thanks!

Old Monkey
Ok. Let me start again.

The Klein-Gordon equation is ##\partial^{\mu}\partial_{\mu}\varphi(x) + m^{2}\varphi(x)=0##.

Under a Lorentz transformation ##x \rightarrow \Lambda x##, the Klein-Gordon equation becomes

##{(\Lambda^{-1})_{\rho}}^{\mu}\partial^{\rho}{(\Lambda^{-1})^{\sigma}}_{\mu}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

##\implies {(\Lambda^{-1})_{\rho}}^{\mu}{(\Lambda^{-1})^{\sigma}}_{\mu}\partial^{\rho}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

##\implies {(\Lambda^{-1})_{\rho}}^{\mu}{(\Lambda)_{\mu}}^{\sigma}\partial^{\rho}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

##\implies {\eta_{\rho}}^{\sigma}\partial^{\rho}\partial_{\sigma}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

##\implies \partial^{\rho}\partial_{\rho}\varphi(\Lambda^{-1}x)+m^{2}\varphi(\Lambda^{-1}x)=0##

Doesn't this mean that the Klein-Gordon equation is Lorentz invariant?
Write the Lorentz transformation as
\begin{equation*}
x ^{\prime\mu} = \Lambda^{\mu}{}_{\nu} x^{\nu} .
\end{equation*}
Then
\begin{align*}
\eta_{\mu\nu} \Lambda^{\mu}{}_{\rho} \Lambda^{\nu}{}_{\sigma} &= \eta_{\rho\sigma}, \\
\eta^{\mu\nu} \Lambda^{\rho}{}_{\mu} \Lambda^{\sigma}{}_{\nu} &= \eta^{\rho\sigma}, \\
\Lambda_{\rho}{}^{\mu} \Lambda^{\sigma}{}_{\mu} &= \delta ^{\sigma}_{\rho} .
\end{align*}
Therefore
\begin{align*}
\hat{p}^{\mu}\hat{p}_{\mu}\psi(x) &= - \hbar ^{2} \frac{\partial }{\partial x_{\mu} } \frac{\partial }{\partial x^{\mu} } \psi(x) \\
&= - \hbar^{2} \left( \frac{\partial x ^{\prime} _{\rho} }{\partial x_{\mu} } \frac{\partial }{\partial x ^{\prime} _{\rho} } \right) \left( \frac{\partial x ^{\prime\sigma} }{\partial x^{\mu} } \frac{\partial }{\partial x ^{\prime\sigma} } \right) \psi ^{\prime} (x ^{\prime} ) \\
&= - \hbar ^{2} \Lambda_{\rho} {}^{\mu} \Lambda ^{\sigma}{}_{\mu}\frac{\partial }{\partial x ^{\prime} _{\rho} } \frac{\partial }{\partial x ^{\prime\sigma} } \psi ^{\prime} (x ^{\prime} )\\
&= - \hbar ^{2} \delta^{\sigma} _{\rho} \frac{\partial }{\partial x ^{\prime} _{\rho} } \frac{\partial }{\partial x ^{\prime\sigma} } \psi ^{\prime} (x ^{\prime} )\\
&= - \hbar^{2} \frac{\partial }{\partial x ^{\prime} _{\mu} } \frac{\partial }{\partial x ^{\prime \mu} } \psi ^{\prime} (x ^{\prime} )\\
&= \hat{p}^{\prime \mu}\hat{p}^{\prime} _{\mu}\psi ^{\prime} (x ^{\prime} ).
\end{align*}
Thus the Klein-Gordon equation is Lorentz invariant.

Therefore
\begin{align*}
\hat{p}^{\mu}\hat{p}_{\mu}\psi(x) &= - \hbar ^{2} \frac{\partial }{\partial x_{\mu} } \frac{\partial }{\partial x^{\mu} } \psi(x) \\
&= - \hbar^{2} \left( \frac{\partial x ^{\prime} _{\rho} }{\partial x_{\mu} } \frac{\partial }{\partial x ^{\prime} _{\rho} } \right) \left( \frac{\partial x ^{\prime\sigma} }{\partial x^{\mu} } \frac{\partial }{\partial x ^{\prime\sigma} } \right) \psi ^{\prime} (x ^{\prime} ) \\
&= - \hbar ^{2} \Lambda_{\rho} {}^{\mu} \Lambda ^{\sigma}{}_{\mu}\frac{\partial }{\partial x ^{\prime} _{\rho} } \frac{\partial }{\partial x ^{\prime\sigma} } \psi ^{\prime} (x ^{\prime} )\\
\mbox{[...]}
\end{align*}
I don't think this (necro)post is helpful.

In the 2nd line, it's misleading to place the parentheses like this, since the first derivative acts on everything to its right. But you haven't explained how/why the 2nd Lorentz factor can pass through the 1st derivative.

If one is going to approach the problem this way, then it's probably safer to use the coordinate representation of the Lorentz generators (## x_\mu \partial_\nu - x_\nu \partial_\mu##) and show that it commutes with the Casimir operator ##\eta^{\alpha\beta} \partial_\alpha \partial_\beta##.
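In case it is useful, the commutator is short enough to sketch. Write ##\Box = \eta^{\alpha\beta} \partial_\alpha \partial_\beta## and let ##f## be an arbitrary smooth test function (both are just notation for this sketch). Using ##\partial_\alpha x_\mu = \eta_{\alpha\mu}##, $$\Box (x_\mu f) ~=~ x_\mu \, \Box f ~+~ 2\, \partial_\mu f \quad\Longrightarrow\quad [\Box, x_\mu] ~=~ 2\, \partial_\mu ~.$$ Since ##\Box## commutes with each ##\partial_\nu##, it follows that $$[\Box ,\, x_\mu \partial_\nu - x_\nu \partial_\mu] ~=~ [\Box, x_\mu]\, \partial_\nu ~-~ [\Box, x_\nu]\, \partial_\mu ~=~ 2\, (\partial_\mu \partial_\nu - \partial_\nu \partial_\mu) ~=~ 0 ~,$$ because partial derivatives commute. This is the infinitesimal version of the statement, so on its own it covers only transformations continuously connected to the identity.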