eoghan said:
			
		
	
	
		
		
			But I could easily come up with a Lorentz invariant equation that is first order, e.g.
$$
(M^\mu\partial_\mu + m^2)\phi=0
$$
where M is a generic matrix.
Now, something should be wrong with this equation
		
		
	 
How about, almost everything is wrong with that equation: 
1) If, as you say, M^{\mu} is not \partial^{\mu} but a “generic matrix”, then the expression M^{\mu}\partial_{\mu} + m^{2} is 
meaningless because each term has different physical unit. In the natural units, the dimension of the first term is \mbox{cm}^{-1} while the dimension of m^{2} is \mbox{cm}^{-2}. 
2) You said that \phi is a scalar field. In 4-dimensional spacetime, a (real) scalar field can be described 
either by a single function 
or (equivalently) by a 5-component field treating (\phi , \partial_{\mu}\phi ) as independent variables. In the first case (i.e., when \phi is a 1-component field), your “matrices” M^{\mu} must be 1 \times 1 matrices. Thus, you must take M^{\mu} = \partial^{\mu} so that the correct dispersion relation E^{2} = P^{2} + m^{2} holds. In the second (5-component) case, the correct first-order equation (for a real scalar field) looks exactly like Dirac equation \left( i \Gamma^{\mu} \partial_{\mu} - m \right) \Psi (x) = 0 , \  \  \  \  \  \  \  \  \  \  (1) with \Psi = \left( \varphi , \psi_{0} , \psi_{1}, \psi_{2} , \psi_{3} \right)^{T} , and the \Gamma’s are a set of four 5 \times 5 matrices satisfying the Duffin-Kemmer algebra \Gamma^{\mu}\Gamma^{\rho}\Gamma^{\nu} + \Gamma^{\nu}\Gamma^{\rho}\Gamma^{\mu} = \eta^{\mu \rho}\Gamma^{\nu} + \eta^{\nu\rho}\Gamma^{\mu} . \  \  \  \  \  \  (2) Notice that (the Duffin-Kemmer equation) Eq(1) has no m^{2} term in it. Within the representation theory, the Duffin-Kemmer equation has beautiful interpretation. However, since your knowledge in group theory is (unfortunately) poor, bellow I will only show you how to obtain the DK equation (1) from the following Klein-Gordon equation of real (1-component) scalar field \partial^{\mu}\partial_{\mu} \phi (x) + m^{2} \phi (x) = 0 . \  \  \  \  \  \  \  \  \  \  \  \  \  (3) So we want to reduce (3) into a system of five (coupled) first-order equations. To do this we define a scalar field by \varphi (x) = \sqrt{m} \phi (x) , and a 4-vector field by \psi_{\mu}(x) = \frac{1}{\sqrt{m}} \partial_{\mu}\phi (x) . Thus, the KG equation (3) can be replaced by the following equivalent system of first-order equations \partial^{\mu}\psi_{\mu} (x) + m \varphi (x) = 0 ,\partial_{\mu} \varphi (x) - m \psi_{\mu}(x) = 0 . These can easily be rewritten as matrix equation
\begin{pmatrix} - m & - \partial_{0} & \partial_{1} & \partial_{2} & \partial_{3} \\ \partial_{0} & - m & 0 & 0 & 0 \\ \partial_{1} & 0 & - m & 0 & 0 \\ \partial_{2} & 0 & 0 & - m & 0 \\ \partial_{3} & 0 & 0 & 0 & - m \end{pmatrix} \begin{pmatrix} \varphi (x) \\ \psi_{0}(x) \\ \psi_{1}(x) \\ \psi_{2}(x) \\ \psi_{3}(x) \end{pmatrix} = 0 . This is exactly the Duffin-Kemmer equation (1) with the \Gamma’s given by \Gamma^{0} = \begin{pmatrix} 0 & i & 0 & 0 & 0 \\ - i & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\  0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} , \  \  \  \Gamma^{1} = \begin{pmatrix} 0 & 0 & - i & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ - i & 0 & 0 & 0 & 0 \\  0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} , \Gamma^{2} = \begin{pmatrix} 0 & 0 & 0 & - i & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ - i & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} , \  \  \  \Gamma^{3} = \begin{pmatrix} 0 & 0 & 0 & 0 & - i \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\  0 & 0 & 0 & 0 & 0 \\ - i & 0 & 0 & 0 & 0 \end{pmatrix} , and the Duffin-Kemmer field \Psi (x) is given by \Psi (x) = \begin{pmatrix} \varphi (x) \\ \psi_{0}(x) \\ \psi_{1}(x) \\ \psi_{2}(x) \\ \psi_{3}(x) \end{pmatrix} .
Similarly, you can equivalently write the Proca equation \partial_{\mu}F^{\mu\nu} + m^{2}A^{\nu} = 0 , which describes a free massive spin-1 field, as a Duffin-Kemmer equation, with four 10 \times 10 \Gamma's. I leave this as exercise for you.