Those Physicists And Their "Physics Proofs"

There are plenty of cases where a proof written down by a physicist is worse than a proof written down by a mathematician, but this is a particularly bad one. In one of my courses, we got to derive the Dirac matrices, which are instrumental in describing spin 1/2 particles. These four matrices are written as $ \gamma $ with an index. One definition of them says that they should satisfy the anti-commutation relations of the Clifford algebra:

\[
\left \{ \gamma^{\mu}, \gamma^{\nu} \right \} \equiv \gamma^{\mu}\gamma^{\nu} + \gamma^{\nu}\gamma^{\mu} = 2 \eta^{\mu\nu} I
\]

where $ \eta $ is the Minkowski metric from special relativity.

\[
\eta = \left [
\begin{array}{cccc}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{array}
\right ]
\]

How big do our matrices have to be in order to satisfy this? They obviously cannot be 1x1 matrices because these are just numbers, and numbers commute. It turns out that they have to be at least 4x4 but all published sources I have seen fail to explain why. I will go through the physics proof that is often given and then set the record straight by writing a real proof. If it appears nowhere else, let it appear here!

Incomplete proof

I will depart from the convention of calling the matrices $ \gamma^0, \gamma^1, \gamma^2 $ and $ \gamma^3 $. For some reason I like $ \gamma^t, \gamma^x, \gamma^y $ and $ \gamma^z $ better. The relations above basically say that:

\[
\left ( \gamma^t \right )^2 = I, \;\;\; \left ( \gamma^i \right )^2 = -I
\]

and distinct Dirac matrices anti-commute. If we look at the equation $ \gamma^{\mu}\gamma^{\nu} = - \gamma^{\nu}\gamma^{\mu} $ and take the determinant of both sides, we get: $ \left ( \textup{Det}\gamma^{\mu} \right ) \left ( \textup{Det} \gamma^{\nu} \right ) = (-1)^n \left ( \textup{Det}\gamma^{\nu} \right ) \left ( \textup{Det} \gamma^{\mu} \right ) $, where $ n $ is the size of the matrices. The determinants are nonzero because each Dirac matrix squares to $ \pm I $, and if something nonzero is equal to $ (-1)^n $ times itself, $ n $ must be even. This rules out 3x3 Dirac matrices and the question becomes: why can't we represent the Clifford algebra with 2x2 matrices? Most physics textbooks seem to be okay with this part of the proof.

Some people say that the largest possible set of anti-commuting 2x2 matrices has only three elements. Is this supposed to be easy to show? Is the maximal anti-commuting set known for matrices of any size? There is a paper about that from 1932. It is 11 pages and only proves the 4x4 case so I highly doubt it. Anyway, here is how other sources proceed to "prove" this result:

We know that the three Pauli matrices anti-commute so let three of our Dirac matrices be Pauli matrices. Also, if we take the three Pauli matrices and adjoin the identity, we get a basis for the vector space of 2x2 matrices. Therefore our fourth Dirac matrix must be expressed as:

\[
M = a_t I + a_x \sigma_x + a_y \sigma_y + a_z \sigma_z
\]

Since the Pauli matrices anti-commute, the product of distinct Pauli matrices will be traceless (as is a single Pauli matrix). The trace of a squared Pauli matrix is 2. Therefore by linearity, $ \textup{Tr} (M \sigma_p) = 2a_p $ for $ p = x, y, z $. However, $ M $ also has to anti-commute with $ \sigma_p $, so $ \textup{Tr} (M \sigma_p) = -\textup{Tr} (\sigma_p M) = -\textup{Tr} (M \sigma_p) $, meaning that $ M \sigma_p $ should be traceless. This forces $ a_x, a_y $ and $ a_z $ to all be zero, meaning $ M $ is a multiple of the identity. A multiple of the identity commutes with every matrix, so it can only anti-commute with the Pauli matrices if it is zero, and the fourth matrix we set out to find doesn't exist.
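The trace identity used in this argument is easy to check numerically. The following is a minimal sketch in plain Python, not part of any textbook proof. The helper names (`matmul`, `add`, `scale`, `trace`, `anticommutator`, `pauli_combination`) and the sample coefficients are my own choices for illustration.

```python
# Pauli matrices as nested lists of Python numbers (no external libraries).
I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
PAULI = {"x": SX, "y": SY, "z": SZ}


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(c, a):
    """Multiply every entry of a matrix by the scalar c."""
    return [[c * x for x in row] for row in a]


def trace(a):
    """Sum of the diagonal entries."""
    return sum(a[i][i] for i in range(len(a)))


def anticommutator(a, b):
    """{a, b} = ab + ba."""
    return add(matmul(a, b), matmul(b, a))


def pauli_combination(a_t, a_x, a_y, a_z):
    """M = a_t I + a_x sigma_x + a_y sigma_y + a_z sigma_z."""
    m = scale(a_t, I2)
    for coeff, sigma in ((a_x, SX), (a_y, SY), (a_z, SZ)):
        m = add(m, scale(coeff, sigma))
    return m


# Arbitrary sample coefficients, chosen only for illustration.
coeffs = {"t": 0.5, "x": 2.0, "y": -1.0, "z": 3.0}
M = pauli_combination(coeffs["t"], coeffs["x"], coeffs["y"], coeffs["z"])

for p, sigma in PAULI.items():
    # By linearity, Tr(M sigma_p) should pick out 2 a_p.
    print(p, trace(matmul(M, sigma)), 2 * coeffs[p])
```

Each printed row should show the same number twice. Since anti-commuting with $ \sigma_p $ would force that number to be zero, the only candidate left is $ a_t I $.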

This works if you restrict yourself to a ridiculously special case but who ever said that three of the four anti-commuting matrices should be Pauli matrices? Maybe if you start off with a different set of three anti-commuting matrices there suddenly will be room for a fourth. The proof above would only be complete if it cited some theorem that this never happens. Since I am not aware of such a theorem, I will split our search into two cases and show that in each case we can only find three matrices with the desired properties, not four.

Complete proof

Notice that the equations defining our Dirac matrices are invariant under similarity transformations. If $ V $ is an invertible matrix,

\begin{align*}
\left \{ V \gamma^{\mu} V^{-1}, V \gamma^{\nu} V^{-1} \right \} &= V \gamma^{\mu}\gamma^{\nu} V^{-1} + V \gamma^{\nu}\gamma^{\mu} V^{-1} \\
&= V \left \{ \gamma^{\mu}, \gamma^{\nu} \right \} V^{-1} \\
&= 2 V \eta^{\mu\nu} I V^{-1} \\
&= 2 \eta^{\mu\nu} I
\end{align*}

so without loss of generality, we can assume that $ \gamma^t $ is in Jordan canonical form.

Case 1: assume that $ \gamma^t $ is diagonalizable. You get the identity by squaring $ \gamma^t $ so the diagonal entries in it can only be $ \pm 1 $. If both diagonal entries had the same sign, we would be left with $ \pm I $, a matrix that commutes with everything. Therefore in this case, after reordering the basis if necessary, $ \gamma^t = \sigma_z $. Denote the components of $ \gamma^x $ by $ a, b, c $ and $ d $. The fact that $ \gamma^t $ anti-commutes with $ \gamma^x $ says that:

\begin{align*}
\left [
\begin{array}{cc}
1 & 0 \\
0 & -1
\end{array}
\right ] \left [
\begin{array}{cc}
a & b \\
c & d
\end{array}
\right ] &= - \left [
\begin{array}{cc}
a & b \\
c & d
\end{array}
\right ] \left [
\begin{array}{cc}
1 & 0 \\
0 & -1
\end{array}
\right ] \\
\left [
\begin{array}{cc}
a & b \\
-c & -d
\end{array}
\right ] &= \left [
\begin{array}{cc}
-a & b \\
-c & d
\end{array}
\right ]
\end{align*}

In other words, $ \gamma^t $ being diagonal forces the spatial Dirac matrices to be anti-diagonal. Now we will let $ \gamma^i $ have the entry $ b_i $ in the upper right and $ c_i $ in the lower left. If these matrices all anti-commute, the equations that this gives us are:

\begin{align*}
b_x c_y = - c_x b_y \\
b_y c_z = - c_y b_z \\
b_x c_z = - c_x b_z \\
b_x c_x = b_y c_y = b_z c_z = -1
\end{align*}

where the last equation comes from squaring the spatial Dirac matrices. It also tells us that none of the $ b_i $ or $ c_i $ can be zero. Multiply the first equation by $ c_x $ and use $ b_x c_x = -1 $. This gives $ c_y = c_x^2 b_y $. If we substitute this into the second equation and cancel $ b_y $, we get $ c_z = -c_x^2 b_z $. Now substituting this into the third equation and cancelling $ c_x b_z $, $ b_x c_x $ comes out to $ 1 $, contradicting the last equation. Therefore starting with a diagonal $ \gamma^t $ leads to a contradiction.
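For anyone who wants to see the contradiction numerically, here is a small cross-check in plain Python. It takes a slightly different route from the substitution above, so treat it as a sketch. The squares force $ c_i = -1/b_i $, and then $ \gamma^i $ and $ \gamma^j $ anti-commute only when $ b_j = \pm i b_i $. The code tries all four sign choices for the x-y and y-z pairs and looks at what is left over for the x-z pair. The function name `residual` and the starting value `b_x = 1.0` are assumptions made for illustration.

```python
from itertools import product


def residual(b_i, b_j):
    """Diagonal entry of {gamma^i, gamma^j} for anti-diagonal 2x2 matrices.

    With gamma^i = [[0, b_i], [c_i, 0]] and c_i = -1 / b_i (forced by
    (gamma^i)^2 = -I), both diagonal entries equal b_i c_j + b_j c_i.
    """
    c_i, c_j = -1 / b_i, -1 / b_j
    return b_i * c_j + b_j * c_i


b_x = 1.0  # hypothetical starting value; any nonzero number would do

for s_xy, s_yz in product((1, -1), repeat=2):
    b_y = s_xy * 1j * b_x  # chosen so gamma^x and gamma^y anti-commute
    b_z = s_yz * 1j * b_y  # chosen so gamma^y and gamma^z anti-commute
    print(s_xy, s_yz,
          residual(b_x, b_y), residual(b_y, b_z), residual(b_x, b_z))
```

In every row the first two residuals vanish, but the last one has magnitude 2. Once two of the three pairs anti-commute, the third pair cannot.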

Case 2: if $ \gamma^t $ is not diagonalizable, its Jordan form is:

\[
\left [
\begin{array}{cc}
\lambda & 1 \\
0 & \lambda
\end{array}
\right ]
\]

However, the square of this matrix has $ 2 \lambda $ in an off-diagonal entry. We know $ \left ( \gamma^t \right )^2 $ is diagonal so it is necessary to have $ \lambda = 0 $ but this is not sufficient to give us $ \left ( \gamma^t \right )^2 = I $. The square of the above matrix with $ \lambda = 0 $ will be the zero matrix, not the identity. Therefore the assumption of 2x2 Dirac matrices leads to a contradiction in this case too. Since odd sizes were already ruled out, we must use matrices that are 4x4 or larger.

This completes the proof without automatically assuming that everything is a Pauli matrix. It is worth noting however that the Dirac matrices can be expressed quite nicely in terms of the Pauli matrices. It is easy to check that the following expressions satisfy the Clifford algebra relations:

\[
\gamma^t = \left [
\begin{array}{cc}
0 & I \\
I & 0
\end{array}
\right ], \;\;\; \gamma^i = \left [
\begin{array}{cc}
0 & \sigma_i \\
-\sigma_i & 0
\end{array}
\right ]
\]
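"Easy to check" can be taken literally. Below is a minimal sketch in plain Python that builds the 4x4 matrices from 2x2 blocks and tests every anti-commutator against $ 2 \eta^{\mu\nu} I $. It assumes the lower-left block of $ \gamma^i $ is $ -\sigma_i $, which is what makes $ \left ( \gamma^i \right )^2 = -I $. That sign is exposed as a parameter so you can see what happens without it. The helper names (`matmul`, `kron`, `build_gammas`, `satisfies_clifford`) are my own.

```python
I2 = [[1, 0], [0, 1]]
SIGMA = {
    "x": [[0, 1], [1, 0]],
    "y": [[0, -1j], [1j, 0]],
    "z": [[1, 0], [0, -1]],
}
ETA = {"t": 1, "x": -1, "y": -1, "z": -1}  # diagonal of the Minkowski metric


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def kron(a, b):
    """Kronecker product: replaces each entry a[i][j] by the block a[i][j] * b."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]


def build_gammas(lower_left_sign):
    """Assemble the 4x4 matrices from 2x2 blocks.

    gamma^t = [[0, I], [I, 0]] and
    gamma^i = [[0, sigma_i], [lower_left_sign * sigma_i, 0]].
    """
    gammas = {"t": kron([[0, 1], [1, 0]], I2)}
    for name, sigma in SIGMA.items():
        gammas[name] = kron([[0, 1], [lower_left_sign, 0]], sigma)
    return gammas


def satisfies_clifford(gammas, tol=1e-12):
    """True if {gamma^mu, gamma^nu} = 2 eta^{mu nu} I for every pair."""
    for mu, g_mu in gammas.items():
        for nu, g_nu in gammas.items():
            forward, backward = matmul(g_mu, g_nu), matmul(g_nu, g_mu)
            for i in range(4):
                for j in range(4):
                    expected = 2 * ETA[mu] if (mu == nu and i == j) else 0
                    if abs(forward[i][j] + backward[i][j] - expected) > tol:
                        return False
    return True


print(satisfies_clifford(build_gammas(-1)))  # lower-left block is -sigma_i
print(satisfies_clifford(build_gammas(+1)))  # lower-left block is +sigma_i
```

The first call should report that the relations hold and the second that they fail, because without the minus sign each $ \gamma^i $ squares to $ +I $ instead of $ -I $.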

This close resemblance explains why I want to treat the Dirac and Pauli matrices symmetrically. I learned about the Pauli matrices first, and they were subscripted using x, y, z instead of 1, 2, 3. This is why I reject the idea of using numbers instead of letters on the Dirac matrices. I also refuse to call them "gamma matrices" because no one ever used "sigma matrix" to refer to a Pauli matrix. No, I will not dare to compare myself to Dirac even though he used his own "symmetry conventions" to decide on terminology. Oh wait, I just did.