# Observables and Commutation (newbie questions?)

Gold Member
Some questions. Am I getting this basically right?

What does a "state vector" look like?

It looks like |α> or |β>. But more than that... it is a complex vector in Hilbert space?

Now, you get "observables" from state-vectors by performing operators on them. So the state-vector contains all the information, but if you want to know the momentum, you do the momentum operator, if you want to know the spin, you do the spin operator, if you want to know the (mass?) you do the mass operator?

And then there are three spin operators, right? Sz, Sx, Sy, and these are called "incompatible observables" because they do not commute. That is, SzSy ≠ SySz.
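For what it's worth, the non-commutation is easy to verify numerically. Here is a quick NumPy sketch (my own, not from any textbook), using the standard spin-1/2 matrices in units where ħ = 1:

```python
import numpy as np

hbar = 1.0  # work in units where hbar = 1

# Standard spin-1/2 operators, S_i = (hbar/2) * (Pauli matrix i)
Sx = hbar / 2 * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = hbar / 2 * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = hbar / 2 * np.array([[1, 0], [0, -1]], dtype=complex)

def commutator(A, B):
    """[A, B] = AB - BA; nonzero means the observables are incompatible."""
    return A @ B - B @ A

# Sz Sy != Sy Sz: the commutator is nonzero
print(commutator(Sz, Sy))
# In fact [Sz, Sy] = -i hbar Sx
print(np.allclose(commutator(Sz, Sy), -1j * hbar * Sx))  # True
```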

I'll stop there for now, in case I'm completely on the wrong track...

But what does this symbol, an equals-sign with a dot over it, mean?

$$\doteq$$

Gold Member
Another detail

$$[x_i , x_j] = 0, \; \; \;\; [p_i,p_j]=0, \; \; \; \; [x_i,p_j]=i \hbar \delta _{ij}$$

So I gather that two components of position are "compatible"
two components of momentum are "compatible"
and a component of position is "incompatible" with the component of momentum in the same direction.

The momentum and position here are the observables.

But is there any way to make the commutation operation conceptually significant? I cannot come up with any intuitive way to physically conceptualize the commutator.

Matterwave
Gold Member
The state-vector can be visualized as an arrow in Hilbert space. There's some subtleties with this picture because the normalization and phase, for example, will not affect the physical observables, so people sometimes say that the states are the equivalence class of rays (set of vectors which differ from each other only by normalization or phase) in the Hilbert space, but this is a little bit technical.

Observables are represented in the quantum theory by Hermitian operators (again, there are some subtleties here involving Hermitian versus Self-Adjoint). The result of making 1 particular measurement of an observable on a particular system will return one of the eigenvalues of the observable. The average (or expectation value) of many measurements taken of an observable made on many identically prepared states is obtained by "sandwiching" the operator inside the bra and ket state, e.g. $\langle H\rangle=\langle\psi|H|\psi\rangle$.
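A minimal numerical sketch of the "sandwich" (my own example; the choice of state and of ħ = 1 units is arbitrary):

```python
import numpy as np

# Spin-1/2 sketch in units where hbar = 1
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)

# State |psi> = (|up> + |down>)/sqrt(2), i.e. spin pointing along +x
psi = np.array([1, 1], dtype=complex) / np.sqrt(2)

# Single measurements return eigenvalues of the observable
print(np.linalg.eigvalsh(Sz))  # [-0.5  0.5]

# The expectation value is the "sandwich" <psi|Sz|psi>
expectation = np.vdot(psi, Sz @ psi).real  # vdot conjugates the bra
print(expectation)  # 0.0 -- +1/2 and -1/2 are equally likely
```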

In normal QM, the masses of the particles are usually given.

Incompatible observables means that the operators that represent them do not commute. This also implies that one cannot simultaneously diagonalize both observables with the same similarity transform. This leads to the generalized Heisenberg uncertainty principle.

An equal sign with a dot on top is defined in different ways in different contexts. I'm not sure what its "standard" definition is. The one I am familiar with is that an equal sign with a dot on it means "this is true in some particular coordinate system, but not necessarily true in all coordinate systems". But I'm taking this from GR, so I'm guessing the one you see means something else.

Gold Member
The state-vector can be visualized as an arrow in Hilbert space. There's some subtleties with this picture because the normalization and phase, for example, will not affect the physical observables, so people sometimes say that the states are the equivalence class of rays (set of vectors which differ from each other only by normalization or phase) in the Hilbert space, but this is a little bit technical.

Observables are represented in the quantum theory by Hermitian operators (again, there are some subtleties here involving Hermitian versus Self-Adjoint). The result of making 1 particular measurement of an observable on a particular system will return one of the eigenvalues of the observable. The average (or expectation value) of many measurements taken of an observable made on many identically prepared states is obtained by "sandwiching" the operator inside the bra and ket state, e.g. $\langle H\rangle=\langle\psi|H|\psi\rangle$.

In normal QM, the masses of the particles are usually given.

Incompatible observables means that the operators that represent them do not commute. This also implies that one cannot simultaneously diagonalize both observables with the same similarity transform. This leads to the generalized Heisenberg uncertainty principle.

An equal sign with a dot on top is defined in different ways in different contexts. I'm not sure what its "standard" definition is. The one I am familiar with is that an equal sign with a dot on it means "this is true in some particular coordinate system, but not necessarily true in all coordinate systems". But I'm taking this from GR, so I'm guessing the one you see means something else.
It's a quantum mechanics book, but I suspect it means the same thing here. Here is an equation I see the \doteq used in:
$$S_z \doteq \frac{\hbar}{2}\begin{pmatrix} 1 &0 \\ 0 & -1 \end{pmatrix}$$

I guess this would be true assuming you are using some particular coordinate system, but which coordinate system? Space-and-time in the z direction? Space-and-space in the x-and-y direction?

I'm also a bit confused about how it is that a 2x2 matrix can operate on an infinite dimensional vector space?

I've bold-faced the words in your post which are in-part unfamiliar to me. I spent some time this morning trying to review some of the terms that you used.

• Hilbert Space
• Infinite Dimensional Complete Inner-Product Space
• A vector space that "has" an inner-product is an inner-product space.
• What does it mean that a vector space "has" an inner-product space? Does it mean that $\int_{-\infty}^{\infty}f^*(x)g(x)\,dx$ converges for all $f, g$ in the space?
• Normalization
• It seems in the past, I remember Normalization being a process where you figured out the constant in front of a function so that the integral, $\int_{-\infty}^{\infty}f(x)f^*(x)dx =1$
In general, the 1 represented a total of, for instance, 1 particle; i.e. the total probability of finding a particle in a space where you know the particle is, is 1.
Wikipedia Article
• The wikipedia article says "because of the normalization condition, wave functions form a projective space rather than an ordinary vector space."
• And you said, "the normalization and phase, for example, will not affect the physical observables"
• I think I need some kind of link to put together these seemingly disparate ideas.
• phase
• If I am not mistaken, phase represents the angle associated with a complex number in a phase diagram.
• I have a habit of thinking of a wave function such as $e^{i \omega t}$ as having a real part and an imaginary part, and having more likelihood of being observed when the real part is at a maximum. But if phase has no effect on the physical observable, perhaps this is a wrong idea?
• Hermitian
• "In quantum mechanics their importance lies in the fact that in the Dirac–von Neumann formulation of quantum mechanics, physical observables such as position, momentum, angular momentum and spin are represented by self-adjoint operators on a Hilbert space"
Wikipedia Article
• eigenvalues
• The eigenvectors of a square matrix are the non-zero vectors that, after being multiplied by the matrix, remain parallel to the original vector. For each eigenvector, the corresponding eigenvalue is the factor by which the eigenvector is scaled when multiplied by the matrix.
Wikipedia Article
• bra-ket notation
• Wikipedia Article
• It seems to have little to do with under-armor.
• diagonalize
• In linear algebra, a square matrix A is called diagonalizable if it is similar to a diagonal matrix, i.e., if there exists an invertible matrix P such that P⁻¹AP is a diagonal matrix.
Wikipedia Article
• similarity transform

For now, I'm just familiarizing myself with the terms used; everything is a bit of a jumble in my head. So I will come back to this soon. Thank you.

Last edited by a moderator:
Ben Niehoff
Gold Member
It's a quantum mechanics book, but I suspect it means the same thing here. Here is an equation I see the \doteq used in:
$$S_z \doteq \frac{\hbar}{2}\begin{pmatrix} 1 &0 \\ 0 & -1 \end{pmatrix}$$
In this context, I think they mean "defined as". However, the expression written is only true in the "z basis", where we say a spin aligned along +z is "spin up", and a spin aligned along -z is "spin down".

In principle, one could work in some other basis, choosing, for example, +x to mean "spin up" and -x to mean "spin down". Then the Sx operator would have the form written above, and the Sz operator would look different.
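A small NumPy sketch of this basis freedom (my own illustration; note `eigh` orders eigenvalues ascending, so the diagonal comes out as diag(-1/2, +1/2)):

```python
import numpy as np

Sx = 0.5 * np.array([[0, 1], [1, 0]], dtype=complex)  # hbar = 1
Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)

# Columns of U are the normalized eigenvectors of Sx: the "x basis"
_, U = np.linalg.eigh(Sx)

# Transforming to the x basis makes Sx diagonal...
Sx_in_x_basis = U.conj().T @ Sx @ U
print(np.round(Sx_in_x_basis, 12))  # diag(-1/2, +1/2)

# ...while Sz, which was diagonal in the z basis, no longer is
Sz_in_x_basis = U.conj().T @ Sz @ U
print(np.round(Sz_in_x_basis, 12))  # purely off-diagonal
```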

I'm also a bit confused about how it is that a 2x2 matrix can operate on an infinite dimensional vector space?
The Hilbert space of a spin-1/2 wavefunction is

$$\mathcal{H} = \mathbb{C}^2 \otimes L^2(\mathbb{R}^3),$$
where the first factor is finite-dimensional and refers to the two spinor components. The Sz operator acts on the first factor via the matrix above, and acts as the identity on the second factor.
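One way to see this concretely is with a Kronecker product; in this toy sketch (my own, for illustration only) the infinite-dimensional spatial factor is truncated to n = 4 basis functions:

```python
import numpy as np

Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)  # acts on C^2 only

# Toy sketch: truncate the (really infinite-dimensional) L^2 factor
# to n basis functions, so "Sz tensor (identity on space)" becomes a
# Kronecker product of finite matrices
n = 4
Sz_full = np.kron(Sz, np.eye(n))

print(Sz_full.shape)  # (8, 8): 2 spinor components times n spatial modes
# Acting with Sz_full only touches the spin factor; the spatial part is untouched
```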

Hilbert Space
• Infinite Dimensional Complete Inner-Product Space
• A vector space that "has" an inner-product is an inner-product space.
• What does it mean that a vector space "has" an inner-product space? Does it mean that $\int_{-\infty}^{\infty}f(x)g(x)dx$
and $\int_{-\infty}^{\infty}f(x)g(x)dx$ converge for all values of $f, g, \vec \phi, \vec \varphi$?
A Hilbert space doesn't have to be infinite-dimensional. There are plenty of finite-dimensional Hilbert spaces used in quantum mechanics. All we mean is that there is an inner product (NOTE: If the Hilbert space is complex, the mathematically correct term is "Hermitian product", not "inner product", but physicists usually do not make the distinction).

When a vector space "has" an inner product, all we mean is that we have defined some inner product for it. The axioms of a vector space do not require it to have an inner product; it is an extra piece of structure we put on top of things.

There are actually three layers of structure:

1. A plain vector space merely obeys the laws of vector algebra (addition, and multiplication by scalars).

2. A Banach space is a vector space where we define a norm. In a Banach space, there is the notion of the "length" of a vector. The norm does not have to come from an inner product.

3. A Hilbert space is a vector space where we define an inner product. Every Hilbert space is a Banach space, because an inner product induces a norm by plugging the same vector into both entries. In a Hilbert space, there is the notion of the "angle" between two vectors; that is essentially what an inner product tells us.

In quantum mechanics, we need the Hilbert space structure because the "angle" between two vectors has a physical meaning: it relates to the probability that a system in state A will be observed in state B.
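A tiny numerical illustration of the layers (my own sketch; the two states are spin-up along z and spin-up along x):

```python
import numpy as np

a = np.array([1, 0], dtype=complex)               # spin up along z
b = np.array([1, 1], dtype=complex) / np.sqrt(2)  # spin up along x

# Hilbert-space layer: an inner product (vdot conjugates its first argument)
inner = np.vdot(a, b)

# Banach-space layer: the norm induced by that inner product
norm_a = np.sqrt(np.vdot(a, a).real)
print(norm_a)  # 1.0

# Physical meaning of the "angle": |<a|b>|^2 is the probability that
# a system prepared in state a is observed in state b
print(abs(inner) ** 2)  # 0.5
```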

For now, I'm just familiarizing myself with the terms used; Everything is a bit of a jumble in my head. So I will come back to this soon. Thank you.
How strong is your linear algebra? Quantum mechanics relies heavily on it, and a good sense of linear algebra will make the bra-ket notation much more transparent.

Gold Member
How strong is your linear algebra? Quantum mechanics relies heavily on it, and a good sense of linear algebra will make the bra-ket notation much more transparent.
I know several concepts from linear algebra, but I don't know much in the way of helpful standard conventions. Here, I'll give an example problem from J.J. Sakurai, Quantum Mechanics that gives me trouble:

Suppose a 2x2 matrix X (not necessarily Hermitian, nor unitary) is written as

$$X = a_0 + \sigma \cdot a$$

where a0 and a1,2,3 are numbers.

a. How are a0 and ak (k=1,2,3) related to tr(X) and tr(σk X)?
b. Obtain a0 and ak in terms of the matrix elements Xij

Gold Member
Suppose a 2x2 matrix X (not necessarily Hermitian, nor unitary) is written as

$$X = a_0 + \sigma \cdot a$$

where a0 and a1,2,3 are numbers.

a. How are a0 and ak (k=1,2,3) related to tr(X) and tr(σk X)?
b. Obtain a0 and ak in terms of the matrix elements Xij
I can find no reference to any σ in the chapter, so I assume it must be an arbitrary 2 dimensional vector
Other assumptions
(2) Assume that k={1,2,3} is a misprint, and it should be k={1,2}
(3) Assume σ.a represents an outer-product
(4) Assume by "adding a constant to a matrix" it means "add the same constant to every element in the matrix."
(5) Assume σk X represents two different matrices. One for each k={1,2}.

With these assumptions, I get

$$X=\begin{pmatrix} \sigma_1 a_1+a_0 &\sigma_1 a_2+a_0 \\ \sigma_2 a_1+a_0 & \sigma_2 a_2+a_0 \end{pmatrix}$$

$$tr[X]=\sigma_1 a_1+\sigma_2 a_2+2 a_0$$

(So my answer to part a would be:
$$tr[\sigma_1 X]=\sigma_1(\sigma_1 a_1+\sigma_2 a_2+2 a_0)$$

$$tr[\sigma_2 X]=\sigma_2(\sigma_1 a_1+\sigma_2 a_2+2 a_0)$$

but I can't see anything relevant or interesting about the answer. Essentially, I finished the problem but didn't learn anything.)

As for part (b) I can get, for instance, $$a_2=\frac{X_{21}-X_{22}}{\sigma_1 - \sigma_2}$$

but I can't get any value just in terms of the matrix elements of X. I have to have the terms of σ, as well.

I also cannot find any immediately obvious method for finding a0. Perhaps this has something to do with whatever I failed to learn from part (a)?
Gold Member
1. A plain vector space merely obeys the laws of vector algebra (addition, and multiplication by scalars).

2. A Banach space is a vector space where we define a norm. In a Banach space, there is the notion of the "length" of a vector. The norm does not have to come from an inner product.

3. A Hilbert space is a vector space where we define an inner product. Every Hilbert space is a Banach space, because an inner product induces a norm by plugging the same vector into both entries. In a Hilbert space, there is the notion of the "angle" between two vectors; that is essentially what an inner product tells us.

In quantum mechanics, we need the Hilbert space structure because the "angle" between two vectors has a physical meaning: it relates to the probability that a system in state A will be observed in state B.
This may be worth opening up another thread, but is space-time generally considered to be a Banach (norm) space, or a Hilbert (inner-product) Space?

Fredrik
Staff Emeritus
Gold Member
This may be worth opening up another thread, but is space-time generally considered to be a Banach (norm) space, or a Hilbert (inner-product) Space?
It's neither. Since it doesn't have a norm, it's not a normed space, and therefore not a Banach space (=complete normed space). Since it doesn't have an inner product, it's not an inner product space, and therefore not a Hilbert space (complete inner product space).

If you disagree, look at the definition of "inner product" again.

I can find no reference to any σ in the chapter, so I assume it must be an arbitrary 2 dimensional vector
They are almost certainly the Pauli spin matrices.

(4) Assume by "adding a constant to a matrix" it means "add the same constant to every element in the matrix."
No, if x is a number and A is a square matrix, x+A is defined as xI+A, where I is the identity matrix.
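A short sketch of the convention, which also shows why the elementwise guess is an easy trap: NumPy's broadcasting does exactly that (this is my own example, not from the thread):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
x = 10

# QM convention: x + A is shorthand for x*I + A
convention = x * np.eye(2) + A
print(convention)  # [[11.  2.] [ 3. 14.]]

# Beware: NumPy's bare `x + A` broadcasts x into EVERY entry,
# which matches the "add to each element" guess, not the QM convention
print(x + A)       # [[11 12] [13 14]]
```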

Fredrik
Staff Emeritus
Gold Member
It's a quantum mechanics book, but I suspect it means the same thing here. Here is an equation I see the \doteq used in:
$$S_z \doteq \frac{\hbar}{2}\begin{pmatrix} 1 &0 \\ 0 & -1 \end{pmatrix}$$
I would guess that it means that what you have on the left is the operator, and what you have on the right is a matrix that corresponds to the operator. They aren't actually equal, but they represent the same thing.

I find myself saying this to a lot of people:
You need to study the relationship between linear operators and matrices. It's explained e.g. in post #3 in this thread. (Ignore the quote and the stuff below it).
They all seem to ignore it, even though it's really easy and absolutely essential to understanding those matrices you have to deal with when you study spin-1/2 particles. Probably the single most important detail in all of linear algebra.

Hilbert Space
Infinite Dimensional Complete Inner-Product Space
Hilbert spaces don't have to be infinite-dimensional.

A vector space that "has" an inner-product is an inner-product space.
What does it mean that a vector space "has" an inner-product space?
It can't have an inner product space. It can have an inner product. To say that a vector space V has an inner product is to say that there's an inner product defined on V. The inner product space is actually the pair (V,inner product), just like the vector space isn't just a set, it's a triple (set,addition operation,scalar multiplication operation).

Normalization
It seems in the past, I remember Normalization being a process where you figured out the constant in front of a function so that the integral,
If x is a vector with norm ##\|x\|##, then ##\frac{x}{\|x\|}## is a vector with norm 1. ##1/\|x\|## is called a normalization constant.

The wikipedia article says "because of the normalization condition, wave functions form a projective space rather than an ordinary vector space."
The idea is that if two vectors x and y are considered equivalent if there's a complex c such that x=cy, then the set of equivalence classes can be mapped bijectively onto the set of straight lines through the origin. No need to worry about this.

...having more likelihood of being observed when the real part is at a maximum. But if phase has no effect on the physical observable, perhaps this is a wrong idea?
Yes, this is wrong.

bra-ket notation

For now, I'm just familiarizing myself with the terms used; Everything is a bit of a jumble in my head. So I will come back to this soon. Thank you.
There is obviously a huge gap in your knowledge that can only be filled by studying a book on linear algebra. I like Axler.

Gold Member
It's neither. Since it doesn't have a norm, it's not a normed space, and therefore not a Banach space (=complete normed space). Since it doesn't have an inner product, it's not an inner product space, and therefore not a Hilbert space (complete inner product space).

If you disagree, look at the definition of "inner product" again.

As luck would have it, I was able to download chapter 6 of the book you referenced, and it defines an "inner product" on V as a function that takes each ordered pair (u,v) of elements of V to a number <u,v> in F and has the following properties:

positivity, definiteness, additivity in first slot, homogeneity in first slot, and conjugate symmetry.

The operator I was thinking of for the space-time norm is
√(Δt^2 - Δx^2 - Δy^2 -Δz^2)

I suppose it does not meet the positivity requirement because it returns complex values.

It does not meet definiteness, because a speed-of-light interval will produce a zero norm.

So this invariant
$$\sqrt{\Delta t^2 - \Delta x^2 -\Delta y^2 -\Delta z^2}$$

is not a norm, but a ____________________________.

They are almost certainly the Pauli spin matrices.
I wonder why they would use the Pauli spin matrices in the problem set in chapter 1, when they aren't officially introduced until chapter 3. However, that would definitely make the problem more tractable.

No, if x is a number and A is a square matrix, x+A is defined as xI+A, where I is the identity matrix.
Is that standard convention in Quantum Mechanics? I was guessing based on what Mathematica did (but admittedly, I was surprised that adding a constant to a matrix gave any value at all! I fully expected a domain error.)

Ben Niehoff
Gold Member
I know several concepts from linear algebra, but I don't know much in the way of helpful standard conventions. Here, I'll give an example problem from J.J. Sakurai, Quantum Mechanics that gives me trouble:

Suppose a 2x2 matrix X (not necessarily Hermitian, nor unitary) is written as

$$X = a_0 + \sigma \cdot a$$

where a0 and a1,2,3 are numbers.

a. How are a0 and ak (k=1,2,3) related to tr(X) and tr(σk X)?
b. Obtain a0 and ak in terms of the matrix elements Xij
The $\sigma^i$ are the Pauli spin matrices. And whenever you see a number being added to a matrix, you should think of an implicit identity matrix next to the number. By the way, in addition to tr(X) and tr(σk X), you should also calculate det(X). The result is something neat.
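Here is a numerical check of those trace and determinant identities (a sketch of my own; the coefficient values a0 and a are arbitrary, picked only for testing):

```python
import numpy as np

sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Arbitrary coefficients, chosen only to exercise the identities
a0, a = 2.0, np.array([0.5, -1.0, 3.0])
X = a0 * np.eye(2) + sum(ak * sk for ak, sk in zip(a, sigma))

# Pauli matrices are traceless, so tr(X) = 2*a0
print(np.trace(X).real / 2)  # 2.0
# tr(sigma_j sigma_k) = 2*delta_jk, so tr(sigma_k X) = 2*a_k
print([float(np.trace(s @ X).real / 2) for s in sigma])  # [0.5, -1.0, 3.0]
# The "neat" determinant: det(X) = a0^2 - a.a
print(np.isclose(np.linalg.det(X).real, a0**2 - a @ a))  # True
```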

So this invariant
$$\sqrt{\Delta t^2 - \Delta x^2 -\Delta y^2 -\Delta z^2}$$

is not a norm, but a ____________________________.
Remove the square root sign, and what you have is called a "quadratic form".

Is that standard convention in Quantum Mechanics? I was guessing based on what Mathematica did (but admittedly, I was surprised that adding a constant to a matrix gave any value at all! I fully expected a domain error.)
Yes, it would get very annoying having to write in all the identity operators. Any time a number is added to an operator, we assume the number is actually multiplying the identity operator.

Fredrik
Staff Emeritus
Gold Member
The thing that best corresponds to an inner product on Minkowski spacetime is the map ##g:\mathbb R^4\times\mathbb R^4\to\mathbb R## defined by
$$g(x,y)=x^T\eta y$$ for all ##x,y\in\mathbb R^4##. The y in that formula should be written as a 4×1 matrix. xT is the transpose of x, so it's 1×4. ##\eta## is defined by
$$\eta=\begin{pmatrix}-1 & 0 & 0 & 0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\end{pmatrix}.$$ Such a map is called a bilinear form. "Form" because it takes several vectors to a scalar. "bilinear" because it's linear in both variables. An inner product on a real vector space is a bilinear form that satisfies a few additional requirements. This g satisfies all of those except ##g(x,x)\geq 0## for all x.
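A quick sketch of this g (my own example vectors), showing exactly which inner-product axioms fail:

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def g(x, y):
    """The bilinear form g(x, y) = x^T eta y on R^4, components (t, x, y, z)."""
    return x @ eta @ y

timelike = np.array([1.0, 0.0, 0.0, 0.0])
lightlike = np.array([1.0, 1.0, 0.0, 0.0])  # moves at c: t = x

print(g(timelike, timelike))    # -1.0: positivity fails
print(g(lightlike, lightlike))  #  0.0 for a nonzero vector: definiteness fails
```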

An inner product on a complex vector space is not a bilinear form, because it's linear in one of the variables and conjugate linear in the other. So it's called a sesquilinear form instead. Math books always take it to be linear in the first variable. Physics books always take it to be linear in the second variable. I like the physicists' convention much better. It's much more compatible with bra-ket notation, for starters.

g is often called the metric tensor (even by me), but I'm not so sure I like that term when it's defined on spacetime itself, instead of its tangent spaces.

Gold Member
I know several concepts from linear algebra, but I don't know much in the way of helpful standard conventions. Here, I'll give an example problem from J.J. Sakurai, Quantum Mechanics that gives me trouble:

Suppose a 2x2 matrix X (not necessarily Hermitian, nor unitary) is written as

$$X = a_0 + \sigma \cdot a$$

where a0 and a1,2,3 are numbers.

a. How are a0 and ak (k=1,2,3) related to tr(X) and tr(σk X)?
b. Obtain a0 and ak in terms of the matrix elements Xij
I can find no reference to any σ in the chapter, so I assume it must be an arbitrary 2 dimensional vector
Other assumptions
(2) Assume that k={1,2,3} is a misprint, and it should be k={1,2}
(3) Assume σ.a represents an outer-product
(4) Assume by "adding a constant to a matrix" it means "add the same constant to every element in the matrix."
(5) Assume σk X represents two different matrices. One for each k={1,2}.
Assumption FAIL! New assumptions:
(1) σ1, σ2, σ3 are all the Pauli Spin Matrices:
$$\left \{ \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} , \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix} , \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \right \}$$
(2) The single equation $X = a_0 + \sigma \cdot a$ represents not one, but three equations.

$$X = (a_0 \times I) + \sigma_k \cdot (a_k \times I)$$
where $k \in \{1,2,3\}$ and I represents the 2x2 identity matrix.

\begin{align*} tr(X_k)&=2a_0 \\ tr(\sigma_k \cdot X_k) &= 2 a_k \\ det(X_1) &= a_0^2-a_1^2\\ det(X_2) &=a_0^2+a_2^2 \\ det(X_3) &=a_3^2-a_0^2 \end{align*}

Ben Niehoff
Gold Member
(2) The single equation $X = a_0 + \sigma \cdot a$ represents not one, but three equations.
Not quite. X is a 2x2 matrix. The second term is shorthand for $\sigma^k a_k$. We often think of $\vec \sigma$ as a matrix-valued vector.

Also, I'm beginning to think you haven't tried looking these things up before coming here to ask about it.

Gold Member
The $\sigma^i$ are the Pauli spin matrices. And whenever you see a number being added to a matrix, you should think of an implicit identity matrix next to the number. By the way, in addition to tr(X) and tr(σk X), you should also calculate det(X). The result is something neat.

Remove the square root sign, and what you have is called a "quadratic form".
Yes, it would get very annoying having to write in all the identity operators. Any time a number is added to an operator, we assume the number is actually multiplying the identity operator.
I think the goal of a textbook writer should be to get the knowledge into the heads of the readers as effectively as possible. In that case, it's their responsibility to explain notational conventions. (Point to be made, and I had not really noticed this before: Sakurai died in 1982; the book I'm using was first printed in 1994. All due respect to those who put together this work. "The editor apologizes in advance if, as is all too probable, there are glaring omissions.")

The thing that best corresponds to an inner product on Minkowski spacetime is the map ##g:\mathbb R^4\times\mathbb R^4\to\mathbb R## defined by
$$g(x,y)=x^T\eta y$$ for all ##x,y\in\mathbb R^4##. The y in that formula should be written as a 4×1 matrix. xT is the transpose of x, so it's 1×4. ##\eta## is defined by
$$\eta=\begin{pmatrix}-1 & 0 & 0 & 0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\end{pmatrix}.$$ Such a map is called a bilinear form. "Form" because it takes several vectors to a scalar. "bilinear" because it's linear in both variables. An inner product on a real vector space is a bilinear form that satisfies a few additional requirements. This g satisfies all of those except ##g(x,x)\geq 0## for all x.

An inner product on a complex vector space is not a bilinear form, because it's linear in one of the variables and conjugate linear in the other. So it's called a sesquilinear form instead. Math books always take it to be linear in the first variable. Physics books always take it to be linear in the second variable. I like the physicists' convention much better. It's much more compatible with bra-ket notation, for starters.

g is often called the metric tensor (even by me), but I'm not so sure I like that term when it's defined on spacetime itself, instead of its tangent spaces.
Are you essentially in agreement about these ideas, or arguing?

$$g(\vec v_1,\vec v_2)=\begin{pmatrix} \Delta t_1 &\Delta x_1 &\Delta y_1 &\Delta z_1 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 &0 &1 &0 \\ 0 &0 &0 &1 \end{pmatrix} \begin{pmatrix} \Delta t_2 \\\Delta x_2 \\\Delta y_2 \\\Delta z_2 \end{pmatrix}$$

$$\left \| \vec v \right \|=\begin{pmatrix} \Delta t &\Delta x &\Delta y &\Delta z \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 &0 &1 &0 \\ 0 &0 &0 &1 \end{pmatrix} \begin{pmatrix} \Delta t \\\Delta x \\\Delta y \\\Delta z \end{pmatrix}$$

Gold Member
Not quite. X is a 2x2 matrix. The second term is shorthand for $\sigma^k a_k$. We often think of $\vec \sigma$ as a matrix-valued vector.
So you disagree with the answers in the "Spoiler" section of my post #14?

Gold Member
Also, I'm beginning to think you haven't tried looking these things up before coming here to ask about it.
You cannot "look up" notational conventions. There's nothing to look up. Either the author explains their conventions, or they don't.

Also why do you have this attitude toward me, that if I should ask questions about quantum mechanics, I must not have tried to figure it out on my own? Isn't quantum mechanics "hard?"

Gold Member
Not quite. X is a 2x2 matrix. The second term is shorthand for $\sigma^k a_k$. We often think of $\vec \sigma$ as a matrix-valued vector.
I just want to clarify what you are saying here. If σ is a matrix-valued vector, it would look something like this?

$$\vec \sigma \equiv \begin{pmatrix} \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} \\ \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix} \\ \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \end{pmatrix}$$

Ben Niehoff
Gold Member
Yes, although in practice one doesn't usually write it out like that.

Gold Member
Yes, although in practice one doesn't usually write it out like that.
Ah, but how much time and frustration could be saved by would-be students of quantum mechanics if the practice were changed?

\begin{align*} X &=a_0 + \sigma \cdot a \\ &= a_0 + \sigma^k a_k\\ &= \begin{pmatrix} a_0 & 0\\ 0 & a_0 \end{pmatrix} + \begin{pmatrix} \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} & \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix} & \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \end{pmatrix} \begin{pmatrix} a_1\\ a_2\\ a_3 \end{pmatrix} \\ &= \begin{pmatrix} a_0+a_3 & a_1-i a_2 \\ a_1 + i a_2 & a_0-a_3 \end{pmatrix} \end{align*}

Do I have it right, now?

Ben Niehoff
Gold Member
\begin{align*} X &=a_0 + \sigma \cdot a \\ &= a_0 + \sigma^k a_k\\ &= \begin{pmatrix} a_0 & 0\\ 0 & a_0 \end{pmatrix} + \begin{pmatrix} \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} & \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix} & \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \end{pmatrix} \begin{pmatrix} a_1\\ a_2\\ a_3 \end{pmatrix} \\ &= \begin{pmatrix} a_0+a_3 & a_1-i a_2 \\ a_1 + i a_2 & a_0-a_3 \end{pmatrix} \end{align*}

Do I have it right, now?
Yes.

Gold Member
Yes.
Awesome! Thanks.

Fredrik
Staff Emeritus
Gold Member
Are you essentially in agreement about these ideas, or arguing? ...
I'm not familiar with the term "quadratic form", but I looked it up, and it seems to be correct to use it here. If my quick look at Wikipedia didn't give me the wrong idea, the function that takes the 8 components of u and v to ##u^T\eta v## is a quadratic form. I don't know if it would also be correct to use that term for the (bilinear) map that takes (u,v) to ##u^T\eta v##.

New assumptions:
....
(2) The single equation $X = a_0 + \sigma \cdot a$ represents not one, but three equations.
Ah, but how much time and frustration could be saved by would-be students of quantum-mechanics if the practice were changed?

In this case, I think it was reasonable to expect the students to see the dot, and recall the definition of the dot product. But if I had been in his shoes, I would have explained it, just in case.

By the way, there's an interesting way of looking at this problem. Every traceless complex self-adjoint 2×2 matrix can be written as
$$\begin{pmatrix}x_3 & x_1-ix_2\\ x_1+ix_2 & -x_3\end{pmatrix}=\sum_{k=1}^3 x_k\sigma_k.$$ Since the Pauli matrices are linearly independent, this means that they are a basis for the vector space of all such matrices. This is a 3-dimensional real vector space, not complex, because if x is a self-adjoint matrix, then ix is not. The vector space of all (not just the traceless) complex self-adjoint 2×2 matrices is 4-dimensional, so it needs one more basis vector, the identity matrix.
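A small numerical check of this basis claim (my own sketch; the self-adjoint matrix H is arbitrary). The real coefficients in the basis {I, σ1, σ2, σ3} come straight out of traces, and reassembling them reproduces H:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sigma = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]

# Any self-adjoint 2x2 matrix (these particular values are arbitrary)
H = np.array([[2.0, 1 - 2j], [1 + 2j, -4.0]])

# Real expansion coefficients in the basis {I, sigma_1, sigma_2, sigma_3}
x0 = np.trace(H).real / 2
xk = [np.trace(s @ H).real / 2 for s in sigma]
print(x0, xk)

# Reassembling reproduces H exactly
H_rebuilt = x0 * I2 + sum(c * s for c, s in zip(xk, sigma))
print(np.allclose(H, H_rebuilt))  # True
```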

Ben Niehoff
The quadratic form Q and the (symmetric) bilinear form B determine each other, via

$$B(u,v) \equiv \frac12 \Big( Q(u + v) - Q(u) - Q(v) \Big)$$
So it is ok to use the terms more or less interchangeably. I originally said "quadratic form", because JDoolin gave us the spacetime interval, which is a quadratic form acting on the vector $(\Delta t, \Delta x, \Delta y, \Delta z)$.
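The polarization identity above is easy to check numerically on the Minkowski form (my own sketch; the vectors are random):

```python
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def B(u, v):
    """Symmetric bilinear form (the spacetime 'inner product')."""
    return u @ eta @ v

def Q(u):
    """The associated quadratic form, Q(u) = B(u, u)."""
    return u @ eta @ u

rng = np.random.default_rng(0)
u, v = rng.normal(size=4), rng.normal(size=4)

# Polarization: the quadratic form alone determines the bilinear form
print(np.isclose(B(u, v), 0.5 * (Q(u + v) - Q(u) - Q(v))))  # True
```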