Observables and Commutation (newbie questions?)

In summary, the conversation discusses the concept of state vectors in quantum mechanics, which can be represented as complex vectors in Hilbert space. These state vectors contain all the information about a system, and observable quantities are obtained by applying operators to them. However, certain observables, such as two components of spin, or position and momentum along the same direction, are incompatible: their operators do not commute and cannot be simultaneously diagonalized with the same similarity transform. This leads to the generalized Heisenberg uncertainty principle. The symbol with an equals sign and a dot on top has different definitions, but in this context it indicates that the operator on the left is represented by the matrix on the right in a particular basis. The conversation also touches on the concept of normalization of wave functions, which ensures that the total probability of finding a particle somewhere is 1.
  • #1
JDoolin
Gold Member
Some questions. Am I getting this basically right?

What does a "state vector" look like?

It looks like |α> or |β>. But more than that... it is a complex vector in Hilbert space?

Now, you get "observables" from state-vectors by performing operators on them. So the state-vector contains all the information, but if you want to know the momentum, you do the momentum operator, if you want to know the spin, you do the spin operator, if you want to know the (mass?) you do the mass operator?

And then there are three spin operators, right? Sz, Sx, Sy, and these are called "incompatible observables" because they do not commute. That is SzSy≠SySz

I'll stop there for now, in case I'm completely on the wrong track...

But what does this symbol, an equals-sign with a dot over it, mean?


[tex]\doteq[/tex]
 
  • #2
Another detail:

[tex][x_i , x_j] = 0, \; \; \;\; [p_i,p_j]=0, \; \; \; \; [x_i,p_j]=i \hbar \delta _{ij}[/tex]

So I gather that two components of position are "compatible",
two components of momentum are "compatible",
and a component of position and the component of momentum in the same direction are "incompatible."

The momentum and position here are the observables.

But is there any way to make the commutation operation conceptually significant? Because I cannot come up with any intuitive way to physically conceptualize the commutator.
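One concrete way to see what the commutator measures is to multiply two operators in both orders and subtract. A minimal numerical sketch (assuming numpy; ħ is set to 1, and the spin matrices are the standard spin-1/2 ones that come up later in the thread):

[code]
import numpy as np

hbar = 1.0  # units where hbar = 1
Sx = (hbar / 2) * np.array([[0, 1], [1, 0]], dtype=complex)
Sy = (hbar / 2) * np.array([[0, -1j], [1j, 0]], dtype=complex)
Sz = (hbar / 2) * np.array([[1, 0], [0, -1]], dtype=complex)

def comm(A, B):
    # the commutator [A, B] = AB - BA: zero iff the order of the operations is irrelevant
    return A @ B - B @ A

print(comm(Sx, Sy))                               # nonzero, so Sx and Sy are incompatible
print(np.allclose(comm(Sx, Sy), 1j * hbar * Sz))  # True: [Sx, Sy] = i*hbar*Sz
[/code]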
 
  • #3
The state-vector can be visualized as an arrow in Hilbert space. There are some subtleties with this picture because the normalization and phase, for example, will not affect the physical observables, so people sometimes say that the states are rays (equivalence classes of vectors which differ from each other only by normalization or phase) in the Hilbert space, but this is a little bit technical.

Observables are represented in the quantum theory by Hermitian operators (again, there are some subtleties here involving Hermitian versus self-adjoint). The result of making one particular measurement of an observable on a particular system will return one of the eigenvalues of the observable. The average (or expectation value) of many measurements of an observable made on many identically prepared states is obtained by "sandwiching" the operator inside the bra and ket state, e.g. [itex]\langle H\rangle=\langle\psi|H|\psi\rangle[/itex].
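A minimal sketch of both statements (assuming numpy; the state here is an arbitrary normalized example, not one from the discussion):

[code]
import numpy as np

Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)  # hbar = 1

t = 0.3
psi = np.array([np.cos(t), np.sin(t)], dtype=complex)  # arbitrary normalized state

print(np.linalg.eigvalsh(Sz))       # the possible single-measurement results: -0.5, +0.5
print(np.vdot(psi, Sz @ psi).real)  # <psi|Sz|psi>, the average over many identical runs
[/code]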

In normal QM, the masses of the particles are usually given.

Incompatible observables means that the operators that represent them do not commute. This also implies that one cannot simultaneously diagonalize both observables with the same similarity transform. This leads to the generalized Heisenberg uncertainty principle.

An equal sign with a dot on top is defined in different ways in different contexts. I'm not sure what its "standard" definition is. The one I am familiar with is that an equal sign with a dot on it means "this is true in some particular coordinate system, but not necessarily true in all coordinate systems". But I'm taking this from GR, so I'm guessing the one you see means something else.
 
  • #4
Matterwave said:
The state-vector can be visualized as an arrow in Hilbert space. There are some subtleties with this picture because the normalization and phase, for example, will not affect the physical observables, so people sometimes say that the states are rays (equivalence classes of vectors which differ from each other only by normalization or phase) in the Hilbert space, but this is a little bit technical.

Observables are represented in the quantum theory by Hermitian operators (again, there are some subtleties here involving Hermitian versus self-adjoint). The result of making one particular measurement of an observable on a particular system will return one of the eigenvalues of the observable. The average (or expectation value) of many measurements of an observable made on many identically prepared states is obtained by "sandwiching" the operator inside the bra and ket state, e.g. [itex]\langle H\rangle=\langle\psi|H|\psi\rangle[/itex].

In normal QM, the masses of the particles are usually given.

Incompatible observables means that the operators that represent them do not commute. This also implies that one cannot simultaneously diagonalize both observables with the same similarity transform. This leads to the generalized Heisenberg uncertainty principle.

An equal sign with a dot on top is defined in different ways in different contexts. I'm not sure what its "standard" definition is. The one I am familiar with is that an equal sign with a dot on it means "this is true in some particular coordinate system, but not necessarily true in all coordinate systems". But I'm taking this from GR, so I'm guessing the one you see means something else.

It's a quantum mechanics book, but I suspect it means the same thing here. Here is an equation I see the \doteq used in:
[tex]S_z \doteq \frac{\hbar}{2}\begin{pmatrix} 1 &0 \\ 0 & -1 \end{pmatrix}[/tex]

I guess this would be true assuming you are using some particular coordinate system, but which coordinate system? Space-and-time in the z direction? Space-and-space in the x-and-y direction?

I'm also a bit confused: how is it that a 2x2 matrix operates on an infinite-dimensional vector space?

I've bold-faced the words in your post which are in-part unfamiliar to me. I spent some time this morning trying to review some of the terms that you used.

  • Hilbert Space
    • Infinite Dimensional Complete Inner-Product Space
      (You-tube-video)
    • A vector space that "has" an inner-product is an inner-product space.
    • What does it mean that a vector space "has" an inner-product space? Does it mean that [itex]\int_{-\infty}^{\infty}f(x)g(x)dx[/itex]
      and [itex]\vec \phi \cdot \vec \varphi[/itex] converge for all values of [itex]f, g, \vec \phi, \vec \varphi[/itex]?
  • Normalization
    • It seems in the past, I remember Normalization being a process where you figured out the constant in front of a function so that the integral, [itex]\int_{-\infty}^{\infty}f(x)f^*(x)dx =1[/itex]
      In general, the 1 represented a total of, for instance, 1 particle; i.e. the total probability of finding a particle in a space where you know the particle is, is 1.
      Wikipedia Article
    • The wikipedia article says "because of the normalization condition, wave functions form a projective space rather than an ordinary vector space."
    • And you said, "the normalization and phase, for example, will not affect the physical observables".
    • I think I need some kind of link to put together these seemingly disparate ideas.
  • phase
    • If I am not mistaken, phase represents the angle associated with a complex number in a phase diagram.
    • I have a habit of thinking of a wave function such as [itex]e^{i \omega t}[/itex] as having a real part and an imaginary part, and having more likelihood of being observed when the real part is at a maximum. But if phase has no effect on the physical observable, perhaps this is a wrong idea?
  • Hermitian
  • Self-Adjoint
    • "In quantum mechanics their importance lies in the fact that in the Dirac–von Neumann formulation of quantum mechanics, physical observables such as position, momentum, angular momentum and spin are represented by self-adjoint operators on a Hilbert space" [/PLAIN] [Broken]
      Wikipedia Article
  • eigenvalues
    • The eigenvectors of a square matrix are the non-zero vectors that, after being multiplied by the matrix, remain parallel to the original vector. For each eigenvector, the corresponding eigenvalue is the factor by which the eigenvector is scaled when multiplied by the matrix.
      Wikipedia Article
  • bra-ket notation
    • Wikipedia Article
    • It seems to have little to do with under-armor. :smile:
  • diagonalize
    • In linear algebra, a square matrix A is called diagonalizable if it is similar to a diagonal matrix, i.e., if there exists an invertible matrix P such that [itex]P^{-1}AP[/itex] is a diagonal matrix.
      Wikipedia Article
  • similarity transform

For now, I'm just familiarizing myself with the terms used; Everything is a bit of a jumble in my head. So I will come back to this soon. Thank you.
 
  • #5
JDoolin said:
It's a quantum mechanics book, but I suspect it means the same thing here. Here is an equation I see the \doteq used in:
[tex]S_z \doteq \frac{\hbar}{2}\begin{pmatrix} 1 &0 \\ 0 & -1 \end{pmatrix}[/tex]

In this context, I think they mean "defined as". However, the expression written is only true in the "z basis", where we say a spin aligned along +z is "spin up", and a spin aligned along -z is "spin down".

In principle, one could work in some other basis, choosing, for example, +x to mean "spin up" and -x to mean "spin down". Then the Sx operator would have the form written above, and the Sz operator would look different.

I'm also a bit confused: how is it that a 2x2 matrix operates on an infinite-dimensional vector space?

The Hilbert space of a spin-1/2 wavefunction is

[tex]\mathcal{H} = \mathbb{C}^2 \otimes L^2(\mathbb{R}^3),[/tex]
where the first factor is finite-dimensional and refers to the two spinor components. The Sz operator acts on the first factor via the matrix above, and acts as the identity on the second factor.
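One way to make this concrete is to truncate the infinite-dimensional factor to a small toy dimension and build the product operator explicitly. A sketch (assuming numpy; the 3-dimensional factor is a stand-in for L²(ℝ³), not anything physical):

[code]
import numpy as np

Sz = 0.5 * np.array([[1, 0], [0, -1]], dtype=complex)  # acts on the C^2 spin factor
I_space = np.eye(3)                                    # toy 3-dim stand-in for L^2(R^3)

# Sz on the spin factor, identity on the spatial factor
Sz_full = np.kron(Sz, I_space)

print(Sz_full.shape)                # (6, 6): the 2x2 matrix now acts on the full space
print(np.linalg.eigvalsh(Sz_full))  # +-0.5, each 3-fold degenerate
[/code]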

Hilbert Space
  • Infinite Dimensional Complete Inner-Product Space
    (You-tube-video)
  • A vector space that "has" an inner-product is an inner-product space.
  • What does it mean that a vector space "has" an inner-product space? Does it mean that [itex]\int_{-\infty}^{\infty}f(x)g(x)dx[/itex]
    and [itex]\vec \phi \cdot \vec \varphi[/itex] converge for all values of [itex]f, g, \vec \phi, \vec \varphi[/itex]?

A Hilbert space doesn't have to be infinite-dimensional. There are plenty of finite-dimensional Hilbert spaces used in quantum mechanics. All we mean is that there is an inner product (NOTE: If the Hilbert space is complex, the mathematically correct term is "Hermitian product", not "inner product", but physicists usually do not make the distinction).

When a vector space "has" an inner product, all we mean is that we have defined some inner product for it. The axioms of a vector space do not require it to have an inner product; it is an extra piece of structure we put on top of things.

There are actually three layers of structure:

1. A plain vector space merely obeys the laws of vector algebra (addition, and multiplication by scalars).

2. A Banach space is a vector space where we define a norm (and which is complete with respect to it). In a Banach space, there is the notion of the "length" of a vector. The norm does not have to come from an inner product.

3. A Hilbert space is a vector space where we define an inner product (again, complete). Every Hilbert space is a Banach space, because an inner product induces a norm by plugging the same vector into both entries. In a Hilbert space, there is the notion of the "angle" between two vectors; that is essentially what an inner product tells us.

In quantum mechanics, we need the Hilbert space structure because the "angle" between two vectors has a physical meaning: it relates to the probability that a system in state A will be observed in state B.
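For concreteness, the two standard formulas behind these remarks (the induced norm, and the probability interpretation of the "angle") are:

[tex]\left\|v\right\| = \sqrt{\langle v,v\rangle}, \qquad P(A\to B) = \frac{\left|\langle B|A\rangle\right|^2}{\langle A|A\rangle \langle B|B\rangle}.[/tex]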

For now, I'm just familiarizing myself with the terms used; Everything is a bit of a jumble in my head. So I will come back to this soon. Thank you.

How strong is your linear algebra? Quantum mechanics relies heavily on it, and a good sense of linear algebra will make the bra-ket notation much more transparent.
 
  • #6
Ben Niehoff said:
How strong is your linear algebra? Quantum mechanics relies heavily on it, and a good sense of linear algebra will make the bra-ket notation much more transparent.

I know several concepts from linear algebra, but I don't know much in the way of helpful standard conventions. Here, I'll give an example problem from J.J. Sakurai, Quantum Mechanics, that gives me trouble:

Suppose a 2x2 matrix X (not necessarily Hermitian, nor unitary) is written as

[tex]X = a_0 + \sigma \cdot a[/tex]

where a0 and a1,2,3 are numbers.

a. How are a0 and ak (k=1,2,3) related to tr(X) and tr(σ_k X)?
b. Obtain a0 and ak in terms of the matrix elements Xij
 
  • #7
JDoolin said:
Suppose a 2x2 matrix X (not necessarily Hermitian, nor unitary) is written as

[tex]X = a_0 + \sigma \cdot a[/tex]

where a0 and a1,2,3 are numbers.

a. How are a0 and ak (k=1,2,3) related to tr(X) and tr(σk X)?
b. Obtain a0 and ak in terms of the matrix elements Xij

I can find no reference to any σ in the chapter, so I assume it must be an arbitrary 2 dimensional vector
Other assumptions
(2) Assume that k={1,2,3} is a misprint, and it should be k={1,2}
(3) Assume σ.a represents an outer-product
(4) Assume by "adding a constant to a matrix" it means "add the same constant to every element in the matrix."
(5) Assume σk X represents two different matrices. One for each k={1,2}.

With these assumptions, I get

[tex]X=\begin{pmatrix} \sigma_1 a_1+a_0 &\sigma_1 a_2+a_0 \\ \sigma_2 a_1+a_0 & \sigma_2 a_2+a_0 \end{pmatrix}[/tex]

[tex]tr[X]=\sigma_1 a_1+\sigma_2 a_2+2 a_0[/tex]

(So my answer to part a would be:
[tex]tr[\sigma_1 X]=\sigma_1(\sigma_1 a_1+\sigma_2 a_2+2 a_0)[/tex]

[tex]tr[\sigma_2 X]=\sigma_2(\sigma_1 a_1+\sigma_2 a_2+2 a_0)[/tex]

but I can't see anything relevant or interesting about the answer. Essentially, I finished the problem but didn't learn anything.)

As for part (b) I can get, for instance, [tex]a_2=\frac{X_{21}-X_{22}}{\sigma_1 - \sigma_2}[/tex]

but I can't get any value just in terms of the matrix elements of X. I have to have the terms of σ, as well.

I also cannot find any immediately obvious method for finding a0. Perhaps this has something to do with whatever I failed to learn from part (a)? :smile:
 
  • #8
Ben Niehoff said:
1. A plain vector space merely obeys the laws of vector algebra (addition, and multiplication by scalars).

2. A Banach space is a vector space where we define a norm. In a Banach space, there is the notion of the "length" of a vector. The norm does not have to come from an inner product.

3. A Hilbert space is a vector space where we define an inner product. Every Hilbert space is a Banach space, because an inner product induces a norm by plugging the same vector into both entries. In a Hilbert space, there is the notion of the "angle" between two vectors; that is essentially what an inner product tells us.

In quantum mechanics, we need the Hilbert space structure because the "angle" between two vectors has a physical meaning: it relates to the probability that a system in state A will be observed in state B.

This may be worth opening up another thread, but is space-time generally considered to be a Banach (norm) space, or a Hilbert (inner-product) Space?
 
  • #9
JDoolin said:
This may be worth opening up another thread, but is space-time generally considered to be a Banach (norm) space, or a Hilbert (inner-product) Space?
It's neither. Since it doesn't have a norm, it's not a normed space, and therefore not a Banach space (=complete normed space). Since it doesn't have an inner product, it's not an inner product space, and therefore not a Hilbert space (complete inner product space).

If you disagree, look at the definition of "inner product" again. :smile:

JDoolin said:
I can find no reference to any σ in the chapter, so I assume it must be an arbitrary 2 dimensional vector
They are almost certainly the Pauli spin matrices.

JDoolin said:
(4) Assume by "adding a constant to a matrix" it means "add the same constant to every element in the matrix."
No, if x is a number and A is a square matrix, x+A is defined as xI+A, where I is the identity matrix.
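Incidentally, this is exactly where numerical software can mislead: numpy, like Mathematica, adds a scalar to every entry, which is not the xI+A meant by the QM shorthand. A small sketch of the difference (assuming numpy):

[code]
import numpy as np

A = np.array([[1, 2], [3, 4]])
x = 10

print(x + A)              # elementwise: 10 is added to all four entries
print(x * np.eye(2) + A)  # the QM convention: xI + A, shifting only the diagonal
[/code]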
 
  • #10
JDoolin said:
It's a quantum mechanics book, but I suspect it means the same thing here. Here is an equation I see the \doteq used in:
[tex]S_z \doteq \frac{\hbar}{2}\begin{pmatrix} 1 &0 \\ 0 & -1 \end{pmatrix}[/tex]
I would guess that it means that what you have on the left is the operator, and what you have on the right is a matrix that corresponds to the operator. They aren't actually equal, but they represent the same thing.

I find myself saying this to a lot of people:
Fredrik said:
You need to study the relationship between linear operators and matrices. It's explained e.g. in post #3 in this thread. (Ignore the quote and the stuff below it).
They all seem to ignore it, even though it's really easy and absolutely essential to understanding those matrices you have to deal with when you study spin-1/2 particles. Probably the single most important detail in all of linear algebra.

JDoolin said:
Hilbert Space
Infinite Dimensional Complete Inner-Product Space
Hilbert spaces don't have to be infinite-dimensional.

JDoolin said:
A vector space that "has" an inner-product is an inner-product space.
What does it mean that a vector space "has" an inner-product space?
It can't have an inner product space. It can have an inner product. To say that a vector space V has an inner product is to say that there's an inner product defined on V. The inner product space is actually the pair (V,inner product), just like the vector space isn't just a set, it's a triple (set,addition operation,scalar multiplication operation).

JDoolin said:
Normalization
It seems in the past, I remember Normalization being a process where you figured out the constant in front of a function so that the integral,
If x is a vector with norm ##\|x\|##, then ##\frac{x}{\|x\|}## is a vector with norm 1. ##1/\|x\|## is called a normalization constant.

JDoolin said:
The wikipedia article says "because of the normalization condition, wave functions form a projective space rather than an ordinary vector space."
The idea is that if two vectors x and y are considered equivalent if there's a complex c such that x=cy, then the set of equivalence classes can be mapped bijectively onto the set of straight lines through the origin. No need to worry about this.

JDoolin said:
...having more likelihood of being observed when the real part is at a maximum. But if phase has no effect on the physical observable, perhaps this is a wrong idea?
Yes, this is wrong.

JDoolin said:
bra-ket notation
See https://www.physicsforums.com/showthread.php?p=2230044#post2230044.

JDoolin said:
For now, I'm just familiarizing myself with the terms used; Everything is a bit of a jumble in my head. So I will come back to this soon. Thank you.
There is obviously a huge gap in your knowledge that can only be filled by studying a book on linear algebra. I like Axler.
 
  • #11
Fredrik said:
It's neither. Since it doesn't have a norm, it's not a normed space, and therefore not a Banach space (=complete normed space). Since it doesn't have an inner product, it's not an inner product space, and therefore not a Hilbert space (complete inner product space).

If you disagree, look at the definition of "inner product" again. :smile:

As luck would have it, I was able to download chapter 6 of the book you referenced, and it defines an "inner product" on V as a function that takes each ordered pair (u,v) of elements of V to a number <u,v> in F, with the following properties:

positivity, definiteness, additivity in the first slot, homogeneity in the first slot, and conjugate symmetry.

The operator I was thinking of for the space-time norm is
√(Δt^2 - Δx^2 - Δy^2 -Δz^2)

I suppose it does not meet the positivity requirement because it returns complex values.

It does not meet definiteness, because a speed-of-light interval will produce a zero norm.

So this invariant
[tex]\sqrt{\Delta t^2 - \Delta x^2 -\Delta y^2 -\Delta z^2}[/tex]

is not a norm, but a ____________________________.

Fredrik said:
They are almost certainly the Pauli spin matrices.

I wonder why they would use Pauli spin matrices in the problem set in chapter 1, when they don't officially introduce them until chapter 3. However, that would definitely make the problem more possible.

Fredrik said:
No, if x is a number and A is a square matrix, x+A is defined as xI+A, where I is the identity matrix.

Is that standard convention in Quantum Mechanics? I was guessing based on what Mathematica did (but admittedly, I was surprised that adding a constant to a matrix gave any value at all! I fully expected a domain error.)
 
  • #12
JDoolin said:
I know several concepts from linear algebra, but I don't know much in the way of helpful standard conventions. Here, I'll give an example problem from J.J. Sakurai, Quantum Mechanics, that gives me trouble:

Suppose a 2x2 matrix X (not necessarily Hermitian, nor unitary) is written as

[tex]X = a_0 + \sigma \cdot a[/tex]

where a0 and a1,2,3 are numbers.

a. How are a0 and ak (k=1,2,3) related to tr(X) and tr(σ_k X)?
b. Obtain a0 and ak in terms of the matrix elements Xij

The [itex]\sigma^i[/itex] are the Pauli spin matrices. And whenever you see a number being added to a matrix, you should think of an implicit identity matrix next to the number. By the way, in addition to tr(X) and tr(σ_k X), you should also calculate det(X). The result is something neat.

JDoolin said:
So this invariant
[tex]\sqrt{\Delta t^2 - \Delta x^2 -\Delta y^2 -\Delta z^2}[/tex]

is not a norm, but a ____________________________.

Remove the square root sign, and what you have is called a "quadratic form".

JDoolin said:
Is that standard convention in Quantum Mechanics? I was guessing based on what Mathematica did (but admittedly, I was surprised that adding a constant to a matrix gave any value at all! I fully expected a domain error.)

Yes, it would get very annoying having to write in all the identity operators. Any time a number is added to an operator, we assume the number is actually multiplying the identity operator.
 
  • #13
The thing that best corresponds to an inner product on Minkowski spacetime is the map ##g:\mathbb R^4\times\mathbb R^4\to\mathbb R## defined by
$$g(x,y)=x^T\eta y$$ for all ##x,y\in\mathbb R^4##. The y in that formula should be written as a 4×1 matrix. xT is the transpose of x, so it's 1×4. ##\eta## is defined by
$$\eta=\begin{pmatrix}-1 & 0 & 0 & 0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\end{pmatrix}.$$ Such a map is called a bilinear form. "Form" because it takes several vectors to a scalar. "bilinear" because it's linear in both variables. An inner product on a real vector space is a bilinear form that satisfies a few additional requirements. This g satisfies all of those except ##g(x,x)\geq 0## for all x.

An inner product on a complex vector space is not a bilinear form, because it's linear in one of the variables and conjugate linear in the other. So it's called a sesquilinear form instead. Math books always take it to be linear in the first variable. Physics books always take it to be linear in the second variable. I like the physicists' convention much better. It's much more compatible with bra-ket notation, for starters.

g is often called the metric tensor (even by me), but I'm not so sure I like that term when it's defined on spacetime itself, instead of its tangent spaces.
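A minimal numerical sketch of this g (assuming numpy; the three sample vectors are arbitrary, chosen to be timelike, spacelike, and lightlike):

[code]
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def g(x, y):
    # the Minkowski bilinear form g(x, y) = x^T eta y
    return x @ eta @ y

timelike  = np.array([2.0, 1.0, 0.0, 0.0])
spacelike = np.array([1.0, 2.0, 0.0, 0.0])
lightlike = np.array([1.0, 1.0, 0.0, 0.0])

for v in (timelike, spacelike, lightlike):
    print(g(v, v))   # -3.0, 3.0, 0.0 -- so g(x,x) >= 0 indeed fails
[/code]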
 
  • #14
JDoolin said:
I know several concepts from linear algebra, but I don't know much in the way of helpful standard conventions. Here, I'll give an example problem from J.J. Sakurai, Quantum Mechanics, that gives me trouble:

Suppose a 2x2 matrix X (not necessarily Hermitian, nor unitary) is written as

[tex]X = a_0 + \sigma \cdot a[/tex]

where a0 and a1,2,3 are numbers.

a. How are a0 and ak (k=1,2,3) related to tr(X) and tr(σ_k X)?
b. Obtain a0 and ak in terms of the matrix elements Xij

JDoolin said:
I can find no reference to any σ in the chapter, so I assume it must be an arbitrary 2 dimensional vector
Other assumptions
(2) Assume that k={1,2,3} is a misprint, and it should be k={1,2}
(3) Assume σ.a represents an outer-product
(4) Assume by "adding a constant to a matrix" it means "add the same constant to every element in the matrix."
(5) Assume σk X represents two different matrices. One for each k={1,2}.

Assumption FAIL! :smile:

New assumptions:
(1) σ1, σ2, σ3 are all the Pauli Spin Matrices:
[tex]\left \{ \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} , \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix} , \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \right \}[/tex]
(2) The single equation [itex]X = a_0 + \sigma \cdot a[/itex] represents not one, but three equations.


[tex]X = (a_0 \times I) + \sigma_k \cdot (a_k \times I)[/tex]
where [itex]k \in \{1,2,3\}[/itex] and I represents the 2x2 identity matrix.

[tex]\begin{align*} tr(X_k)&=2a_0 \\ tr(\sigma_k \cdot X_k) &= 2 a_k \\ det(X_1) &= a_0^2-a_1^2\\ det(X_2) &=a_0^2+a_2^2 \\ det(X_3) &=a_3^2-a_0^2 \end{align*}[/tex]
 
  • #15
JDoolin said:
(2) The single equation [itex]X = a_0 + \sigma \cdot a[/itex] represents not one, but three equations.

Not quite. X is a 2x2 matrix. The second term is shorthand for [itex]\sigma^k a_k[/itex]. We often think of [itex]\vec \sigma[/itex] as a matrix-valued vector.

Also, I'm beginning to think you haven't tried looking these things up before coming here to ask about it.
 
  • #16
Ben Niehoff said:
The [itex]\sigma^i[/itex] are the Pauli spin matrices. And whenever you see a number being added to a matrix, you should think of an implicit identity matrix next to the number. By the way, in addition to tr(X) and tr(σ_k X), you should also calculate det(X). The result is something neat.
Remove the square root sign, and what you have is called a "quadratic form".
Yes, it would get very annoying having to write in all the identity operators. Any time a number is added to an operator, we assume the number is actually multiplying the identity operator.

I think the goal of a textbook writer should be to get the knowledge into the heads of the readers as effectively as possible. In that case, it's their responsibility to explain notational conventions. (Point to be made, and I had not really noticed this before: Sakurai died in 1982; the book I'm using was first printed in 1994. All due respect to those who put together this work. "The editor apologizes in advance if, as is all too probable, there are glaring omissions.")

Fredrik said:
The thing that best corresponds to an inner product on Minkowski spacetime is the map ##g:\mathbb R^4\times\mathbb R^4\to\mathbb R## defined by
$$g(x,y)=x^T\eta y$$ for all ##x,y\in\mathbb R^4##. The y in that formula should be written as a 4×1 matrix. xT is the transpose of x, so it's 1×4. ##\eta## is defined by
$$\eta=\begin{pmatrix}-1 & 0 & 0 & 0\\ 0&1&0&0\\ 0&0&1&0\\ 0&0&0&1\end{pmatrix}.$$ Such a map is called a bilinear form. "Form" because it takes several vectors to a scalar. "bilinear" because it's linear in both variables. An inner product on a real vector space is a bilinear form that satisfies a few additional requirements. This g satisfies all of those except ##g(x,x)\geq 0## for all x.

An inner product on a complex vector space is not a bilinear form, because it's linear in one of the variables and conjugate linear in the other. So it's called a sesquilinear form instead. Math books always take it to be linear in the first variable. Physics books always take to be linear in the second variable. I like the physicists' convention much better. It's much more compatible with bra-ket notation for starters.

g is often called the metric tensor (even by me), but I'm not so sure I like that term when it's defined on spacetime itself, instead of its tangent spaces.

Are you essentially in agreement about these ideas, or arguing? :smile:

Fredrik is talking about this thing, and calling it a bilinear form:

[tex]g(\vec v_1,\vec v_2)=\begin{pmatrix} \Delta t_1 &\Delta x_1 &\Delta y_1 &\Delta z_1 \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 &0 &1 &0 \\ 0 &0 &0 &1 \end{pmatrix} \begin{pmatrix} \Delta t_2 \\\Delta x_2 \\\Delta y_2 \\\Delta z_2 \end{pmatrix}[/tex]

While Ben Niehoff is talking about this thing, and calling it a quadratic form.
[tex]\left \| \vec v \right \|^2=\begin{pmatrix} \Delta t &\Delta x &\Delta y &\Delta z \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 & 0\\ 0 & 1 & 0 & 0\\ 0 &0 &1 &0 \\ 0 &0 &0 &1 \end{pmatrix} \begin{pmatrix} \Delta t \\\Delta x \\\Delta y \\\Delta z \end{pmatrix}[/tex]
 
  • #17
Ben Niehoff said:
Not quite. X is a 2x2 matrix. The second term is shorthand for [itex]\sigma^k a_k[/itex]. We often think of [itex]\vec \sigma[/itex] as a matrix-valued vector.

So you disagree with the answers in the "Spoiler" section of my post #14?
 
  • #18
Ben Niehoff said:
Also, I'm beginning to think you haven't tried looking these things up before coming here to ask about it.

You cannot "look up" notational conventions. There's nothing to look up. Either the author explains their conventions, or they don't.

Also why do you have this attitude toward me, that if I should ask questions about quantum mechanics, I must not have tried to figure it out on my own? Isn't quantum mechanics "hard?"
 
  • #19
Ben Niehoff said:
Not quite. X is a 2x2 matrix. The second term is shorthand for [itex]\sigma^k a_k[/itex]. We often think of [itex]\vec \sigma[/itex] as a matrix-valued vector.

I just want to clarify what you are saying here. If σ is a matrix-valued vector, it would look something like this?[tex]\vec \sigma \equiv \begin{pmatrix} \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} \\ \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix} \\ \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \end{pmatrix}[/tex]
 
  • #20
Yes, although in practice one doesn't usually write it out like that.
 
  • #21
Ben Niehoff said:
Yes, although in practice one doesn't usually write it out like that.

Ah, but how much time and frustration could be saved by would-be students of quantum mechanics if the practice were changed? :smile:

[tex]\begin{align*} X &=a_0 + \sigma \cdot a \\ &= a_0 + \sigma^k a_k\\ &= \begin{pmatrix} a_0 & 0\\ 0 & a_0 \end{pmatrix} + \begin{pmatrix} \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} & \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix} & \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \end{pmatrix} \begin{pmatrix} a_1\\ a_2\\ a_3 \end{pmatrix} \\ &= \begin{pmatrix} a_0+a_3 & a_2-i a_2 \\ a_1 + i a_2 & a_0-a_3 \end{pmatrix} \end{align*}[/tex]

Do I have it right, now?
 
  • #22
JDoolin said:
[tex]\begin{align*} X &=a_0 + \sigma \cdot a \\ &= a_0 + \sigma^k a_k\\ &= \begin{pmatrix} a_0 & 0\\ 0 & a_0 \end{pmatrix} + \begin{pmatrix} \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} & \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix} & \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \end{pmatrix} \begin{pmatrix} a_1\\ a_2\\ a_3 \end{pmatrix} \\ &= \begin{pmatrix} a_0+a_3 & a_2-i a_2 \\ a_1 + i a_2 & a_0-a_3 \end{pmatrix} \end{align*}[/tex]

Do I have it right, now?

Yes.
 
  • #23
Ben Niehoff said:
Yes.

Awesome! Thanks.
 
  • #24
JDoolin said:
Are you essentially in agreement about these ideas, or arguing? :smile:

Fredrik is talking about this thing, and calling it bilinear form:
...
While Ben Niehoff is talking about this thing, and calling it a quadratic form.
I'm not familiar with the term "quadratic form", but I looked it up, and it seems to be correct to use it here. If my quick look at Wikipedia didn't give me the wrong idea, the function that takes the 8 components of u and v to ##u^T\eta v## is a quadratic form. I don't know if it would also be correct to use that term for the (bilinear) map that takes (u,v) to ##u^T\eta v##.

JDoolin said:
New assumptions:
...
(2) The single equation [itex]X = a_0 + \sigma \cdot a[/itex] represents not one, but three equations.
JDoolin said:
Ah, but how much time and frustration could be saved by would-be students of quantum mechanics if the practice were changed? :smile:
In this case, I think it was reasonable to expect the students to see the dot, and recall the definition of the dot product. But if I had been in his shoes, I would have explained it, just in case.

By the way, there's an interesting way of looking at this problem. Every traceless complex self-adjoint 2×2 matrix can be written as
$$\begin{pmatrix}x_3 & x_1-ix_2\\ x_1+ix_2 & -x_3\end{pmatrix}=\sum_{k=1}^3 x_k\sigma_k.$$ Since the Pauli matrices are linearly independent, this means that they are a basis for the vector space of all such matrices. This is a 3-dimensional real vector space, not complex, because if x is a self-adjoint matrix, then ix is not. The vector space of all (not just the traceless) complex self-adjoint 2×2 matrices is 4-dimensional, so it needs one more basis vector, the identity matrix.
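The linear-independence claim is easy to check directly: flatten the four matrices into rows and compute the rank (a sketch assuming numpy):

[code]
import numpy as np

I  = np.eye(2, dtype=complex)
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

# Stack the flattened matrices as rows; full rank means they are linearly independent.
M = np.array([m.reshape(-1) for m in (I, s1, s2, s3)])
print(np.linalg.matrix_rank(M))   # 4, so {I, s1, s2, s3} is a basis as described above
[/code]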
 
  • #25
Fredrik said:
I'm not familiar with the term "quadratic form", but I looked it up, and it seems to be correct to use it here. If my quick look at Wikipedia didn't give me the wrong idea, the function that takes the 8 components of u and v to ##u^T\eta v## is a quadratic form. I don't know if it would also be correct to use that term for the (bilinear) map that takes (u,v) to ##u^T\eta v##.

As an object that takes two different inputs, it's a bilinear form B(u,v). To get a quadratic form, you just repeat the same input:

Q(u) = B(u,u)

Any quadratic form gives rise to a bilinear form via polarization:

[tex]B(u,v) \equiv \frac12 \Big( Q(u + v) - Q(u) - Q(v) \Big)[/tex]
So it is ok to use the terms more or less interchangeably. I originally said "quadratic form", because JDoolin gave us the spacetime interval, which is a quadratic form acting on the vector [itex](\Delta t, \Delta x, \Delta y, \Delta z)[/itex].
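A quick numerical check of the polarization identity with this Minkowski quadratic form (a sketch assuming numpy; the vectors are random):

[code]
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])
Q = lambda u: u @ eta @ u        # the quadratic form Q(u) = B(u, u)
B = lambda u, v: u @ eta @ v     # the bilinear form

rng = np.random.default_rng(0)
u, v = rng.normal(size=4), rng.normal(size=4)

# polarization: B(u, v) = (Q(u + v) - Q(u) - Q(v)) / 2
print(np.isclose(B(u, v), 0.5 * (Q(u + v) - Q(u) - Q(v))))   # True
[/code]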
 
  • #26
Fredrik said:
In this case, I think it was reasonable to expect the students to see the dot, and recall the definition of the dot product. But if I had been in his shoes, I would have explained it, just in case.

You seem to be under the impression that recalling the definition of the dot-product will help you solve this problem. I strongly disagree. I was perfectly aware of the definition of the dot-product, but that did not help me in any way.

You are given:

[tex]X=a_0 + \sigma \cdot a[/tex]

You are told that a0 is a number

You are told that a is a three-by-one vector

You are told that X is a 2x2 matrix

You are not told anything at all about the nature of σ.

The key to this problem is knowing the definition of σ. There's no way to do it if you don't know.
 
  • #27
Well, now that you do understand the statement of the problem, what do you learn from it?
 
  • #28
Ben Niehoff said:
Well, now that you do understand the statement of the problem, what do you learn from it?

:smile: Well, although I don't know if this gives any particular insight, I did come up with these results for part (a) (and your additional question about the determinant).
[tex]\begin{align*} Tr[X] &= 2 a_0 \\ Tr[\sigma_1 X] &= 2 a_1 \\ Tr[\sigma_2 X] &= 2 a_2 \\ Tr[\sigma_3 X] &= 2 a_3 \\ Det[X] &= a_0^2-a_1^2-a_2^2-a_3^2 \end{align*}[/tex]
(Edit: Determinant modified since Fredrik's correction.)
At this point, I just see a really neat mathematical pattern, but I don't yet have any particular insight as to how that pattern matches anything in physical reality. I hope to gain more insight as I do more problems. (I started #3 in the book, but forgot to finish #2b!)
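These four traces and the determinant are easy to verify numerically (a sketch assuming numpy; the values of a0 and a are arbitrary test numbers):

[code]
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

a0, a = 1.3, np.array([0.2, -0.7, 0.5])               # arbitrary test values
X = a0 * np.eye(2) + sum(ak * sk for ak, sk in zip(a, sig))

print(np.trace(X).real / 2)                           # a0
print([(np.trace(sk @ X) / 2).real for sk in sig])    # a1, a2, a3
print(np.linalg.det(X).real, a0**2 - a @ a)           # both equal a0^2 - a1^2 - a2^2 - a3^2
[/code]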
 
  • #29
JDoolin said:
You seem to be under the impression that recalling the definition of the dot-product will help you solve this problem. I strongly disagree. I was perfectly aware of the definition of the dot-product, but that did not help me in any way.

You are given:

[tex]X=a_0 + \sigma \cdot a[/tex]

You are told that a0 is a number

You are told that a is a three-by-one vector

You are told that X is a 2x2 matrix

You are not told anything at all about the nature of σ.

The key to this problem is knowing the definition of σ. There's no way to do it if you don't know.
My point was that since the dot product for vectors in ##\mathbb R^n## is defined by ##\mathbf{x}\cdot\mathbf{y}=\sum_i\, x_i y_i##, your first guess about what ##\mathbf{\sigma}\cdot\mathbf a## means should (or at least could) have been ##\sum_i\sigma_i a_i##. Of course, if the author hasn't even defined the sigmas, I can see how it would be confusing.

You got the determinant wrong by the way. :smile:
 
  • #30
JDoolin said:
Ah, but how much time and frustration could be saved by would-be students of quantum mechanics if the practice were changed? :smile:


[tex]\begin{align*} X &=a_0 + \sigma \cdot a \\ &= a_0 + \sigma^k a_k\\ &= \begin{pmatrix} a_0 & 0\\ 0 & a_0 \end{pmatrix} + \begin{pmatrix} \begin{pmatrix} 0 & 1\\ 1 & 0 \end{pmatrix} & \begin{pmatrix} 0 & -i\\ i & 0 \end{pmatrix} & \begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix} \end{pmatrix} \begin{pmatrix} a_1\\ a_2\\ a_3 \end{pmatrix} \\ &= \begin{pmatrix} a_0+a_3 & a_2-i a_2 \\ a_1 + i a_2 & a_0-a_3 \end{pmatrix} \end{align*}[/tex]

Do I have it right, now?

Just a teeny mistake in X12

It should have been:

[tex]X = \begin{pmatrix} a_0+a_3 & a_1-i a_2 \\ a_1 + i a_2 & a_0-a_3 \end{pmatrix}[/tex]

Then for part (b):

[tex]a_0=\frac{X_{11}+X_{22}}{2}[/tex]
[tex]a_1=\frac{X_{12}+X_{21}}{2}[/tex]
[tex]a_2=\frac{X_{21}-X_{12}}{2i}[/tex]
[tex]a_3=\frac{X_{11}-X_{22}}{2}[/tex]
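The same setup can be used to check these inversion formulas (a sketch assuming numpy; arbitrary coefficients again):

[code]
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

a0, a = 0.4, np.array([1.0, -2.0, 0.3])
X = a0 * np.eye(2) + sum(ak * sk for ak, sk in zip(a, sig))

print((X[0, 0] + X[1, 1]) / 2)      # a0
print((X[0, 1] + X[1, 0]) / 2)      # a1
print((X[1, 0] - X[0, 1]) / 2j)     # a2
print((X[0, 0] - X[1, 1]) / 2)      # a3
[/code]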
 
  • #31
Fredrik said:
My point was that since the dot product for vectors in ##\mathbb R^n## is defined by ##\mathbf{x}\cdot\mathbf{y}=\sum_i\, x_i y_i##, your first guess about what ##\mathbf{\sigma}\cdot\mathbf a## means should (or at least could) have been ##\sum_i\sigma_i a_i##. Of course, if the author hasn't even defined the sigmas, I can see how it would be confusing.
Ah. Empathy. Thank you.
:)

Fredrik said:
You got the determinant wrong by the way. :smile:

In fact, I'm glad you told me, because I've been scratching my head since yesterday morning wondering why I got that answer. Now I see I just screwed up a couple of minus-signs.
 
  • #32
No sense in using spoiler tags for this stuff, we already know the answers.

JDoolin said:
[tex]\begin{align*} Tr[X] &= 2 a_0 \\ Tr[\sigma_1 X] &= 2 a_1 \\ Tr[\sigma_2 X] &= 2 a_2 \\ Tr[\sigma_3 X] &= 2 a_3 \\ Det[X] &= a_0^2-a_1^2-a_2^2-a_3^2 \end{align*}[/tex]
(Edit: Determinant modified since Fredrik's correction.)

At this point, I just see a really neat mathematical pattern, but I don't yet have any particular insight as to how that pattern matches anything in physical reality. I hope to gain more insight as I do more problems. (I started #3 in the book, but forgot to finish #2b!)

OK, so if I tell you det(X) = 0, what does that tell you about the 4-vector [itex](a_0, \vec a)[/itex]? Similarly for det(X) < 0, and det(X) > 0.

What would you say is the relationship between the space of 2x2 Hermitian matrices, and 3+1-dimensional Minkowski space?
 
  • #33
Ben Niehoff said:
No sense in using spoiler tags for this stuff, we already know the answers.



OK, so if I tell you det(X) = 0, what does that tell you about the 4-vector [itex](a_0, \vec a)[/itex]? Similarly for det(X) < 0, and det(X) > 0.

If det[X]=0 then
[tex]a_0^2=a_1^2+a_2^2+a_3^2[/tex]

and

If det[X]<0 then
[tex]a_0^2<a_1^2+a_2^2+a_3^2[/tex]

If det[X]>0 then
[tex]a_0^2>a_1^2+a_2^2+a_3^2[/tex]

[tex](a_0, \vec a) = \left ( \pm\sqrt{a_1^2+a_2^2+a_3^2},\vec a \right )[/tex]


Ben Niehoff said:
What would you say is the relationship between the space of 2x2 Hermitian matrices, and 3+1-dimensional Minkowski space?

Have we generated the complete space of 2x2 Hermitian matrices by defining X? I suppose we have!

A 2x2 Hermitian matrix, in general form, would be written,

[tex]\begin{pmatrix} x & u+vi\\ u-vi & y \end{pmatrix}[/tex]

Now the determinant of this thing is:


[tex]xy - u^2 - v^2[/tex]

However, with insight provided by the preceding problem,

Let [tex]\begin{align*} a_0 &=\frac{x+y}{2} \\ a_1 &= u \\ a_2 &= v\\ a_3 &= \frac{x-y}{2} \end{align*}[/tex]

We can further break down the determinant into:

[tex]Det\begin{pmatrix} x & u+iv \\ u-iv & y \end{pmatrix}= \left ( \frac{x+y}{2} \right )^2-\left ( \frac{x-y}{2} \right )^2-u^2-v^2[/tex]

Now, what would I say is the relationship between the space of 2x2 Hermitian matrices and 3+1-dimensional Minkowski space? It would be a rather complicated statement to put into words, involving the midpoint and half-the-difference of the real diagonal terms of the Hermitian matrix, and the real and imaginary parts of the complex off-diagonal terms. Once I specified those four terms, if I were to give those four variables names like cΔt, Δx, Δy, Δz, then you would see that the determinant turned out to be identical to the squared space-time interval in Minkowski space-time.

Hmmm, so could we define the variables at the beginning in such a way that this mathematical identity actually means something more than a superficial similarity?
 
  • #34
You're on the right track. The set of complex self-adjoint (=hermitian) 2×2 matrices is a 4-dimensional vector space over ℝ, so it's isomorphic to the vector space ℝ⁴. ℝ⁴ is of course also the underlying set of Minkowski spacetime, so any map that takes complex self-adjoint 2×2 matrices to complex self-adjoint 2×2 matrices can be used to define a map from ℝ⁴ into ℝ⁴. The maps of the form
$$X\mapsto AXA^\dagger$$ where A is a complex 2×2 matrix with determinant 1 (i.e. A is a member of SL(2,ℂ)) are especially interesting, because they are linear and preserve determinants, i.e. ##\det(AXA^\dagger)=\det X##. This means that they correspond to Lorentz transformations. Note that if you replace A by -A, you get the same map. So there are two members of SL(2,ℂ) for each Lorentz transformation.

This relationship between the Lorentz group SO(3,1) and SL(2,ℂ) is the main part of the reason why SL(2,ℂ) is used instead of SO(3,1) in relativistic QM.

If you had started with complex traceless self-adjoint 2×2 matrices, you could have made essentially the same argument with ℝ³ and SU(2) instead of ℝ⁴ and SL(2,ℂ).
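A numerical sketch of the determinant-preservation claim (assuming numpy; A is a random matrix rescaled to determinant 1, and X a random Hermitian matrix):

[code]
import numpy as np

rng = np.random.default_rng(1)

# An (essentially arbitrary) member of SL(2,C): rescale so that det A = 1
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
A = A / np.sqrt(np.linalg.det(A))

# A random Hermitian X, built as B + B^dagger
B = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
X = B + B.conj().T

Y = A @ X @ A.conj().T                                   # the map X -> A X A^dagger
print(np.allclose(Y, Y.conj().T))                        # True: the image is still Hermitian
print(np.isclose(np.linalg.det(Y), np.linalg.det(X)))    # True: the "interval" det X is preserved
[/code]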
 
  • #35
JDoolin said:
If det[X]=0 then
[tex]a_0^2=a_1^2+a_2^2+a_3^2[/tex]

and

If det[X]<0 then
[tex]a_0^2<a_1^2+a_2^2+a_3^2[/tex]

If det[X]>0 then
[tex]a_0^2>a_1^2+a_2^2+a_3^2[/tex]

[tex](a_0, \vec a) = \left ( \pm\sqrt{a_1^2+a_2^2+a_3^2},\vec a \right )[/tex]

I was hoping maybe you would interpret those formulas, maybe with words like "timelike", "spacelike", or "lightlike". It helps to step back from the math and think about what you're doing.
 

1. What are observables in science?

Observables refer to measurable quantities or properties of a physical system that can be observed and measured through experiments or observations. These can include things like position, velocity, energy, and spin.

2. How are observables related to commutation in science?

Commutation refers to the order in which two operations are performed. In science, observables are said to commute if the order in which they are measured does not affect the outcome. This means that the observables can be measured simultaneously without affecting the results.

3. What is the significance of commutation in quantum mechanics?

In quantum mechanics, commutation is important because it helps determine the fundamental properties of a physical system. If two observables do not commute, it means that they cannot be measured simultaneously with certainty, and this has implications for the uncertainty principle in quantum mechanics.

4. Can all observables be measured simultaneously?

No, not all observables can be measured simultaneously. This is due to the uncertainty principle in quantum mechanics, which states that the more precisely one observable is measured, the less precisely the other can be measured. Observables that do not commute cannot be measured simultaneously with certainty.

5. How do observables and commutation relate to the Heisenberg uncertainty principle?

The Heisenberg uncertainty principle states that it is impossible to know the precise values of certain pairs of observables at the same time. This is because these observables do not commute, and therefore, cannot be measured simultaneously with certainty. The uncertainty principle is a direct consequence of the relationship between observables and commutation in quantum mechanics.
