# Density Operators

## Main Question or Discussion Point

Hi, there.

I am a little confused about the following statement in Wikipedia about density operators.

"...As mentioned above, a state of a quantum system is given by a unit vector in a Hilbert space. More generally, if one has a large number of copies of the same system, then the state of this ensemble is described by a density matrix, ..."

Q: What are the "copies of the same system"? E.g., (a) one atom's entire set of energy levels, or (b) a large number of atoms. Which one, (a) or (b), is the situation that the density operator is describing? For me, (a) seems reasonable.

I think what he is referring to is the fact that you need the full density matrix (a state vector is not enough) to describe a mixed state. A mixed state is a classical state where you have some probability of being in one state or the other (but not in a quantum superposition), and this classical mixture does in some sense imply a statistical ensemble of many copies. Thus, this fits more closely with your (b) option.

vanhees71
This is very misleading if not even wrong ;-).

In quantum theory a pure state is represented by a ray in Hilbert space, i.e., by a unit vector in Hilbert space modulo a phase factor. It is very important to keep this subtlety in mind, because it's crucial for the understanding of states that they are defined by a normalized Hilbert-space vector modulo a phase. E.g., it immediately makes clear that all half-integer representations of rotations make sense, and indeed the matter surrounding us is built of particles with spin 1/2 (nucleons and electrons).

Equivalently you can define a pure state as being represented by a projection operator
$$\hat{P}_{\psi}=|\psi \rangle \langle \psi |,$$
where $|\psi \rangle$ is an arbitrary unit vector representing the ray that represents the pure state.

A pure state encodes the most complete determination of the system possible. The delicate point of quantum theory now is Born's postulate, i.e., the probabilistic interpretation of the state: one prepares a system in the most complete way possible, in a pure state, e.g., by a filtering process (von Neumann measurement) that determines a complete set of compatible observables, fixing the state to be represented by a ray in which each representative $|\psi \rangle$ lies in the common eigenspace of all the self-adjoint operators representing the measured observables.

Then these observables are determined and have specific values, but other observables are in general undetermined. For any such observable you only know the probabilities to find a certain value when measuring it. Let $A$ be this observable of interest and $|a,\beta \rangle$ a complete set of eigenvectors of the corresponding self-adjoint operator $\hat{A}$. Then the probability to find the value $a$ when measuring the observable $A$ on a system prepared in the pure state $\hat{P}_{\psi}$ is given by
$$P(a|\psi)=\sum_{\beta} \langle a,\beta|\hat{P}_{\psi}|a,\beta \rangle=\sum_{\beta} |\langle a,\beta|\psi \rangle|^2.$$
This is Born's rule.

The relation with the wave-mechanical formulation is that
$\psi(a,\beta)=\langle a,\beta|\psi \rangle$
is the wave function in the $A$ representation.
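As a small numerical illustration of Born's rule (a sketch of my own; spin 1/2 with $A = S_x$ and $|\psi\rangle = |0\rangle$ are arbitrary choices, not from the post):

```python
import numpy as np

# Spin-1/2: measure S_x on a system prepared in the S_z "up" state |0>.
Sx = 0.5 * np.array([[0.0, 1.0], [1.0, 0.0]])   # S_x in units of hbar
eigvals, eigvecs = np.linalg.eigh(Sx)           # columns are eigenvectors |a>

psi = np.array([1.0, 0.0])                      # pure state |psi> = |0>

# Born's rule: P(a|psi) = |<a|psi>|^2 for each eigenvalue a.
probs = np.abs(eigvecs.conj().T @ psi) ** 2

print(probs)         # both S_x outcomes equally likely: [0.5 0.5]
print(probs.sum())   # probabilities sum to 1
```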

Now in many situations you have not determined the state of the system completely, because it is simply too complex to determine a complete set of observables. E.g., for a macroscopic body you would have to specify a number of observables (degrees of freedom) on the order of Avogadro's number, about $6 \times 10^{23}$, which specifies the number of particles contained in 1 mole of a substance and is defined as the number of atoms in 12 g of carbon.

Then you use a so-called mixed state. This is analogous to classical statistics, where you use, e.g., a single-particle phase-space distribution function to describe a gas as a whole. In quantum theory this is done by introducing a more general state operator that is not a projection operator as for pure states. It is a positive semidefinite self-adjoint operator $\hat{R}$ which has unit trace,
$$\mathrm{Tr} \hat{R}=1.$$
If $\hat{R}$ is a projection operator, i.e., $\hat{R}^2=\hat{R}$, then you have a pure state, because then you can find a complete set of orthonormalized eigenvectors $|\lambda \rangle$ of $\hat{R}$, where the eigenvalues are $0$ or $1$: if $|\lambda \rangle$ is an eigenvector with eigenvalue $\lambda$ you have
$$\hat{R}^2 |\lambda \rangle=\hat{R} \lambda |\lambda \rangle=\lambda^2 |\lambda \rangle.$$
But on the other hand we have $\hat{R}^2=\hat{R}$, which implies that $\lambda^2=\lambda$, and this equation has only the solutions $0$ and $1$. Now the trace is
$$\mathrm{Tr} \hat{R}=\sum_{\lambda} \langle \lambda|\hat{R}|\lambda \rangle = \sum_{\lambda} \lambda=1.$$
This means that exactly one eigenvalue must be $1$. Let's denote the corresponding eigenvector by $|\psi \rangle$. Since the orthonormal set of eigenvectors of a self-adjoint operator is complete, this means that
$$\hat{R}=\sum_{\lambda} \lambda |\lambda \rangle \langle \lambda|=|\psi \rangle \langle \psi|,$$
but this is precisely the projection operator of the pure state represented by the ray containing the unit vector $|\psi \rangle$ as explained above.
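The purity criterion derived here ($\hat{R}^2=\hat{R}$, equivalently $\mathrm{Tr}\,\hat{R}^2=1$) can be checked numerically; a minimal sketch, with example states of my own choosing:

```python
import numpy as np

def is_pure(R, tol=1e-12):
    """A state operator R (positive, unit trace) is pure iff Tr(R^2) = 1."""
    return abs(np.trace(R @ R) - 1.0) < tol

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])

P0 = np.outer(ket0, ket0)                    # projector |0><0|  -> pure
R  = 0.5 * P0 + 0.5 * np.outer(ket1, ket1)   # 50/50 mixture     -> mixed

print(is_pure(P0))   # True
print(is_pure(R))    # False: Tr(R^2) = 1/2 < 1
```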

To understand the more general case of a proper mixed state, i.e., a state represented by a statistical operator for which $\hat{R}^2 \neq \hat{R}$, we consider the following gedanken experiment. Suppose a physicist (say Alice) prepares particles, each in one of a given set of pure states $\hat{P}_{j}=|\psi_j \rangle \langle \psi_j |$. She sends a lot of such particles to Bob, but doesn't tell him in which of these pure states she prepared each individual particle. The only thing she tells him is that on average she prepares a fraction $P_j$ of particles in the state represented by $\hat{P}_j$. Note that the $|\psi_j \rangle$ are normalized but not necessarily orthogonal to each other. How should Bob then describe the probability to find a value $a$ when measuring the observable $A$? This is answered by Bayes's theorem from probability theory. First of all, under the constraint that Alice sends him a particle in the state $\hat{P}_j$, the probability to measure this value is given according to Born's rule, as explained above:
$P(a|\psi_j)=\sum_{\beta} \langle a,\beta|\psi_j \rangle \langle \psi_j | a, \beta \rangle.$
Since $\hat{P}_j$ is prepared with probability $P_j$, assuming that Alice doesn't hide some correlations between her preparations, i.e., if she sends the particles prepared in an independent way, then for Bob the probability to measure $a$ is given by
$$P(a|R)=\sum_j P_j P(a|\psi_j)=\sum_{\beta} \langle a,\beta|\Big(\sum_j P_j |\psi_j \rangle \langle \psi_j |\Big)| a,\beta \rangle.$$
Thus defining the statistical operator as
$$\hat{R}=\sum_j P_j |\psi_j \rangle \langle \psi_j |$$
we can write
$$P(a|R)=\sum_{\beta} \langle a,\beta|\hat{R}|a,\beta \rangle.$$
Thus $\hat{R}$ has the same formal meaning as the projection operator representing a pure state.

Further it fulfills the formal properties of a statistical operator as defined above, because $P_j \geq 0$ and $\sum_j P_j=1$. To evaluate the trace, we introduce an arbitrary complete orthonormal system $|n \rangle$. Then we have
$$\mathrm{Tr} \hat{R}=\sum_n \langle n|\hat{R}|n \rangle=\sum_{n,j} P_j \langle n|\psi_j \rangle \langle \psi_j|n \rangle.$$
Now we have
$$\sum_{n} \langle n|\psi_j \rangle \langle \psi_j|n \rangle=\sum_{n} \langle \psi_j|n \rangle\langle n|\psi_j \rangle =\langle \psi_j | \psi_j \rangle=1$$
and thus finally
$$\mathrm{Tr} \hat{R}=\sum_j P_j=1,$$
as it should be.
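The Alice-and-Bob construction can be mirrored in a few lines of code; a sketch, assuming a spin-1/2 system with weights and (deliberately non-orthogonal) states of my own choosing:

```python
import numpy as np

# Alice's normalized, non-orthogonal pure states and their weights P_j.
psi1 = np.array([1.0, 0.0])                # |0>
psi2 = np.array([1.0, 1.0]) / np.sqrt(2)   # |+>
weights = [0.3, 0.7]

# Bob's statistical operator R = sum_j P_j |psi_j><psi_j|.
R = sum(p * np.outer(psi, psi.conj()) for p, psi in zip(weights, [psi1, psi2]))

print(np.trace(R))        # Tr R = 1, as derived above

# Born probability for the outcome |0>: P = <0|R|0> = 0.3*1 + 0.7*0.5
ket0 = np.array([1.0, 0.0])
print(ket0 @ R @ ket0)    # 0.65
```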

I hope that now the formalism of general "mixed states" as being described by statistical operators has become somewhat clearer. It's a quite difficult concept and needs some time to be fully understood!

There are also physically different kinds of probabilities involved. The first kind is a specifically quantum-theoretical one and refers to the probabilities for finding a specific result when measuring an observable on a system which we know to be prepared in a given pure state. Here we have the most complete knowledge about the system's state that is possible for a quantum system, but we still only know the probabilities to find a specific value when measuring the observable (except when the measured observable has a determined value due to the state preparation, i.e., if the state vector $|\psi \rangle$ is an eigenvector of the self-adjoint operator representing the measured observable). These probabilities come into play because, according to quantum theory, even when we have complete knowledge about the system's state, not all observables have determined values. According to quantum theory it's impossible to prepare a system such that all observables have determined values at once. This is the indeterministic nature of quantum theory and implies a radical change in our view of Nature compared to classical physics. This makes quantum theory appear "weird" to many people, but it's the best theory we have about Nature, and it has been tested very stringently since its discovery in 1925, with the result that it describes Nature to a very high degree of accuracy.

The second type of probability comes into the game if we have a proper mixed state. These were the probabilities $P_j$ in our gedanken experiment above. They come into play because Bob does not have complete knowledge about the state of the systems Alice prepares for him. This kind of probability also occurs in classical statistical mechanics, i.e., we describe a classical system that is in principle deterministic with probabilities, because we have incomplete knowledge about its state (i.e., about all positions and momenta of all particles making up the classical system). In quantum theory it's the incomplete knowledge about which pure state the system was prepared in.

Of course, in practice one has to find the statistical operator by other means, based on the information one really has. One very powerful method is the information-theoretical approach to (quantum) statistics, as developed by Jaynes based on Shannon's information theory and the corresponding interpretation of the entropy as a measure of missing information given some probability distribution. According to this principle one has to choose the probability distribution that maximizes the entropy under the constraints of the given knowledge about the system; one then associates probabilities with the situation in question without implying any prejudice, because in the sense of the Shannon-Jaynes entropy it is the distribution that maximizes the missing information under the constraints of the factual knowledge about the system. For more details of this approach, see, e.g., my manuscript on Statistical Physics on my home page:

http://fias.uni-frankfurt.de/~hees/publ/stat.pdf

stevendaryl
There are also physically different kinds of probabilities involved. The first kind is a specifically quantum-theoretical one and refers to the probabilities for finding a specific result when measuring an observable on a system which we know to be prepared in a given pure state....

The second type of probability comes into the game if we have a proper mixed state. These were the probabilities $P_j$ in our gedanken experiment above....
An interesting thing about the density operator approach is that the two kinds of probabilities cannot be teased apart. For example, the density operators reflecting the following two situations are indistinguishable:

1. The electron is either in a pure state that is spin-up in the x-direction (with 50% probability), or it is in a pure state that is spin-down in the x-direction (with 50% probability).
2. The electron is either in a pure state that is spin-up in the y-direction (with 50% probability), or it is in a pure state that is spin-down in the y-direction (with 50% probability).
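This indistinguishability is easy to verify numerically; a sketch using the standard basis conventions (both mixtures turn out to be $I/2$):

```python
import numpy as np

# S_x eigenstates
x_up = np.array([1, 1]) / np.sqrt(2)
x_dn = np.array([1, -1]) / np.sqrt(2)
# S_y eigenstates
y_up = np.array([1, 1j]) / np.sqrt(2)
y_dn = np.array([1, -1j]) / np.sqrt(2)

# 50/50 classical mixtures of the two situations described above.
rho_x = 0.5 * np.outer(x_up, x_up.conj()) + 0.5 * np.outer(x_dn, x_dn.conj())
rho_y = 0.5 * np.outer(y_up, y_up.conj()) + 0.5 * np.outer(y_dn, y_dn.conj())

print(np.allclose(rho_x, rho_y))   # True: identical density operators (I/2)
```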

vanhees71
This is an interesting point in this pretty delicate discussion. The two cases in question are a bit different from what you seem to imply. Perhaps I didn't understand your argument properly. What I meant are the following distinct cases.

One has an ensemble of electrons, each originally unpolarized, that has gone through a Stern-Gerlach apparatus separating spin-up from spin-down along the $x$ direction, where we keep all electrons. Now we have to distinguish two cases:

(a) We know which spin state has been selected, say spin up. Then it's clear that we have to associate the pure state $\hat{P}_{\uparrow}=|\uparrow \rangle \langle \uparrow |$ with the ensemble of electrons.

(b) We do not know which spin state has been selected, because the apparatus delivers electrons of both directions. Then we have to associate the proper mixed state
$$\hat{R}=\frac{1}{2} \left (|\uparrow \rangle \langle \uparrow | + |\downarrow \rangle \langle \downarrow | \right ).$$

Of course, both states are easy to distinguish by measuring the $x$ component of the spin (e.g., with another Stern-Gerlach filter). In the first case we'll only get spin-up electrons, and we realize that the spin $x$-component is indeed determined. In the second case we get 50% spin up and 50% spin down.

Perhaps you refer to yet a different case, namely that we have the same preparation of electrons and take note of which spin-$x$ component each really has, e.g., we filter out only spin-up for the $x$ component.

If we then measure the $y$ component of the spin, we again get 50% spin up and 50% spin down. Indeed, by only measuring the spin-$y$ component we cannot determine whether we have a pure state of electrons with determined spin-$x$ component or a mixed state of unpolarized electrons.

To completely determine the state, even in the simple case of spin 1/2, one has to measure not just one observable but sufficiently many, usually incompatible, observables on an ensemble, i.e., one measures only one spin component on each single electron of the ensemble but makes sufficiently many measurements with different spin components to get clear statistics for each case. Then one can distinguish whether one has totally unpolarized electrons or polarized ones.
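The reconstruction of a spin-1/2 state from several incompatible spin measurements can be sketched with the Bloch decomposition $\rho = \frac{1}{2}(I + \vec{r}\cdot\vec{\sigma})$; this toy version (my own, using exact expectation values rather than finite statistics) shows that the three Pauli expectation values fix the state and thereby separate the polarized pure state from the unpolarized mixture:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]])
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]])
I  = np.eye(2)

def reconstruct(rho):
    """Rebuild a spin-1/2 state from <sigma_i> = Tr(rho sigma_i),
    using the Bloch form rho = (I + r.sigma)/2."""
    r = [np.trace(rho @ s).real for s in (sx, sy, sz)]
    return 0.5 * (I + r[0] * sx + r[1] * sy + r[2] * sz)

pure_x_up   = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])  # |up_x><up_x|
unpolarized = 0.5 * I

# Both give <sigma_y> = <sigma_z> = 0, but <sigma_x> tells them apart.
for rho in (pure_x_up, unpolarized):
    print(np.allclose(reconstruct(rho), rho))   # True for both
```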

If I remember right, Ballentine has a thorough discussion of the determination of states by measuring sufficiently many observables on an ensemble.

naima
A mixed state may be described by a density matrix or by a set of "probability-weighted" state vectors.
The difference is that you can apply a global phase transformation to these vectors, keeping the same weights, and it will describe the same mixed state.
So these vectors cannot be measured.
A density matrix may be measured with an apparatus.

stevendaryl
A mixed state may be described by a density matrix or by a set of "probability-weighted" state vectors.
The difference is that you can apply a global phase transformation to these vectors, keeping the same weights, and it will describe the same mixed state.
So these vectors cannot be measured.
A density matrix may be measured with an apparatus.
The idea of measuring the density matrix is a little subtle. What you can measure is the value of a particular operator. To reproduce a density matrix, you have to repeat the setup, with the same density matrix, many times, and measure several different operators, so that you can compute expectation values.

However, the weird thing about density matrices is that the weights represent ignorance about the initial conditions. How can you measure ignorance? So if you construct a system that either produces an electron with spin-up in the z-direction, with probability $p$, or spin-down in the z-direction, with probability $1-p$, then you could represent that by a density matrix $\rho= p |U\rangle \langle U| + (1-p) |D\rangle \langle D|$.

To measure that density matrix, you have to use the assumption that, in the limit of many trials,

$tr(\rho O) = \dfrac{1}{N} \sum_{j=1}^N O_j$

where $O$ is an observable, and $O_j$ is the result of the jth measurement of $O$.

Here's the problem: Are all the trials independent? What I mean by that is: If the probability of getting, say, spin-up in the z-direction in the first run is $p$, does that mean that the probability of getting spin-up in the z-direction in the first two runs is $p^2$? If the $p$ just represents ignorance about the true state, then you don't know whether the runs are independent or not. For example, suppose that there are two varieties of sources of electrons. Variety 1 always produces spin-up in the z-direction. Variety 2 always produces spin-down. But you don't know which variety you have.

So the interpretation of the density matrix as representing probabilities due to ignorance seems not quite right.

Fredrik
Hi, there.

I am a little confused about the following statement in Wikipedia about density operators.

"...As mentioned above, a state of a quantum system is given by a unit vector in a Hilbert space. More generally, if one has a large number of copies of the same system, then the state of this ensemble is described by a density matrix, ..."

Q: What are the "copies of the same system"? E.g., (a) one atom's entire set of energy levels, or (b) a large number of atoms. Which one, (a) or (b), is the situation that the density operator is describing? For me, (a) seems reasonable.
It's a strange thing to say, since the state of every quantum system is a "density matrix". (I will just call it a "state" from now on). Each copy of that system has a state operator, and the composite system has a state operator too. For example, if you prepare two qubits in state |0>, then each of them is in the state |0><0|, but the two-qubit system is in state $$\big(|0\rangle\otimes|0\rangle\big)\big(\langle 0|\otimes\langle 0|\big) =|0\rangle\langle 0|\otimes |0\rangle\langle 0|.$$ I think that what they're trying to say is that while a "pure" state $|\psi\rangle\langle\psi|$ can be represented by a vector $|\psi\rangle$, a "mixed" state $\sum_i c_i|\psi_i\rangle\langle\psi_i|$ (with $0\leq c_i\leq 1$ for all i, and $\sum_i c_i=1$) can't be represented by a vector unless all but one of the $c_i$ are zero.

Unit vectors represent preparation procedures. Two unit vectors in the same 1-dimensional subspace represent the same preparation procedure. So 1-dimensional subspaces can also represent preparation procedures, and they do it better. There's a bijective correspondence between (Hilbert) subspaces and projection operators. A projection operator for a 1-dimensional subspace can always be written $|\psi\rangle\langle\psi|$, where $|\psi\rangle$ is any unit vector in the subspace. Since these projection operators correspond bijectively to 1-dimensional subspaces, we can say that they represent preparation procedures.

Now consider n different preparation procedures that correspond to projection operators $|\psi_i\rangle\langle\psi_i|$, and suppose that you use a random number generator to determine which of the n preparation procedures you should use to prepare the system. Suppose that you tell your friend who has to do a measurement on the system that this is what you did, but you don't tell him which one of the procedures you ended up using. Then this friend should use the state $\rho=\sum_{i=1}^n c_i|\psi_i\rangle\langle\psi_i|$, where each $c_i$ is the probability that you used the ith preparation procedure.

From your friend's point of view, $\rho$ is the state of the system. The whole process that you went through to prepare the system is strange and complicated, but it's still a preparation procedure. So $\rho$ represents a preparation procedure too, just like each of the $|\psi_i\rangle$. It's just a more complicated procedure. This suggests that it's kind of silly to assign the "pure" states (the ones that correspond to projection operators of 1-dimensional subspaces) such a privileged role. It makes more sense to use the term "state" for positive operators of trace 1 than for unit vectors, since a "state" is supposed to be a representation of a preparation procedure.

There are a number of issues associated with the naive interpretation that a state represents all the properties of the system (regardless of whether it's a pure state or not), but it appears to be safe to think of a state as representing all the properties of the "ensemble" consisting of all the copies of the system that participate in the experiment if you run it multiple times, one after another, each time with only one copy of the system participating. Note however that this sort of statement isn't part of QM; it's part of an interpretation of QM.

It's definitely not the case that $\rho$ represents the state of the N-particle system you get if you use this complicated preparation procedure on N particles. That state is $\rho\otimes\cdots\otimes\rho$ (N factors).

One more thing that's good to know: If you're given a state $\rho=\sum_{i=1}^n c_i|\psi_i\rangle\langle\psi_i|$, then you can't conclude that the system is really in one of the states $|\psi_i\rangle$. Consider a qubit in state $\frac 1 2\big(|0\rangle\langle 0|+|1\rangle\langle 1|\big)$. Define two new vectors $|+\rangle$ and $|-\rangle$ by $|\pm\rangle=\frac{1}{\sqrt 2}\big(|0\rangle \pm|1\rangle\big)$. We have
$$|\pm\rangle\langle\pm|=\frac 1 2\big(|0\rangle\langle 0| \pm |0\rangle\langle 1| \pm |1\rangle\langle 0| + |1\rangle\langle 1|\big),$$ and therefore
$$|+\rangle\langle +|+|-\rangle\langle -| = |0\rangle\langle 0|+|1\rangle\langle 1|.$$
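This non-uniqueness of the decomposition is quick to confirm numerically; a sketch using the $|0\rangle,|1\rangle$ and $|\pm\rangle$ vectors defined above:

```python
import numpy as np

ket0 = np.array([1.0, 0.0])
ket1 = np.array([0.0, 1.0])
plus  = (ket0 + ket1) / np.sqrt(2)
minus = (ket0 - ket1) / np.sqrt(2)

# Two different 50/50 ensembles...
rho_z  = 0.5 * np.outer(ket0, ket0) + 0.5 * np.outer(ket1, ket1)
rho_pm = 0.5 * np.outer(plus, plus) + 0.5 * np.outer(minus, minus)

# ...give exactly the same state operator.
print(np.allclose(rho_z, rho_pm))   # True
```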

naima
Since you can measure a density matrix, you will find its eigenvalues (the probabilities of the pure states) and can calculate the von Neumann entropy. So you will measure your ignorance.
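The recipe described here, diagonalize $\rho$ and compute $S=-\sum_i \lambda_i \ln \lambda_i$, looks like this as a sketch (the example states are my own):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -sum_i lam_i ln(lam_i) over the nonzero eigenvalues of rho."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]          # drop zero eigenvalues (0 * ln 0 = 0)
    return float(-np.sum(lam * np.log(lam)))

pure  = np.array([[1.0, 0.0], [0.0, 0.0]])   # |0><0|
mixed = np.eye(2) / 2                        # maximally mixed

print(von_neumann_entropy(pure))    # 0.0: no ignorance
print(von_neumann_entropy(mixed))   # ln 2, maximal for a qubit
```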

stevendaryl
Since you can measure a density matrix, you will find its eigenvalues (the probabilities of the pure states) and can calculate the von Neumann entropy. So you will measure your ignorance.
I'm saying that you can't measure the density matrix, and I explained why not. Not if it only represents ignorance about preparation procedures, anyway. Not without making further assumptions about the nature of the probabilities.

Look at the example I gave. Suppose you know that some source of electrons has probability $p$ of producing spin-up electrons (in the z-direction) and probability $1-p$ of producing spin-down electrons. That corresponds to the following density matrix (in the representation where $S_z$ is diagonal):

$\rho = \left( \begin{array}{cc} p & 0 \\ 0 & (1-p) \end{array}\right)$

Suppose that you don't know $p$, and you want to find it out experimentally. How do you do it?

Well, here is a recipe, which I'm saying is wrong:

Repeat the experiment many times. Compute $\langle S_z \rangle$. That should be equal to $tr(\rho S_z) = \frac{1}{2} (p - (1-p)) = p - \frac{1}{2}$. So the expectation of $S_z$ tells you what $p$ is. (It only takes one expectation value, because we assumed that the matrix was diagonal in the $S_z$ representation. If we didn't assume this, we would have to take expectation values for two different noncommuting operators.)

Why am I saying this is wrong? Because you can't directly measure expectation values. You can certainly measure averages: $\langle S_z\rangle_{avg} = \frac{1}{N}\sum_{i=1}^N r_i$ where $r_i$ is the $i^{th}$ spin measurement in the z-direction. But this only approaches the expectation value if the probabilities on each trial are independent.

When I said that the preparation procedure either produces spin-up, with probability $p$, or spin-down, with probability $1-p$, that is not enough information to know whether the probabilities for repeated trials are independent, or not. Maybe there are two different electron source devices, one that always produces spin-up, and one that always produces spin-down, and I don't know which one the store sent me, so I use probabilities to reflect that ignorance. In that case, measuring average values for $S_z$ will give $\frac{+1}{2}$, if I got the first type of source, or $\frac{-1}{2}$, if I got the second type of source. It would only be equal to the expectation value in the case where each trial was independent, which means that every time I start from scratch, ordering an electron source from the factory.

So I'm saying that unless you know something about the nature of the probabilities involved, you can't measure expectation values*.

*People sometimes use "expectation value" to mean the expression $tr(\rho S_z)$, which is a thing which you can calculate from the theory, and sometimes use "expectation value" to mean $\frac{1}{N}\sum_{i=1}^N r_i$. The two expressions are only equal under certain circumstances, namely that $N$ is sufficiently large, and the theory is correct, and nothing weird has happened, and the probabilities for each trial are independent.
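The distinction can be made concrete with a toy simulation (my own sketch): an i.i.d. source with $p = 1/2$ and a "mystery box" that is all-up or all-down with probability 1/2 each correspond to the same $\rho$, but only the i.i.d. averages converge to $tr(\rho S_z) = 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10_000

# Case 1: independent trials, each up (+1/2) or down (-1/2) with p = 1/2.
iid = rng.choice([0.5, -0.5], size=N)

# Case 2: one coin flip picks an all-up or all-down source; every trial
# from that source then gives the same value.  Same rho, correlated trials.
source = rng.choice([0.5, -0.5])
mystery = np.full(N, source)

print(iid.mean())       # near 0 = tr(rho S_z)
print(mystery.mean())   # exactly +1/2 or -1/2, never near 0
```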

naima
I suppose that the trials are not independent. The experimentalist chooses a given direction. He builds three sources: one for up and two for down in that direction.
The algorithm is:
first toss a coin to choose one of the two "down" sources,
then for each trial toss a coin to choose another source, with no repeated source.

a) the direction which was chosen
b) the ratio of up and down.

Is there something you do not agree with?

stevendaryl
I suppose that the trials are not independent. The experimentalist chooses a given direction. He builds three sources: one for up and two for down in that direction.
The algorithm is:
first toss a coin to choose one of the two "down" sources,
then for each trial toss a coin to choose another source, with no repeated source.

a) the direction which was chosen
b) the ratio of up and down.

Is there something you do not agree with?
I already explained what was wrong. The claim they make (near the bottom of page 5) is:
By measuring the observables $\langle S_x \rangle$, $\langle S_y \rangle$, $\langle S_z \rangle$ we can construct a spin 1/2 density matrix.
I'm saying that $\langle S_z\rangle$ is not observable. What you can observe is

$\langle S_z \rangle_{average} = \frac{1}{N} \sum_{i=1}^N r_i$, where $r_i$ is the $i^{th}$ spin measurement result in the z-direction. You don't know that that is equal to $tr(\rho S_z)$ without knowing about the independence of the probabilities.

Fredrik
Maybe there are two different electron source devices, one that always produces spin-up, and one that always produces spin-down, and I don't know which one the store sent me, so I use probabilities to reflect that ignorance. In that case, measuring average values for $S_z$ will give $\frac{+1}{2}$, if I got the first type of source, or $\frac{-1}{2}$, if I got the second type of source. It would only be equal to the expectation value in the case where each trial was independent, which means that every time I start from scratch, ordering an electron source from the factory.
Isn't that also what's required for you to keep using the state $\rho$? If you use a device that always produces the same pure state, and p just reflects your ignorance of what pure state that is, then you need to use the result of the first measurement to adjust the value of p to be used in the second experiment. You wouldn't have the state $\rho$ every time.

So is someone claiming that they can measure any given $\rho$ that's produced only once? If not, your argument doesn't seem to apply.

stevendaryl
Isn't that also what's required for you to keep using the state $\rho$? If you use a device that always produces the same pure state, and p just reflects your ignorance of what pure state that is, then you need to use the result of the first measurement to adjust the value of p to be used in the second experiment. You wouldn't have the state $\rho$ every time.

So is someone claiming that they can measure any given $\rho$ that's produced only once? If not, your argument doesn't seem to apply.
Yes, the density matrix is measurable under the assumption that the preparation procedure is reproducible and the trials are independent. Maybe the latter is implicit in saying that the preparation is reproducible? But if the "state" reflects our ignorance, then I guess saying that the state is reproducible means that on each trial we don't know any more than we did on the last. Hmm.

bhobba
To the OP.

This has been an interesting thread.

Personally, though, I don't look at it that way. I start with the definition of a state as a positive operator P of unit trace. Let that sink in a bit: it's not an element of a vector space, as some texts will tell you, it's an operator. That way you avoid, right from the beginning, any issues about what a state is, and you don't have discussions about what a density matrix is (it's the representation of that operator in a particular basis, but don't worry about that to start with). Its relevance lies in the Born rule, the correct statement of which is: given an observable O, the expected value of the observable is E(O) = Trace(PO).

In this view, by definition, a pure state is a state of the form |u><u|, i.e., a projection operator. A mixed state is a convex sum of pure states. It can be shown that all states are either mixed or pure, but mixed states are not uniquely decomposable into pure states. What is usually done for pure states is to map the u in |u><u| to a vector space. But it must be remembered that this can only be done for pure states, and in reality the state is an operator, which shows itself in the gauge freedom of such a mapping: if c is a phase factor, |cu><cu| = |u><u|. This is also why the superposition principle is trivially true.
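A minimal sketch of both points, the Born rule as E(O) = Trace(PO) and the phase freedom |cu><cu| = |u><u| (the particular state, observable, and phase are my own picks):

```python
import numpy as np

sz = 0.5 * np.array([[1.0, 0.0], [0.0, -1.0]])   # observable O = S_z

# Mixed state: 70% |0><0| + 30% |1><1| (a convex sum of pure states).
P = 0.7 * np.diag([1.0, 0.0]) + 0.3 * np.diag([0.0, 1.0])

expectation = np.trace(P @ sz)
print(expectation)   # 0.7*(+1/2) + 0.3*(-1/2) = 0.2

# Gauge freedom: |cu><cu| = |u><u| for any phase factor c.
u = np.array([1.0, 1.0]) / np.sqrt(2)
c = np.exp(1j * 0.37)
print(np.allclose(np.outer(c * u, (c * u).conj()), np.outer(u, u.conj())))  # True
```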

You can do a bit of mucking around with the Born rule and show that a mixed state ∑ pi |bi><bi| behaves exactly the same as if you were randomly presented with the pure state |bi><bi| to observe, with probability pi. If that is how the state was actually prepared, such a mixed state is called proper; if not, it is called improper. Observationally there is no way to tell the difference, though; it's purely in how they were prepared. This is a very important point, because it is at the heart of decoherence and the measurement problem.

I can't go without mentioning my personal favourite way of deriving the Born rule itself, as well as why states exist in the first place:

Of relevance to this thread is that derivation shows the existence of states as positive operators of unit trace. In a sense they are more fundamental than pure states.

Thanks
Bill

bhobba
If I remember right, Ballentine has a through discussion on the determination of the states by measuring sufficiently many observables on an ensemble.
You remember right - see chapter 8.

Of relevance to this discussion Ballentine starts out with the definition of states as positive operators of unit trace right from the start.

It's one reason that book was such a revelation to me: starting with that avoided, for me anyway, many misunderstandings. I always go back to that definition and the Born rule as E(O) = trace(PO), and it's surprising how subtle fine points, like the difference between proper and improper mixed states, become instantly clear.

Thanks
Bill

stevendaryl
You remember right - see chapter 8.

Of relevance to this discussion Ballentine starts out with the definition of states as positive operators of unit trace right from the start.

It's one reason that book was such a revelation to me: starting with that avoided, for me anyway, many misunderstandings. I always go back to that definition and the Born rule as E(O) = trace(PO), and it's surprising how subtle fine points, like the difference between proper and improper mixed states, become instantly clear.

Thanks
Bill
I know that an "improper" mixture is a mathematical construction: you trace out the degrees of freedom due to the environment (or whatever other subsystem is not convenient to measure), and this process converts an entangled pure state into a mixture.

But what is a "proper" mixture? Does that require a source of (classical) randomness? Or does it result from ignorance of details of the system's state?
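For the "improper" case mentioned above, here is a small sketch (assuming a two-qubit Bell state, with a hand-rolled partial trace) showing how tracing out one half of an entangled pure state leaves a density matrix that is not pure:

```python
import numpy as np

# Entangled pure state of two qubits: the Bell state (|00> + |11>)/sqrt(2)
psi = np.zeros(4)
psi[0] = psi[3] = 1 / np.sqrt(2)
rho_AB = np.outer(psi, psi.conj())       # pure: rho_AB @ rho_AB == rho_AB

# Trace out subsystem B -> an "improper" mixture on A
rho_AB = rho_AB.reshape(2, 2, 2, 2)      # indices (a, b, a', b')
rho_A = np.einsum('abcb->ac', rho_AB)    # partial trace over B

print(rho_A)                             # maximally mixed: [[0.5, 0], [0, 0.5]]
print(np.allclose(rho_A @ rho_A, rho_A)) # False -> not a pure state
```

The reduced state on A is the maximally mixed state, even though the joint state of A and B was pure.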

naima
Gold Member
I start with the definition of a state as a positive operator P of unit trace. Let that sink in a bit - it's not an element of a vector space as some texts will tell you - it's an operator.
Yes, but in the convex set of positive operators the trace $Tr(\rho O)$ may be seen as a scalar product of $\rho$ and $O$.

Is this correct?

I'm the one who started this discussion. After reading the replies above, I tried to think it through myself - let's see whether I got it right.

Density operators are used to describe a system (A) that is composed of some subsystems (a1, a2, a3, ..., ai, ...). The way the subsystems (...ai...) compose the whole system (A) obeys classical statistics. For example, if we just have two subsystems:

$|\Psi\rangle \langle \Psi|=0.3 |a_1\rangle \langle a_1| + 0.7 |a_2\rangle \langle a_2|$ (*)

Well, the subsystems are the quantum systems.

So, for just one atom, the above state $|\Psi\rangle \langle \Psi|$ cannot be realized, because, if the $|a_i\rangle$ are eigenstates of the same operator, then $0.3^2+0.7^2\neq 1$.

If we are considering a large number of atoms, this state can be realized: we can prepare those atoms in the particular way given by eq. (*). 30% of the atoms are prepared in state $|a_1\rangle$, and 70% of the atoms are prepared in state $|a_2\rangle$. For example, one can separate 30% of the atoms from the other 70%, excite them with a laser so that they are all in the same metastable state, leave the other 70% in the ground state, and then mix them together.

Am I getting it right?

Fredrik
Staff Emeritus
Gold Member
Density operators are used to describe system (A) that is composed by some subsystems
Not necessarily. Even when the system is a single elementary particle, there are preparation procedures that can't be represented by a state vector, or equivalently, by a pure state operator. Those preparation procedures can however be represented by a state operator (density operator) that isn't pure.

$|\Psi\rangle \langle \Psi|=0.3 |a_1\rangle \langle a_1| + 0.7 |a_2\rangle \langle a_2|$ (*)

Well, the subsystems are the quantum systems.

So, for just one atom, the above state $|\Psi\rangle \langle \Psi|$ cannot be realized, because, if the $|a_i\rangle$ are eigenstates of the same operator, then $0.3^2+0.7^2\neq 1$.
The right-hand side is fine, but if we denote it by $\rho$, we don't have $\rho^2=\rho$, and this means that there can't exist a $|\Psi\rangle$ such that $\rho=|\Psi\rangle\langle\Psi|$.
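A quick numerical check of this point (using the 0.3/0.7 mixture from eq. (*), with orthonormal basis vectors standing in for $|a_1\rangle$ and $|a_2\rangle$):

```python
import numpy as np

# Orthonormal basis states standing in for |a_1>, |a_2>
a1 = np.array([1.0, 0.0])
a2 = np.array([0.0, 1.0])

# The mixture from eq. (*)
rho = 0.3 * np.outer(a1, a1) + 0.7 * np.outer(a2, a2)

# A pure state |Psi><Psi| is a projector, so rho^2 = rho and Tr(rho^2) = 1.
# Neither holds here, so no |Psi> with rho = |Psi><Psi| can exist.
print(np.allclose(rho @ rho, rho))   # False
print(np.trace(rho @ rho))           # 0.3**2 + 0.7**2 = 0.58 < 1
```

The quantity $Tr(\rho^2)$ (the "purity") equals 1 exactly for pure states and is strictly less than 1 for any genuine mixture.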

If we are considering a large number of atoms, this state can be realized.
My comment above applies regardless of what physical system we're talking about.

bhobba
Mentor
But what is a "proper" mixture? Does that require a source of (classical) randomness? Or does it result from ignorance of details of the system's state?
It's one where some external agency has randomly presented a pure state to be observed.

The nature of the external agency is unimportant - e.g., it may be a choice made by someone flipping a coin, rolling a die, or using a pseudo-random number generator; it doesn't matter.

There is of course a deeper issue here, not relevant to the essential question, and that is exactly where does classical randomness come from.

Thanks
Bill

bhobba
Mentor
Density operators are used to describe system (A) that is composed by some subsystems (a1, a2, a3...ai....).
Nope.

It's a positive operator of unit trace. That's it, that's all. As I said previously, forget what you have read before, purge it from your mind, and let that sink in.

After that, take on board its physical significance via the Born rule: E(O) = Trace(PO). You can then play with the formalism and discover all sorts of stuff - but it's simply a logical consequence of the above.
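That bare definition can be sketched directly as code (a minimal check, assuming finite dimensions; the helper name and tolerances are made up):

```python
import numpy as np

def is_state(rho, tol=1e-10):
    """Check the two defining properties of a density operator:
    positivity (all eigenvalues >= 0) and unit trace."""
    rho = np.asarray(rho, dtype=complex)
    hermitian = np.allclose(rho, rho.conj().T, atol=tol)
    # Positivity only makes sense for Hermitian operators
    positive = hermitian and np.all(np.linalg.eigvalsh(rho) >= -tol)
    unit_trace = abs(np.trace(rho) - 1) < tol
    return positive and unit_trace

print(is_state([[0.3, 0], [0, 0.7]]))    # True  - a valid mixed state
print(is_state([[0.5, 0], [0, 0.6]]))    # False - trace is 1.1
print(is_state([[1.2, 0], [0, -0.2]]))   # False - a negative eigenvalue
```

Everything else - pure states as the rank-one projectors, mixtures as convex combinations - follows from playing with operators that pass this check.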

Thanks
Bill

bhobba
Mentor
Yes but in the convex set of positive operators the trace $Tr(\rho O)$ may be seen as a scalar product of $\rho$ and O
Personally I haven't seen that form. In the vector space of operators the trace is the usual inner product, so in that sense it's a scalar product, but it's not the usual form.

Trace (PO) is certainly what comes out of Gleason.
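A quick check of the sense in which this works: for a Hermitian $\rho$, the Hilbert-Schmidt inner product $Tr(A^\dagger B)$ on the space of operators reduces to the Born-rule trace (the numbers here are made up):

```python
import numpy as np

rho = np.array([[0.3, 0.0], [0.0, 0.7]])   # a density operator
O = np.array([[1.0, 2.0], [2.0, -1.0]])    # a Hermitian observable

# Hilbert-Schmidt inner product on the space of operators: <A, B> = Tr(A† B)
def hs_inner(A, B):
    return np.trace(A.conj().T @ B)

# Because rho is Hermitian, <rho, O> coincides with the Born-rule trace
print(hs_inner(rho, O), np.trace(rho @ O))  # the two values agree
```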

Thanks
Bill

stevendaryl
Staff Emeritus
It's one where some external agency has randomly presented a pure state to be observed.

The nature of the external agency is unimportant - e.g., it may be a choice made by someone flipping a coin, rolling a die, or using a pseudo-random number generator; it doesn't matter.

There is of course a deeper issue here, not relevant to the essential question, and that is exactly where does classical randomness come from.

Thanks
Bill
That's sort of what I was getting at. Classically, all probabilistic behavior is due to ignorance of the details of the initial conditions. That would seem to make the probabilities subjective, rather than objective. But I don't see how subjective probabilities can be the basis of density matrices, which have to be dependably reproducible (at least if they are to be measurable).

So maybe it needs to be a pseudo-random process that is assumed to produce a sequence with all the right frequencies, but with too high a Kolmogorov complexity to be predictable.

The right-hand side is fine, but if we denote it by $\rho$, we don't have $\rho^2=\rho$, and this means that there can't exist a $|\Psi\rangle$ such that $\rho=|\Psi\rangle\langle\Psi|$.
I might get a little closer,
we can just write

$\hat{\rho}=0.3|a_1\rangle \langle a_1|+0.7|a_2\rangle \langle a_2|$

and this $\hat{\rho}$ is not an outer product of any vector in the Hilbert space, because it is a classical statistical mixture, right?

And how can we create a mixed state with just one atom?
I used to believe that every state of an atom is represented by a vector in Hilbert space, no matter how we intervene with the system (the atom).