Here you take an observable represented by the operator

$$\hat{K}=\sum_k k |k \rangle \langle k|.$$

Your random experiment here is the measurement of the observable ##K## on a system prepared in a pure state represented by ##\hat{\rho}=|\psi \rangle \langle \psi|##. This defines the probabilities for finding a given value ##k## as

$$p_k=|\langle k|\psi \rangle|^2.$$

Thus you define for this random experiment complete knowledge to mean knowledge of the value ##k##. That's why in general the entropy for this random experiment is

$$\tilde{S}_k=-\sum_k p_k \ln p_k,$$

which is indeed ##0## if and only if exactly one ##p_k=1## and all others are 0, i.e., if the system is prepared in the pure state ##\hat{\rho}_k=|k \rangle \langle k|##. Thus the entropy principle is applied correctly to this random experiment.
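To make this concrete, here's a minimal sketch in plain Python of the entropy of this random experiment. The function name `measurement_entropy` and the two example states are my own choices for illustration, and the amplitudes are assumed to be the components ##\langle k|\psi \rangle## in the eigenbasis of ##\hat{K}##.

```python
import math

def measurement_entropy(amplitudes):
    """Entropy -sum_k p_k ln p_k of the outcome distribution p_k = |<k|psi>|^2.

    `amplitudes` are assumed to be the components <k|psi> in the
    eigenbasis of K (hypothetical input format for this sketch).
    """
    probs = [abs(c) ** 2 for c in amplitudes]
    norm = sum(probs)
    probs = [p / norm for p in probs]  # normalize, in case the input was not
    # Outcomes with p_k = 0 contribute nothing (p ln p -> 0 for p -> 0).
    return sum(-p * math.log(p) for p in probs if p > 0.0)

# System prepared in an eigenstate of K: the outcome is certain.
eigenstate = [0.0, 1.0, 0.0]
# Equal-weight superposition of two eigenstates: two outcomes, 50% each.
superposition = [1 / math.sqrt(2), 1j / math.sqrt(2), 0.0]

s_eigen = measurement_entropy(eigenstate)
s_super = measurement_entropy(superposition)
print(s_eigen, s_super)
```

The eigenstate gives zero entropy, while the superposition gives ##\ln 2##, although both are pure states.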

It's of course not the von Neumann entropy, which defines another entropy for another "random experiment". It does not ask for the specific value of a specific observable (or a set of observables) but it asks the question of "state determination", i.e., it considers all possible quantum states ##\hat{\rho}## (i.e., pure and mixed).
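The difference between the two entropies can be seen in a small sketch for a single qubit. This is only an illustration under my own assumptions: the helper names are made up, and the closed formula for the eigenvalues works only for ##2 \times 2## density matrices (trace 1, so the eigenvalues are ##(1 \pm \sqrt{1-4 \det \hat{\rho}})/2##).

```python
import math

def entropy(probs):
    """-sum p ln p, with the convention 0 ln 0 = 0."""
    return sum(-p * math.log(p) for p in probs if p > 1e-15)

def pure_state(psi):
    """Density matrix |psi><psi| as nested lists (psi assumed normalized)."""
    psi = [complex(c) for c in psi]
    return [[a * b.conjugate() for b in psi] for a in psi]

def measurement_entropy(rho):
    """Entropy of the outcome probabilities p_k = <k|rho|k> in the given basis."""
    return entropy([rho[k][k].real for k in range(len(rho))])

def von_neumann_entropy_qubit(rho):
    """-Tr(rho ln rho) for a 2x2 density matrix, via its two eigenvalues."""
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    disc = math.sqrt(max(0.0, 1.0 - 4.0 * det))
    return entropy([(1.0 + disc) / 2.0, (1.0 - disc) / 2.0])

c = 1 / math.sqrt(2)
rho_pure = pure_state([c, c])                              # (|0> + |1>)/sqrt(2)
rho_mixed = [[0.5 + 0j, 0.0 + 0j], [0.0 + 0j, 0.5 + 0j]]  # maximally mixed

s_meas_pure = measurement_entropy(rho_pure)
s_vn_pure = von_neumann_entropy_qubit(rho_pure)
s_meas_mixed = measurement_entropy(rho_mixed)
s_vn_mixed = von_neumann_entropy_qubit(rho_mixed)
print(s_meas_pure, s_vn_pure, s_meas_mixed, s_vn_mixed)
```

Both states give the same outcome probabilities in this basis, and thus the same measurement entropy ##\ln 2##, but only the mixed state has a non-zero von Neumann entropy.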

Here the question is what are the states of complete knowledge of the system as a whole, not specified knowledge about the value of an observable. It's important to get this clear, because that's what really distinguishes the notion of state in the classical sense (complete knowledge means to know the always determined values of all possible observables) and of state in the quantum sense, where complete knowledge means that the system is prepared in a pure state.

This is equivalent to having determined values for some complete set of compatible observables. That's what makes QT really distinct from classical physics: The determination of the values of two observables can be impossible, i.e., in general one cannot prepare the system in a way that both observables take determined values. Thus one has to define the compatibility of observables: Two observables ##A## and ##B## are compatible iff, for any value ##a## that ##A## can take (one of the eigenvalues of the operator ##\hat{A}## representing the observable) and any value ##b## that ##B## can take, there's at least one state in which ##A## and ##B## take with certainty the given possible values ##a## and ##b##.

It's much simpler to state the math, but it follows from the above physical definition: Observables are compatible if there's a complete set of common eigenvectors of their corresponding operators, which is equivalent to these operators commuting.
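The commutator criterion is easy to check numerically. Here's a minimal sketch with matrices as nested lists; the helper names and the example matrices (two Pauli matrices and two diagonal matrices) are my own choices for illustration.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(a, b):
    """[a, b] = a b - b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]

def compatible(a, b, tol=1e-12):
    """True if the operators commute (all commutator entries vanish)."""
    return all(abs(x) <= tol for row in commutator(a, b) for x in row)

# Pauli matrices sigma_x and sigma_z: no common eigenbasis.
sigma_x = [[0, 1], [1, 0]]
sigma_z = [[1, 0], [0, -1]]

# Two operators that are diagonal in the same basis.
A = [[1, 0, 0], [0, 1, 0], [0, 0, 2]]
B = [[0, 0, 0], [0, 1, 0], [0, 0, 5]]

print(compatible(sigma_x, sigma_z))  # False
print(compatible(A, B))              # True
```

In the second example ##A## alone has a degenerate eigenvalue, but ##A## and ##B## together label the three basis vectors uniquely, i.e., they form a complete set in the sense used here.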

A complete set of compatible observables is one that determines their common eigenvectors uniquely (up to a phase factor of course).

That's why von Neumann entropy is the right "measure of missing information" for this question of what's the most complete information you can have for a quantum system in general. (I also find the idea stated above, of entropy as a measure for the "surprise" in getting some result from a random experiment, a nice metaphor for getting an intuitive idea about entropy as an information measure.) It turns out that these are the pure states, corresponding to the determination of the values of a complete set of compatible observables when the system is prepared in such a state.

The above example is of course not an exception. It just specifies in more detail, what you are interested in, namely in a given specific observable. If the spectrum is non-degenerate it's a complete set and saying to have prepared the system such that the observable takes a definite value is equivalent to preparing it in the corresponding pure state (i.e., the stat. op. is the projector defined by the eigenvector to this eigenvalue).

If the spectrum is degenerate, just knowing that the observable takes a determined value ##a## is incomplete knowledge. There are then many states describing the situation, and the question is which one is the best guess. Here, the maximum-entropy principle helps: One assigns to the situation the state operator that takes into account the information you have, but no further bias or prejudice.

Now there's an orthonormal set ##|a,\alpha \rangle## of eigenvectors to the eigenvalue ##a## of the operator ##\hat{A}##. Since it's known that the value of ##A## is with certainty ##a##, the probability for finding any other value ##a' \neq a## must vanish, i.e., ##\langle a',\alpha|\hat{\rho}|a',\alpha \rangle=0##. Since this is valid for any vector in the span of the ##|a',\alpha \rangle## with ##a'\neq a##, the statistical operator must be of the form

$$\hat{\rho}=\sum_{\alpha_1,\alpha_2} p_{\alpha_1 \alpha_2} |a,\alpha_1 \rangle \langle a,\alpha_2|.$$

Since the matrix elements with respect to all ##|a',\alpha \rangle## with ##a' \neq a## thus vanish, the eigenvectors of ##\hat{\rho}## with non-zero eigenvalue must lie in the eigenspace ##\text{Eig}(\hat{A},a)## of ##\hat{A}##. Since ##\hat{A} \hat{\rho}=a \hat{\rho}## and ##\hat{\rho} \hat{A}=\hat{\rho} \hat{A}^{\dagger}=a \hat{\rho}##, we have ##[\hat{A},\hat{\rho}]=0##, and thus there's a common set of eigenvectors of ##\hat{A}## and ##\hat{\rho}## that spans ##\text{Eig}(\hat{A},a)##. We can assume that the ##|a,\alpha \rangle## are those common eigenvectors, and thus the stat. op. simplifies to

$$\hat{\rho}=\sum_{\alpha} p_{\alpha} |a,\alpha \rangle \langle a,\alpha|.$$

The entropy

$$S=-\sum_{\alpha} p_{\alpha} \ln p_{\alpha}$$

must be maximized under the constraint ##\sum_{\alpha} p_{\alpha}=1##, i.e., with the Lagrange parameter ##\Omega## for this constraint we find

$$\delta [S+\Omega(\sum_{\alpha} p_\alpha -1)]=0,$$

where now the ##p_{\alpha}## can be varied independently. Thus we have

$$\sum_{\alpha} \delta p_{\alpha} (-\ln p_{\alpha}-1+\Omega)=0$$

leading to

$$\ln p_{\alpha} =\Omega-1=\text{const}.$$

Thus all the ##p_{\alpha}## must be equal, i.e., if the eigenvalue ##a## is ##d_a##-fold degenerate, the normalization ##\sum_{\alpha} p_{\alpha}=1## requires

$$p_{\alpha}=\frac{1}{d_a},$$

and finally

$$\hat{\rho}=\frac{1}{d_a} \sum_{\alpha=1}^{d_a} |a,\alpha \rangle \langle a,\alpha|.$$

For the special case that the eigenvalue is not degenerate, i.e., ##d_a=1##, you are again back at the corresponding pure state as discussed above.
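As a quick numerical check of this result, here's a minimal sketch in plain Python. It assumes we work in the eigenbasis of ##\hat{A}##, so that ##\hat{\rho}## is a diagonal matrix; the example ##\hat{A}=\text{diag}(1,1,2)##, the function names, and the biased comparison distribution are my own choices for illustration.

```python
import math

def entropy(probs):
    """-sum p ln p, with the convention 0 ln 0 = 0."""
    return sum(-p * math.log(p) for p in probs if p > 0.0)

def max_entropy_state(dim, eigenspace_indices):
    """rho = (1/d_a) * projector onto the eigenspace of the eigenvalue a.

    Written in the eigenbasis of A; `eigenspace_indices` lists the basis
    vectors |a, alpha> belonging to the eigenvalue a (hypothetical input
    format for this sketch).
    """
    d_a = len(eigenspace_indices)
    rho = [[0.0] * dim for _ in range(dim)]
    for i in eigenspace_indices:
        rho[i][i] = 1.0 / d_a
    return rho

# A = diag(1, 1, 2): the eigenvalue a = 1 is twofold degenerate.
rho = max_entropy_state(3, [0, 1])
s_uniform = entropy([rho[i][i] for i in range(3)])

# Any other distribution on the same eigenspace has lower entropy.
s_biased = entropy([0.7, 0.3, 0.0])

# Non-degenerate eigenvalue a = 2: back to a pure state with S = 0.
rho_pure = max_entropy_state(3, [2])
s_pure = entropy([rho_pure[i][i] for i in range(3)])

print(s_uniform, s_biased, s_pure)
```

The uniform weights give ##S=\ln d_a=\ln 2##, the biased weights give less, and for ##d_a=1## the entropy vanishes as it must for a pure state.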