DarMM said:
Yes, in Quantum Information theory the Shannon entropy is the entropy of the classical model induced by a context, so it naturally depends on the context. I don't see why this is a problem; it's a property of a context. There are many information-theoretic properties that are context dependent in Quantum Information.
Von Neumann entropy is a separate quantity and a property of the state; it is sometimes called Quantum Entropy, and it equals the minimum Shannon entropy taken over all contexts.
I don't see that what @vanhees71 and I are saying is that different. He's just saying that the von Neumann entropy is the quantum generalization of Shannon entropy. That's correct. Shannon entropy is generalized to the von Neumann entropy, but classical Shannon entropy remains as the entropy of a context.
It's only Peres's use, referring to the entropy of the distribution over densities, that seems nonstandard to me.
Of course the entropy measure depends on the context. That's its strength! It's completely legitimate to define an entropy ##H##, and it's obviously useful in some investigations in quantum informatics, as Peres does. To avoid confusion, I'd not call it Shannon entropy.
Let me try again to make the definition clear (hoping I've understood Peres correctly).
Peres describes the classic gedanken experiment used to introduce mixed states, and thus the general notion of a quantum state in terms of a statistical operator (which imho should be a self-adjoint, positive semidefinite operator with trace 1): Alice (A) prepares particles in pure states ##\hat{P}_n=|u_n \rangle \langle u_n|##, each with probability ##p_n##. The ##|u_n \rangle## are normalized but not necessarily orthogonal to each other. The statistical operator associated with this situation is
$$\hat{\rho}=\sum_n p_n \hat{P}_n.$$
Now Peres defines an entropy by
$$H=-\sum_n p_n \ln p_n.$$
This can be analyzed using Shannon's general scheme: entropy in Shannon's sense is a measure of the missing information, given a probability distribution, relative to what is considered complete information.
Obviously Peres takes the ##p_n## as the probability distribution. This distribution describes precisely A's preparation process: it gives the probability that A prepares the state ##\hat{P}_n##. An observer Bob (B) thus uses ##H## as the entropy measure if he knows that A prepares the specific states ##\hat{P}_n##, each with probability ##p_n##. Now A sends him such a state. For B, complete information would be to know which ##\hat{P}_n## it is, but he knows only the probabilities ##p_n##. That's why B uses ##H## as the measure of his missing information.
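To make this concrete, here is a short numerical sketch (my own illustration, not from Peres; the specific states ##|u_n\rangle## and probabilities ##p_n## are assumptions): Alice prepares the non-orthogonal qubit states ##|0\rangle## and ##|+\rangle## with equal probability, which fixes both ##\hat{\rho}## and Peres's ##H##.

```python
import numpy as np

# Alice's preparation: non-orthogonal pure qubit states |u_n> with probabilities p_n.
# (These specific states and probabilities are assumptions for illustration.)
u1 = np.array([1.0, 0.0])                # |0>
u2 = np.array([1.0, 1.0]) / np.sqrt(2)   # |+>, not orthogonal to |0>
p = np.array([0.5, 0.5])

# Statistical operator rho = sum_n p_n |u_n><u_n|
rho = sum(pn * np.outer(u, u.conj()) for pn, u in zip(p, (u1, u2)))

# Peres's entropy of the preparation distribution: H = -sum_n p_n ln p_n
H = -np.sum(p * np.log(p))

print("rho =\n", rho)
print("Tr rho =", np.trace(rho).real)    # should be 1
print("H      =", H)                     # ln 2 ~ 0.693 for p = (1/2, 1/2)
```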
Now the mixed state ##\hat{\rho}## defined above describes something different: it provides the probability distributions for all possible measurements on the system. Complete information in QT means that we measure precisely (in the old von Neumann sense) a complete set of compatible observables ##O_k##, represented by self-adjoint operators ##\hat{O}_k## with orthonormalized common eigenvectors ##|\{o_k \} \rangle##. If we are moreover able to filter the systems according to the outcome of this measurement, we have prepared the system as completely as one can according to QT, namely in the pure state ##\hat{\rho}(\{o_k \})=|\{o_k \} \rangle \langle \{o_k \}|##.
The probabilities for the outcome of such a complete measurement are
$$p(\{o_k \})=\langle \{o_k \} |\hat{\rho}|\{o_k \} \rangle.$$
Relative to this definition of "complete knowledge", and given A's state preparation described by ##\hat{\rho}##, B associates with this situation the entropy
$$S=-\sum_{\{o_k \}} p(\{o_k \}) \ln p(\{o_k \}).$$
Now, if B chooses the complete set of compatible observables whose common eigenvectors diagonalize ##\hat{\rho}##, the probabilities ##p(\{o_k \})## are just the eigenvalues of ##\hat{\rho}##, and this entropy becomes
$$S=-\mathrm{Tr} (\hat{\rho} \ln \hat{\rho}).$$
For any other choice of complete measurement the resulting entropy is at least as large, so this ##S## is the minimum over all such choices, in accordance with the characterization of the von Neumann entropy quoted above.
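A minimal numerical check of this (my own sketch; the qubit state, the specific measurement bases, and the random sampling are assumptions chosen for illustration): the entropy of the outcome distribution depends on the chosen context, and the eigenbasis of ##\hat{\rho}## gives the smallest value, equal to ##-\mathrm{Tr}(\hat{\rho}\ln\hat{\rho})##.

```python
import numpy as np

rng = np.random.default_rng(0)

def shannon(p, eps=1e-12):
    """-sum p ln p (natural log), ignoring numerically zero probabilities."""
    p = np.asarray(p, dtype=float)
    p = p[p > eps]
    return float(-np.sum(p * np.log(p)))

def outcome_entropy(rho, basis):
    """Shannon entropy of p(o_k) = <o_k|rho|o_k>, with the basis vectors |o_k> as columns."""
    p = np.real(np.diag(basis.conj().T @ rho @ basis))
    return shannon(p)

# rho from the preparation sketch above: 0.5 |0><0| + 0.5 |+><+|
rho = np.array([[0.75, 0.25],
                [0.25, 0.25]])

# One particular context: the sigma_x eigenbasis {|+>, |->} (an assumption).
sx_basis = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)
print("S in sigma_x basis  :", outcome_entropy(rho, sx_basis))

# In the eigenbasis of rho the outcome probabilities are the eigenvalues,
# so the entropy equals -Tr(rho ln rho), i.e. the von Neumann entropy.
evals, evecs = np.linalg.eigh(rho)
print("S in eigenbasis     :", outcome_entropy(rho, evecs))
print("-Tr(rho ln rho)     :", shannon(evals))

# Sampling random orthonormal bases (QR of complex Gaussian matrices) suggests
# that no context gives a smaller entropy than the eigenbasis.
samples = []
for _ in range(2000):
    z = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    q, _ = np.linalg.qr(z)
    samples.append(outcome_entropy(rho, q))
print("minimum over samples:", min(samples))
```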
This is the usual definition of the Shannon-Jaynes entropy in quantum theory, and it's identical with von Neumann's definition via this trace. There's no contradiction of any kind between ##H## and ##S##; they are just entropies in Shannon's information-theoretical sense, referring to different information about the same preparation procedure.
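To see concretely that ##H## and ##S## refer to different information, here is one more sketch (again my own, with assumed preparations): for non-orthogonal ##|u_n\rangle## the preparation entropy ##H## is strictly larger than ##S=-\mathrm{Tr}(\hat{\rho}\ln\hat{\rho})##, while for mutually orthogonal ##|u_n\rangle## the two coincide.

```python
import numpy as np

def shannon(p, eps=1e-12):
    p = np.asarray(p, dtype=float)
    p = p[p > eps]
    return float(-np.sum(p * np.log(p)))

def preparation_entropies(states, probs):
    """Return (H, S) for a preparation of pure states |u_n> with probabilities p_n."""
    rho = sum(pn * np.outer(u, u.conj()) for pn, u in zip(probs, states))
    H = shannon(probs)                    # entropy of the preparation distribution
    S = shannon(np.linalg.eigvalsh(rho))  # von Neumann entropy -Tr(rho ln rho)
    return H, S

p = [0.5, 0.5]

# Non-orthogonal preparation: |0> and |+>  ->  H > S
nonorth = [np.array([1.0, 0.0]), np.array([1.0, 1.0]) / np.sqrt(2)]
print("non-orthogonal:", preparation_entropies(nonorth, p))

# Orthogonal preparation: |0> and |1>     ->  H = S
orth = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print("orthogonal    :", preparation_entropies(orth, p))
```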
One has to keep in mind which "sense of knowledge" the entropy refers to, and then no confusion can occur. As I said before, I'd not call ##H## the Shannon entropy, to avoid confusion, but it's fine as a short name for what Peres clearly defines.