Hm, it's a bit unfortunate that this thread is tagged at B-level; it's nearly impossible to answer the question at that level, so let me give an answer at I-level.
Within a modern approach, entropy is understood as a measure of the missing information, given a probability distribution.
It turns out that, for a random experiment with a discrete set of ##N## equally probable outcomes, i.e., ##P_i=1/N## (like throwing a fair die, where the probability for each outcome is ##1/6##), this measure is
$$S=k_{\text{B}} \ln N.$$
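For the fair-die example this gives, before the throw,
$$S=k_{\text{B}} \ln 6 \simeq 1.79 \, k_{\text{B}}.$$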
Starting from this result, following Shannon you can derive the more general formula: for given probabilities ##P_i##,
$$S=-k_{\text{B}} \sum_i P_i \ln(P_i).$$
It is understood here that for ##P_i=0## one has to put ##P_i \ln(P_i)=0##, which is consistent, since ##\lim_{P \to 0^+} P \ln P=0##.
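As a quick consistency check, for equal probabilities ##P_i=1/N## this general formula reduces to the previous result:
$$S=-k_{\text{B}} \sum_{i=1}^{N} \frac{1}{N} \ln \frac{1}{N}=-k_{\text{B}} \ln \frac{1}{N}=k_{\text{B}} \ln N.$$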
In quantum mechanics the probability distributions are given by the statistical operator ##\hat{\rho}##, which represents the state, and in terms of it the formula reads (first established by von Neumann)
$$S=-k_{\text{B}} \mathrm{Tr}(\hat{\rho} \ln \hat{\rho}).$$
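To see that this contains the Shannon formula, evaluate the trace in the eigenbasis of ##\hat{\rho}##, i.e., with ##\hat{\rho} |p_i \rangle=p_i |p_i \rangle##, where ##p_i \geq 0## and ##\sum_i p_i=1##. Then
$$S=-k_{\text{B}} \sum_i \langle p_i|\hat{\rho} \ln \hat{\rho}|p_i \rangle=-k_{\text{B}} \sum_i p_i \ln p_i,$$
i.e., the eigenvalues of the statistical operator play the role of the probabilities ##P_i##.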
For the special case that the system is prepared in a pure state, one has ##\hat{\rho}=|\Psi \rangle \langle \Psi|##. Now you can evaluate the trace with a complete orthonormal set ##|u_i \rangle## containing ##|u_1 \rangle=|\Psi \rangle## as a member; since ##\hat{\rho}## has the single non-zero eigenvalue ##1## (for ##|u_1 \rangle##) and eigenvalue ##0## for all the other ##|u_i \rangle##, the trace becomes
$$S=-k_{\text{B}} \left (1 \ln 1 + 0 + 0 + \cdots \right)=0,$$
which tells you that knowing the system to be prepared in a pure state means complete knowledge, i.e., no missing information, as it should be in QT.
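For contrast, consider as an example the maximally mixed state of a two-level system, ##\hat{\rho}=\frac{1}{2} \hat{1}##, i.e., both eigenvalues are ##1/2##. Then
$$S=-k_{\text{B}} \left(\frac{1}{2} \ln \frac{1}{2}+\frac{1}{2} \ln \frac{1}{2} \right)=k_{\text{B}} \ln 2,$$
which is the maximal entropy for a two-dimensional Hilbert space, reflecting that here the missing information is as large as it can be.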
For more details about the information-theoretical approach to statistical physics, see the book by Jochen Rau I've recommended before. I also have a short manuscript using this concept:
https://th.physik.uni-frankfurt.de/~hees/publ/stat.pdf