I am reading a book called 'Quantum Processes, Systems, and Information', and at the beginning a basic idea of information is set out in the following way. If information is coded in 'bits', each of which has two possible states, then the number of different messages ('values') that can be encoded in an n-bit system is 2^n. If it is known that a message (to be received in the future) has M possible values, then the entropy is defined as H = log(M), where the logarithm is base 2 (because it's a binary system). Effectively, the entropy of a message with M possible values tells you the minimum number of bits required to represent the message. However, this assumes that each possible value of the message is equally likely.

It is later discussed that if a message with 2 possible values is being read and the two values are not equally probable, then the entropy is defined in a different way. If, say, the first value (0) has a probability of 1/3 and the second (1) has a probability of 2/3, then the second value is 'split' into two values (say, 1a and 1b), each having a probability of 1/3. The entropy is then defined in the following way: if H(M) is the entropy of message M and P(x) is the probability of value x, then H(0,1a,1b) = H(0,1) + P(1)H(1a,1b).

Does anybody have an intuitive understanding of this definition of entropy for the second case (with unequal probabilities for each value of the message) that they could explain to me? Thanks.
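
For reference, here is a quick numerical check of the relation H(0,1a,1b) = H(0,1) + P(1)H(1a,1b) as I understand it. It is only a sketch: the names `shannon_entropy`, `h_three`, `h_split`, `h_unequal` and `h_formula` are my own, and the comparison formula H = -sum P(x) log2 P(x) is an assumption on my part (it is the usual Shannon formula, not something quoted from the passage above). The left-hand side and H(1a,1b) both involve equally likely values, so they can be computed as log2(M), and the relation can then be solved for H(0,1).

```python
from math import log2

def shannon_entropy(probs):
    """Entropy in bits, -sum p*log2(p); assumed formula, used only for comparison."""
    return -sum(p * log2(p) for p in probs if p > 0)

# Three equally likely values 0, 1a, 1b: entropy is log2(3).
h_three = log2(3)

# Two equally likely sub-values 1a, 1b: entropy is log2(2) = 1 bit.
h_split = log2(2)

# Probability of value 1 before it is split into 1a and 1b.
p1 = 2 / 3

# Solve H(0,1a,1b) = H(0,1) + P(1) * H(1a,1b) for H(0,1).
h_unequal = h_three - p1 * h_split

# Entropy of the unequal two-value message from the assumed formula.
h_formula = shannon_entropy([1 / 3, 2 / 3])

print(f"from the splitting relation: {h_unequal:.6f} bits")
print(f"from -sum p*log2(p):         {h_formula:.6f} bits")
```

Both numbers come out at roughly 0.918 bits, which is less than the 1 bit of a two-value message with equal probabilities.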