Why do we use log2(1/p(xi)) in Shannon's entropy definition?

In summary, information entropy measures the average uncertainty associated with a probability distribution. It is given by ##H(X)=E[I(X)]=-\sum_{s\in S} p(X=s) \cdot \log_2(p(X=s))##, where the information ##I## of each outcome is the logarithm of the inverse of its probability. This measure was defined by Shannon and is based on the idea that less probable events carry more information.
  • #1
dervast
Hi to everyone, I was reading today Wikipedia's article about information entropy.
I need some help to understand why, in the
http://upload.wikimedia.org/math/6/a/3/6a33010c16b1d526bc5daee924e3d363.png
entropy of an event, we use ##\log_2\left(\frac{1}{p(x_i)}\right)##.
I have read in the article that
"An intuitive understanding of information entropy relates to the amount of uncertainty about an event associated with a given probability distribution."
Then why is the first part of the equation, ##\sum_i p(x_i)##, not enough on its own? ##p(x_i)## already denotes the amount of uncertainty of an event.
 
  • #2
The entropy is an expectation value, namely ##H(X)=E[I(X)]=-\sum_{s\in S} p(X=s) \cdot \log_2(p(X=s))##; the weighted sum of the informations of all outcomes. The information of an event ##s## over a binary alphabet is defined as ##I(s)=\log_2 \left( \frac{1}{p(X=s)} \right)##. The rest follows from this.

There are other possible measures of information, but the one above is the one Shannon defined and considered.

The logarithm is basically a length, and ##\frac{1}{p(X=s)}## the statistical relevance: an event which occurs for sure doesn't carry any information, but the less probable an event is, the more information it carries. Imagine an encoded three-letter word. An 'x' in the middle position carries more information about this word than an 'e' does: there are only a few words like 'axe', but many like 'bet', 'get', 'jet', 'let', 'set', 'men', etc.
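
To make the "weighted sum of informations" idea concrete, here is a minimal Python sketch (not from the thread; the function names are just illustrative) that computes the self-information of single events and the entropy of a distribution:

```python
import math

def information_bits(p):
    """Self-information log2(1/p) of an event with probability p, in bits."""
    return math.log2(1 / p)

def entropy_bits(probabilities):
    """Shannon entropy: the probability-weighted sum of self-informations, in bits."""
    return sum(p * information_bits(p) for p in probabilities if p > 0)

# A sure event carries no information; rarer events carry more.
print(information_bits(1.0))     # 0.0 bits
print(information_bits(0.5))     # 1.0 bit
print(information_bits(1 / 32))  # 5.0 bits

# Entropy of a fair coin vs. a heavily biased coin.
print(entropy_bits([0.5, 0.5]))  # 1.0 bit
print(entropy_bits([0.9, 0.1]))  # ~0.469 bits
```

The biased coin has lower entropy because its outcome is more predictable on average, even though its rare outcome individually carries more information.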
 

What is Shannon's entropy definition?

Shannon's entropy, also known as Shannon information or information entropy, is a measure of the amount of uncertainty or randomness in a system. It was first proposed by the mathematician Claude Shannon in 1948.

How is Shannon's entropy calculated?

Shannon's entropy is calculated using the formula ##H = -\sum_x p(x)\log_2 p(x)##, where ##p(x)## is the probability of a particular event occurring. This formula takes into account the probability of each event and the amount of information each event carries.
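
As a quick worked example: for a fair coin, ##H = -(0.5\log_2 0.5 + 0.5\log_2 0.5) = 1## bit, while for a coin that lands heads with probability 0.9, ##H = -(0.9\log_2 0.9 + 0.1\log_2 0.1) \approx 0.47## bits, so the more predictable coin has lower entropy.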

What is the unit of measurement for Shannon's entropy?

Shannon's entropy is typically measured in bits, which represents the amount of information needed to represent one of two equally likely outcomes. However, it can also be measured in nats, which is the unit of measurement for natural logarithms.
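
Since ##\ln x = \ln(2)\,\log_2 x##, entropy in nats is obtained from entropy in bits by multiplying by ##\ln 2##: ##H_{\text{nats}} = \ln(2)\,H_{\text{bits}} \approx 0.693\,H_{\text{bits}}##. A fair coin, for example, has entropy of 1 bit, or about 0.693 nats.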

What is the significance of Shannon's entropy in information theory?

Shannon's entropy is a fundamental concept in information theory, which is the study of how information is produced, transmitted, and received. It provides a way to quantify the uncertainty or randomness in a system and is used in various fields such as computer science, statistics, and physics.

How is Shannon's entropy used in data compression?

In data compression, Shannon's entropy is used to estimate the minimum number of bits needed to represent a message. This helps determine the efficiency of compression algorithms and the amount of data that can be transmitted over a communication channel without errors.
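
As an illustration of that lower bound, here is a small Python sketch (an assumption for this example: the characters are treated as independent symbols drawn from their empirical distribution, and the helper name is made up) that estimates the entropy of a message's character distribution and compares it to a plain 8-bit encoding:

```python
import math
from collections import Counter

def entropy_bits_per_symbol(text):
    """Empirical Shannon entropy of the character distribution, in bits per symbol."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

message = "abracadabra"
h = entropy_bits_per_symbol(message)
print(f"{h:.2f} bits/symbol")  # ~2.04 bits/symbol
# Entropy gives a lower bound of ~22 bits for this message under the i.i.d. assumption,
# versus 88 bits in a plain 8-bit-per-character encoding.
print(f"{h * len(message):.1f} bits vs. {8 * len(message)} bits")
```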
