Why do we use log2(1/p(xi)) in Shannon's entropy definition?

In summary, information entropy measures the average uncertainty associated with a probability distribution. It is given by ##H(X)=E[I(X)]=-\sum_{s\in S} p(X=s) \cdot \log_2(p(X=s))##, where the information ##I## of each outcome is the logarithm of the inverse of its probability. This measure was defined by Shannon and is based on the idea that less probable events carry more information.
  • #1
dervast
Hi to everyone, I was reading today Wikipedia's article about information entropy.
I need some help to understand why, in the
http://upload.wikimedia.org/math/6/a/3/6a33010c16b1d526bc5daee924e3d363.png
entropy of an event, we use ##\log_2\left(\frac{1}{p(x_i)}\right)##.
I have read in the article that
"An intuitive understanding of information entropy relates to the amount of uncertainty about an event associated with a given probability distribution."
Then why is the first part of the equation, ##\sum_i p(x_i)##, not enough on its own? ##p(x_i)## already denotes the amount of uncertainty of an event.
 
  • #2
The entropy is an expectation value, namely ##H(X)=E[I(X)]=-\sum_{s\in S} p(X=s) \cdot \log_2(p(X=s))##; the weighted sum of the informations of all outcomes. The information of an event ##s## over a binary alphabet is defined as ##I(s)=\log_2 \left( \frac{1}{p(X=s)} \right)##. The rest follows from this.

There are other possible measures of information, but the one above is the one Shannon defined and considered.

The logarithm is basically a length, and ##\frac{1}{p(X=s)}## the statistical relevance: an event which occurs for sure doesn't carry any information, but the less probable an event is, the more information it carries. Imagine an encoded three-letter word. An 'x' in the middle position carries more information about this word than an 'e' does: there are only a few words like 'axe', but many like 'bet', 'get', 'jet', 'let', 'set', 'men', etc.
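
To make the "weighted sum of informations" idea concrete, here is a minimal Python sketch (not from the thread; the function names are just illustrative) that computes the self-information of single events and the entropy of a distribution:

```python
import math

def information_bits(p):
    """Self-information log2(1/p) of an event with probability p, in bits."""
    return math.log2(1 / p)

def entropy_bits(probabilities):
    """Shannon entropy: the probability-weighted sum of self-informations, in bits."""
    return sum(p * information_bits(p) for p in probabilities if p > 0)

# A sure event carries no information; rarer events carry more.
print(information_bits(1.0))     # 0.0 bits
print(information_bits(0.5))     # 1.0 bit
print(information_bits(1 / 32))  # 5.0 bits

# Entropy of a fair coin vs. a heavily biased coin.
print(entropy_bits([0.5, 0.5]))  # 1.0 bit
print(entropy_bits([0.9, 0.1]))  # ~0.469 bits
```

The biased coin has lower entropy because its outcome is more predictable on average, even though its rare outcome individually carries more information.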
 

What is Shannon's entropy definition?

Shannon's entropy, also known as Shannon information or information entropy, is a measure of the amount of uncertainty or randomness in a system. It was first proposed by the mathematician Claude Shannon in 1948.

How is Shannon's entropy calculated?

Shannon's entropy is calculated using the formula ##H = -\sum_x p(x)\log_2 p(x)##, where ##p(x)## is the probability of a particular event occurring. This formula takes into account the probability of each event and the amount of information each event carries.
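
As a quick worked example: for a fair coin, ##H = -(0.5\log_2 0.5 + 0.5\log_2 0.5) = 1## bit, while for a coin that lands heads with probability 0.9, ##H = -(0.9\log_2 0.9 + 0.1\log_2 0.1) \approx 0.47## bits, so the more predictable coin has lower entropy.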

What is the unit of measurement for Shannon's entropy?

Shannon's entropy is typically measured in bits, which represents the amount of information needed to represent one of two equally likely outcomes. However, it can also be measured in nats, which is the unit of measurement for natural logarithms.
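
Since ##\ln x = \ln(2)\,\log_2 x##, entropy in nats is obtained from entropy in bits by multiplying by ##\ln 2##: ##H_{\text{nats}} = \ln(2)\,H_{\text{bits}} \approx 0.693\,H_{\text{bits}}##. A fair coin, for example, has entropy of 1 bit, or about 0.693 nats.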

What is the significance of Shannon's entropy in information theory?

Shannon's entropy is a fundamental concept in information theory, which is the study of how information is produced, transmitted, and received. It provides a way to quantify the uncertainty or randomness in a system and is used in various fields such as computer science, statistics, and physics.

How is Shannon's entropy used in data compression?

In data compression, Shannon's entropy is used to estimate the minimum number of bits needed to represent a message. This helps determine the efficiency of compression algorithms and the amount of data that can be transmitted over a communication channel without errors.
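
As an illustration of that lower bound, here is a small Python sketch (an assumption for this example: the characters are treated as independent symbols drawn from their empirical distribution, and the helper name is made up) that estimates the entropy of a message's character distribution and compares it to a plain 8-bit encoding:

```python
import math
from collections import Counter

def entropy_bits_per_symbol(text):
    """Empirical Shannon entropy of the character distribution, in bits per symbol."""
    counts = Counter(text)
    n = len(text)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

message = "abracadabra"
h = entropy_bits_per_symbol(message)
print(f"{h:.2f} bits/symbol")  # ~2.04 bits/symbol
# Entropy gives a lower bound of ~22 bits for this message under the i.i.d. assumption,
# versus 88 bits in a plain 8-bit-per-character encoding.
print(f"{h * len(message):.1f} bits vs. {8 * len(message)} bits")
```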
