What is the Shannon Entropy of the String 'QXZ'?

In summary, Shannon entropy measures the uncertainty of a probability distribution, not of a particular string. The letters Q, X, and Z each have high surprisal because they are rare in English, but the entropy formula applies to a whole distribution of outcomes, not to a single realization such as "QXZ". For definite strings of letters, Kolmogorov complexity is the more appropriate measure.
  • #1
Karl Coryat

Hello everyone. I am trying to determine the Shannon entropy of the string of letters QXZ, taking into consideration those letters' frequency in English. I am using the formula:

[tex]H(P) = -\sum_i p_i \log_2 p_i[/tex]

What's puzzling me is that I am expecting to calculate a high entropy, since QXZ represents an unexpected string in the context of English letter frequencies -- but the leading [itex]p_i[/itex] factor in each term, which takes very small values (e.g., 0.0008606 for Q), is greatly diminishing my calculations. I am obviously making a wrong assumption here or applying something incorrectly, because as I understand it, letters with high surprisal should increase the entropy of the string, not reduce it.

Thank you in advance for your generous help.
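To see where the small numbers come from, here is a minimal Python sketch (the letter frequencies are illustrative approximations, and treating the letters as independent is an assumption) that separates the surprisal of each letter from its [itex]p_i \log_2 p_i[/itex] term in the entropy formula:

[code]
import math

# Approximate English letter frequencies (illustrative values only).
p = {"Q": 0.0008606, "X": 0.0015, "Z": 0.0007}

# Surprisal of each letter, -log2(p): rare letters are very surprising.
for letter, prob in p.items():
    print(f"surprisal({letter}) = {-math.log2(prob):.2f} bits")

# Total information content of "QXZ", assuming independent letters.
print(f"info('QXZ') = {-sum(math.log2(q) for q in p.values()):.2f} bits")

# The p*log2(p) term from the entropy formula is tiny for rare letters:
# it is each letter's *average* contribution to the alphabet's entropy,
# weighted by how rarely the letter actually occurs.
for letter, prob in p.items():
    print(f"-p*log2(p) for {letter} = {-prob * math.log2(prob):.5f} bits")
[/code]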
 
  • #2
Shannon entropy is defined for a probability distribution. You are apparently making some sort of assumption about the probability of a string of letters and trying to apply the formula for Shannon entropy to the probability of that string occurring. Shannon entropy can be computed for the probability distribution over all 3-letter strings (i.e., it applies to a set of probabilities that sum to 1.0). It doesn't apply to one realization of a 3-letter string drawn from that distribution.

Perhaps you should try Kolmogorov complexity if you want to deal with definite strings of letters.
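As a rough sketch of that distinction (using a hypothetical 4-letter alphabet so the sum over all 3-letter strings stays small, and assuming independent letters):

[code]
import math
from itertools import product

# Hypothetical letter distribution; the probabilities sum to 1.0.
p = {"E": 0.5, "T": 0.3, "Q": 0.15, "Z": 0.05}

# Entropy of the distribution over ALL 3-letter strings: the sum runs
# over every possible string, not over one realization like "QXZ".
H3 = -sum(
    p[a] * p[b] * p[c] * math.log2(p[a] * p[b] * p[c])
    for a, b, c in product(p, repeat=3)
)

# For independent letters this equals 3 times the single-letter entropy.
H1 = -sum(q * math.log2(q) for q in p.values())
print(f"H(all 3-letter strings) = {H3:.4f} bits")
print(f"3 * H(single letter)    = {3 * H1:.4f} bits")
[/code]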
 
  • #3
Karl,

I don't know much about information theory, but I think the Shannon information content of "Q" in English text is simply [tex]- \log_2(P(Q))[/tex]. The formula you quote for H(P) is the entropy of an "ensemble" (or distribution), e.g. the entropy of a randomly selected letter in English text.

Reference: "Information Theory, Inference and Learning Algorithms" by MacKay (which is available for free download) http://www.inference.phy.cam.ac.uk/itila/
 

1. What is the Shannon entropy of QXZ?

The Shannon entropy of QXZ is a measure of the uncertainty or randomness in a system with three possible outcomes, namely Q, X, and Z, whose probabilities sum to 1. It is named after Claude Shannon, who developed the concept in information theory.

2. How is the Shannon entropy of QXZ calculated?

The Shannon entropy of QXZ is calculated using the formula [tex]H(QXZ) = -P(Q)\log_2 P(Q) - P(X)\log_2 P(X) - P(Z)\log_2 P(Z)[/tex], where P(Q), P(X), and P(Z) are the probabilities of the outcomes Q, X, and Z respectively, and must sum to 1.
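For example, taking the three outcomes (purely for illustration) to be equally likely, [tex]P(Q) = P(X) = P(Z) = \tfrac{1}{3}[/tex], the formula gives

[tex]H(QXZ) = -3 \cdot \tfrac{1}{3}\log_2\tfrac{1}{3} = \log_2 3 \approx 1.585 \text{ bits},[/tex]

which is the maximum possible entropy for a three-outcome distribution.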

3. What does the value of the Shannon entropy of QXZ represent?

The value of Shannon entropy of QXZ represents the amount of uncertainty or information contained in a system with three possible outcomes. A higher entropy value indicates higher uncertainty and vice versa.

4. What is the relationship between the Shannon entropy of QXZ and information gain?

The Shannon entropy of QXZ is used to calculate information gain, a measure of how much a particular feature reduces the uncertainty in a system. The higher the information gain, the more significant the feature is in predicting the outcome.
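A minimal sketch of information gain (all class labels hypothetical): the entropy of a parent set minus the weighted entropy of the subsets produced by a split:

[code]
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, children):
    """Entropy reduction achieved by splitting `parent` into `children`."""
    n = len(parent)
    weighted = sum(len(ch) / n * entropy(ch) for ch in children)
    return entropy(parent) - weighted

# Hypothetical class labels before and after a candidate split.
parent = ["Q", "Q", "X", "X", "Z", "Z"]
children = [["Q", "Q", "X"], ["X", "Z", "Z"]]
print(f"gain = {information_gain(parent, children):.4f} bits")
[/code]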

5. How is the Shannon entropy of QXZ used in machine learning?

In machine learning, Shannon entropy is used as a measure of the impurity or randomness in a dataset. Decision tree algorithms use it to determine the best feature to split the data on, in order to reduce the entropy and increase the information gain.
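In the same spirit, a self-contained sketch (with a made-up dataset) of how a decision tree uses entropy to pick the feature with the highest information gain:

[code]
import math
from collections import Counter

def entropy(labels):
    """Impurity of a set of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

# Hypothetical dataset: (feature1, feature2, class label).
rows = [("a", "x", "Q"), ("a", "y", "X"), ("b", "x", "Q"),
        ("b", "y", "Z"), ("a", "x", "Q"), ("b", "y", "Z")]
labels = [r[2] for r in rows]

def gain(col):
    """Information gain from splitting the rows on one feature column."""
    n = len(rows)
    weighted = 0.0
    for v in set(r[col] for r in rows):
        subset = [r[2] for r in rows if r[col] == v]
        weighted += len(subset) / n * entropy(subset)
    return entropy(labels) - weighted

# A decision tree splits on the feature that reduces entropy the most.
best = max([0, 1], key=gain)
print(f"split on feature {best + 1} (gain = {gain(best):.4f} bits)")
[/code]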
