What is the Shannon Entropy of the String 'QXZ'?

  • Context: Graduate
  • Thread starter: Karl Coryat
  • Tags: Entropy, Shannon entropy
SUMMARY

The Shannon entropy of the single string "QXZ" is not something the standard formula H(P) = –Σ pᵢ log₂ pᵢ computes: Shannon entropy is defined for a probability distribution, here the distribution over all possible three-letter strings, not for one realization drawn from it. The confusion in the thread arises from conflating entropy with surprisal: the quantity that grows for unlikely letters is the surprisal –log₂ pᵢ, whereas each entropy term –pᵢ log₂ pᵢ is weighted by the small probability pᵢ. For analyzing specific strings, Kolmogorov complexity is suggested as a more suitable measure.

PREREQUISITES
  • Understanding of Shannon entropy and its formula H(P) = –Σ pᵢ log₂ pᵢ
  • Familiarity with probability distributions and their properties
  • Basic knowledge of information theory concepts
  • Awareness of Kolmogorov complexity as an alternative measure
NEXT STEPS
  • Study the application of Shannon entropy in probability distributions
  • Explore Kolmogorov complexity and its implications for string analysis
  • Read "Information Theory, Inference and Learning Algorithms" by MacKay for deeper insights
  • Investigate the frequency distribution of letters in English text
USEFUL FOR

Students and professionals in information theory, data scientists analyzing string data, and anyone interested in the mathematical foundations of entropy and complexity.

Karl Coryat
Shannon entropy of "QXZ"

Hello everyone. I am trying to determine the Shannon entropy of the string of letters QXZ, taking into consideration those letters' frequency in English. I am using the formula:

[tex]H(P) = -\sum_i p_i \log_2 p_i[/tex]

What's puzzling me is that I expected to calculate a high entropy, since QXZ is an unexpected string in the context of English letter frequencies -- but the leading [tex]p_i[/tex] factor in each term, which takes very small values (e.g., 0.0008606 for Q), is greatly diminishing my result. I am obviously making a wrong assumption or misapplying something, because as I understand it, letters with high surprisal should increase the entropy of the string, not reduce it.

Concretely, here is the calculation I am doing, as a small Python sketch (the Q value is from the frequency table I am using; the X and Z values are rough placeholder figures):
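[code]
import math

# Approximate English letter frequencies; only the Q value is from the
# table quoted above, X and Z are rough placeholder figures.
p = {"Q": 0.0008606, "X": 0.0015, "Z": 0.00074}

# Summing -p_i * log2(p_i) over just the three letters in the string:
# each term is weighted by the tiny p_i, so the total comes out small.
H = -sum(p[c] * math.log2(p[c]) for c in "QXZ")
print(f"{H:.4f} bits")  # ~0.03 bits -- the puzzlingly small result
[/code]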

Thank you in advance for your generous help.
 
Shannon entropy is defined for a probability distribution. You are apparently assigning a probability to one particular string of letters and trying to apply the formula for Shannon entropy to the probability of that string occurring. Shannon entropy can be computed for the probability distribution over all 3-letter strings (i.e., it applies to a set of probabilities that sum to 1.0); it does not apply to a single realization of a 3-letter string drawn from that distribution.

As a minimal sketch (Python; the letter-frequency table is rounded and only approximates real English), here is the entropy of that distribution, assuming the three letters are drawn independently:
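[code]
import math

# Rounded approximate English letter frequencies (they sum to ~1).
freq = {
    'E': .127, 'T': .091, 'A': .082, 'O': .075, 'I': .070, 'N': .067,
    'S': .063, 'H': .061, 'R': .060, 'D': .043, 'L': .040, 'C': .028,
    'U': .028, 'M': .024, 'W': .024, 'F': .022, 'G': .020, 'Y': .020,
    'P': .019, 'B': .015, 'V': .0098, 'K': .0077, 'J': .0015,
    'X': .0015, 'Q': .00095, 'Z': .00074,
}

# Entropy of ONE letter drawn from this distribution:
H_letter = -sum(p * math.log2(p) for p in freq.values())

# Under independence, the entropy of the whole 3-letter ensemble is
# just 3 * H_letter -- a property of the distribution, not of "QXZ".
print(f"H(letter)   ~ {H_letter:.2f} bits")      # ~4.2 bits
print(f"H(3-string) ~ {3 * H_letter:.2f} bits")  # ~12.5 bits
[/code]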

Perhaps you should try Kolmogorov complexity if you want to deal with definite strings of letters.
 
Karl,

I don't know much about information theory, but I think the Shannon information content of "Q" in English text is simply [tex]-\log_2 P(Q)[/tex]. The formula you quote for H(P) gives the entropy of an "ensemble" (or distribution), e.g. the entropy of a randomly selected letter in English text.

For example (a small Python sketch; the letter probabilities are rough assumed values from a standard frequency table), the information content of each letter, and of the whole string if you treat the letters as independent, comes out large, as you expected:
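[code]
import math

# Rough assumed letter probabilities (any standard English frequency
# table will give similar values):
p = {"Q": 0.00095, "X": 0.0015, "Z": 0.00074}

# Shannon information content (surprisal) of each letter: -log2(P(c))
for c in "QXZ":
    print(f"h({c}) = {-math.log2(p[c]):.2f} bits")

# Treating the letters as independent, surprisals add, so the whole
# string carries roughly 30 bits -- the large number you expected.
total = -sum(math.log2(p[c]) for c in "QXZ")
print(f"h(QXZ) = {total:.2f} bits")
[/code]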

Reference: "Information Theory, Inference and Learning Algorithms" by MacKay (which is available for free download) http://www.inference.phy.cam.ac.uk/itila/
 
