What is the Shannon Entropy of the String 'QXZ'?

Karl Coryat

Hello everyone. I am trying to determine the Shannon entropy of the string of letters QXZ, taking into consideration those letters' frequency in English. I am using the formula:

H(P) = -\sum_i p_i \log_2 p_i

What's puzzling me is that I expected to calculate a high entropy, since QXZ is an unexpected string given English letter frequencies -- but the p_i factor in each term, which takes very small values (e.g., 0.0008606 for Q), greatly diminishes my result. I am obviously making a wrong assumption or applying something incorrectly, because as I understand it, letters with high surprisal should increase the entropy of the string, not reduce it.
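To make the puzzle concrete, here is a sketch of the term-by-term computation in Python (the frequencies for X and Z are assumed approximate values; only Q's matches the figure above):

```python
import math

# Approximate English letter frequencies.
# Assumed values -- only Q's matches the 0.0008606 quoted above.
p = {"Q": 0.0008606, "X": 0.0015, "Z": 0.00074}

# Each term of H(P) = -sum_i p_i log2 p_i, for just these three letters:
for letter, pi in p.items():
    print(f"{letter}: -p*log2(p) = {-pi * math.log2(pi):.6f}")

# The sum over Q, X, Z is tiny, because the small p_i factor shrinks
# each term -- the effect described above.
total = sum(-pi * math.log2(pi) for pi in p.values())
print(f"sum = {total:.6f}")
```

Each term comes out well under 0.02 bits, so the sum is small even though each individual letter is rare.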

Thank you in advance for your generous help.
 
Shannon entropy is defined for a probability distribution. You are apparently making some assumption about the probability of a particular string of letters and trying to apply the entropy formula to that single probability. Shannon entropy can be computed for the probability distribution over all 3-letter strings (i.e., it applies to a set of probabilities that sum to 1.0); it does not apply to one realization of a 3-letter string drawn from that distribution.
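To illustrate the distinction, here is a sketch with a toy 4-symbol alphabet (the probabilities are assumed for illustration, not English data): the entropy is a property of the whole distribution, summed over every outcome, not of any single string.

```python
import math

def shannon_entropy(dist):
    """Entropy in bits of a probability distribution whose values sum to 1."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

# Toy per-letter distribution (assumed values, summing to 1.0):
letters = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}
print(shannon_entropy(letters))  # 1.75 bits per letter

# Distribution over ALL 3-letter strings, assuming independent draws.
# Its entropy is 3x the per-letter entropy -- a property of the whole
# set of 64 probabilities, not of any one string like "QXZ".
strings = {a + b + c: letters[a] * letters[b] * letters[c]
           for a in letters for b in letters for c in letters}
print(shannon_entropy(strings))  # 3 * 1.75 = 5.25 bits
```

Note that the entropy of the string distribution says nothing about how surprising one particular draw is; that is a different quantity (the surprisal of that outcome).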

Perhaps you should try Kolmogorov complexity if you want to deal with definite strings of letters.
 
Karl,

I don't know much about information theory, but I think the Shannon information content of "Q" in English text is simply - \log_2 P(Q). The formula you quote for H(P) gives the entropy of an "ensemble" (or distribution), e.g., the entropy of a randomly selected letter in English text.
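On that reading, the information content (surprisal) of the whole string, assuming the letters are independent, is just the sum of the per-letter surprisals. A sketch (the X and Z frequencies are assumed approximate values; only Q's is the one quoted in the question):

```python
import math

# Approximate English letter frequencies (assumed values; only Q's
# is the 0.0008606 quoted in the question).
freq = {"Q": 0.0008606, "X": 0.0015, "Z": 0.00074}

# Surprisal of one outcome: h(x) = -log2 P(x).
# For independent letters, surprisals add across the string.
surprisal = sum(-math.log2(freq[ch]) for ch in "QXZ")
print(f"{surprisal:.2f} bits")
```

This comes out to roughly 30 bits, i.e., the large number the original poster expected for a rare string, whereas -p \log_2 p for each letter is tiny because of the p factor.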

Reference: "Information Theory, Inference and Learning Algorithms" by MacKay (which is available for free download) http://www.inference.phy.cam.ac.uk/itila/
 