Limit of compression (source coding theorem)

gop
Hi

The source coding theorem says that one needs at least N*H bits, on average, to encode a message of N symbols from a source with entropy H bits per symbol. This is supposedly the theoretical limit of data compression.

But is it? Or does that limit only apply when all you know is the frequency (probability) of each individual symbol?

For example, if I know that the symbol x is always followed by the symbol y (or that this happens with very high probability), couldn't I use this to construct a compression algorithm that needs fewer than N*H bits?

thx
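For concreteness, here is a minimal sketch of the N*H bound for a memoryless (i.i.d.) source; the symbol probabilities below are made up for illustration.

Code:
import math

# Hypothetical i.i.d. source: each symbol is drawn independently
# with these (made-up) probabilities.
probs = {"x": 1/3, "y": 1/3, "z": 1/3}

# Shannon entropy in bits per symbol: H = -sum p * log2(p)
H = -sum(p * math.log2(p) for p in probs.values() if p > 0)

N = 1000  # message length in symbols
print(f"H = {H:.3f} bits/symbol")              # log2(3) ~= 1.585
print(f"Shannon bound N*H = {N * H:.0f} bits") # ~= 1585 bits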
 
gop said:
For example, If I know that the symbol x is always followed by the symbol y (or that this happens with a very high probability) couldn't I use this to construct a compression algorithm that needs fewer bits than N*H?

If x is (almost) always followed by y, then that lowers the entropy. It's already taken into account.
 
But if I have something like this:

p(x), p(y), p(z) = (1/3, 1/3, 1/3)

then I have entropy log2(3) ≈ 1.585 bits per symbol. But now I could have p(y|x) = 1/2 or p(y|x) = 1; as long as p(y|z) = 1 - p(y|x), the marginal probabilities p(x), p(y), p(z) stay the same, so I get the same entropy. But in the case p(y|x) = 1, y always follows x, so I could treat the pair xy as a single symbol and use only two symbols with p(xy), p(z) = (1/2, 1/2), which has entropy of only 1 bit.

I guess I'm missing something obvious here but...

thx
 
gop said:
I guess I'm missing something obvious here but...

Yes. Your formula only applies when the symbols are chosen independently; with dependencies between symbols you have to look at the entropy rate instead, which can be lower.
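To make this concrete, here is a sketch that computes the entropy rate for one Markov chain consistent with the example above. The transition matrix is an assumption (x is always followed by y, while y and z are each followed by x or z with probability 1/2), chosen so that the marginals stay at 1/3 each.

Code:
import numpy as np

# States in order x, y, z. Row i holds P(next symbol | current = i).
# Assumed transitions: x -> y always; y and z -> x or z with prob 1/2.
P = np.array([[0.0, 1.0, 0.0],   # from x
              [0.5, 0.0, 0.5],   # from y
              [0.5, 0.0, 0.5]])  # from z

# Stationary distribution: left eigenvector of P for eigenvalue 1.
vals, vecs = np.linalg.eig(P.T)
pi = np.real(vecs[:, np.isclose(vals, 1.0)][:, 0])
pi = pi / pi.sum()               # -> (1/3, 1/3, 1/3)

# Entropy rate of a stationary Markov source:
#   H = -sum_i pi_i * sum_j P_ij * log2(P_ij)
H_rate = -sum(pi[i] * P[i, j] * np.log2(P[i, j])
              for i in range(3) for j in range(3) if P[i, j] > 0)

print("stationary distribution:", pi)        # [1/3 1/3 1/3]
print(f"entropy rate = {H_rate:.3f} bits")   # 2/3 ~= 0.667 < log2(3) ~= 1.585

This agrees with the block argument in the previous post: the sequence is a stream of blocks xy and z, each occurring with probability 1/2 (1 bit per block), and the average block length is 1.5 symbols, giving 2/3 bit per symbol.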
 
OK, I got it: with conditional probabilities I need to use another source model and another way to compute the entropy, namely the entropy rate.

thx
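For reference, the formula this thread converges on is the entropy rate of a stationary Markov source, a standard result in information theory:

$$H = -\sum_i \pi_i \sum_j P_{ij} \log_2 P_{ij}$$

where π is the stationary distribution and P_ij are the transition probabilities. In the i.i.d. special case, where P_ij = p_j for every i, this reduces to the usual -∑ p_j log2 p_j.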
 