Efficient encoding

  • Thread starter lordy12
  • Start date
36
0

Main Question or Discussion Point

Say you are given a five-sided biased die that has a probability of 1/8 of coming up A, 1/8 for B, and 1/4 for each of C, D, and E. Design an optimal code for transmitting throws of this die.

log(5) = 2.32, so 2.32 bits are required to transfer one throw of the die. But if we encode as follows:

A: 000
B: 001
C: 01
D: 10
E: 11

Then the average number of bits for each throw is 2.25. My question is, can the 1's and 0's be placed anywhere so long as the number of numbers is the same? For example:

A:111
B:000
C:11
D:11
E:11

This representation still has the same number of bits for each letter as the previous representation. Am I wrong? Is this still a valid solution
 

Answers and Replies

695
2
The latter cannot be a valid encoding: if you see "11", you won't be able to discern if the case is C, D or E.
 
D H
Staff Emeritus
Science Advisor
Insights Author
15,329
681
log(5) = 2.32, so 2.32 bits are required to transfer one throw of the die.
That applies only for equiprobable events. With the first encoding, the expected number of bits to transmit a throw of the die is 2.4 if all events have probability 1/5.

My question is, can the 1's and 0's be placed anywhere so long as the number of numbers is the same?
Of course not. Your own example disproves this. So does this one:

A:000
B:001
C:00
D:01
E:11

With this scheme one cannot unambiguously transmit a sequence of dice throws. Is 000000 two throws of A or three throws of C?
 
quadraphonics
Then the average number of bits for each throw is 2.25. My question is, can the 1's and 0's be placed anywhere so long as the number of numbers is the same? For example:

A:111
B:000
C:11
D:11
E:11

This representation still has the same number of bits for each letter as the previous representation. Am I wrong? Is this still a valid solution
You will typically want what's called a "Prefix Code" for this type of problem. See http://en.wikipedia.org/wiki/Prefix_code . The idea with a prefix code is that no codeword is the "prefix" of any other; i.e., there is no codeword that is formed by adding digits to another codeword. This allows you to send strings of codewords and be able to parse them correctly. Consider this example:

A: 0
B: 10
C: 11

Which is a prefix code. Then, the string {A,B,C} becomes 01011, which can be uniquely decoded. By comparison, consider a non-prefix code:

A: 1
B: 11
C: 111

Then, {A,B,C} would be encoded as 111111, but when decoding, we can't tell whether this should be {A,A,A,A,A,A}, {A,B,A,B}, {B,B,B}, {C,C}, {A,B,C}, {C,B,A}, etc. So it's not terribly useful as a code.
 

Related Threads for: Efficient encoding

  • Last Post
Replies
5
Views
2K
  • Last Post
Replies
3
Views
1K
  • Last Post
Replies
10
Views
2K
  • Last Post
Replies
2
Views
2K
Replies
8
Views
2K
Replies
5
Views
1K
Replies
20
Views
1K
Top