Huffman coding of an 8-bit sequence

  • Thread starter: ydan87
  • Tags: Coding, Sequence
AI Thread Summary
The discussion centers on proving that Huffman coding is no more efficient than an ordinary 8-bit fixed-length code for a data file of 8-bit characters whose frequencies are nearly uniform, specifically when the maximum character frequency is less than twice the minimum. Under this condition every Huffman codeword turns out to be exactly 8 bits long, so Huffman coding offers no compression benefit; and since the fixed 8-bit code is itself a prefix code and Huffman coding is optimal among prefix codes, Huffman coding also never does worse than 8 bits per character on average. Examples in the thread explore slightly skewed distributions, such as one character being marginally more frequent than the others, and find that the resulting Huffman tree still yields no codes shorter than the fixed-length representation. Under these conditions, a fixed-length 8-bit code is therefore exactly as efficient as Huffman coding.
ydan87
A data file contains a sequence of 8-bit characters such that all 256 characters are about as common: the maximum character frequency is less than twice the minimum character frequency. Prove that Huffman coding in this case is not more efficient than using an ordinary 8-bit fixed-length code.

Any help please?
 
This isn't very clear: characters are "about as common" with "maximum frequency less than double minimum frequency". This could mean, for example, that 254 of the characters each have probability 1/256 = 3/768, while the remaining two have probabilities (4/3)(1/256) = 4/768 and (2/3)(1/256) = 2/768 (note this example sits right at the boundary, since 4/768 is exactly twice 2/768). Here it's pretty clear that Huffman coding gives no benefit: 254/256 of the time there is nothing compressible, and since the ordinary 8-bit code is itself a valid prefix code, Huffman coding can average no worse than 8 bits per character, but it does no better either. At the other extreme, 128 of the characters could be nearly twice as likely as the remaining 128, though that stretches the description "about as common". Still, what would the Huffman code be for that case?
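The claim can be checked empirically. Below is a minimal sketch (not from the thread; `huffman_code_lengths` is a helper written for illustration) that runs the standard Huffman merge over 256 symbols whose frequencies satisfy max < 2 × min, and confirms every codeword comes out exactly 8 bits long, so Huffman coding exactly ties the fixed-length code:

```python
import heapq

def huffman_code_lengths(freqs):
    """Run Huffman's algorithm on `freqs`; return {symbol: code length}."""
    # Heap entries are (weight, unique tiebreaker, depth map). The depth map
    # records how deep each symbol sits inside the subtree for that entry;
    # the tiebreaker keeps the dicts from ever being compared.
    heap = [(f, i, {i: 0}) for i, f in enumerate(freqs)]
    heapq.heapify(heap)
    counter = len(freqs)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)   # two lightest subtrees
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: d + 1 for s, d in d1.items()}   # merging adds one level
        merged.update({s: d + 1 for s, d in d2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

# 256 symbols that are "about as common": frequencies 100..198, and 198 < 2*100.
freqs = [100 + (i % 99) for i in range(256)]
lengths = huffman_code_lengths(freqs)
print(set(lengths.values()))   # every code length is exactly 8
```

The reason this happens: since every leaf weight is below 2 × min, any merged pair (weight at least 2 × min) is heavier than any remaining leaf, so the algorithm pairs up all 256 leaves before touching internal nodes; the resulting 128 nodes satisfy the same max < 2 × min invariant, so the argument repeats level by level and the tree comes out complete with depth 8 everywhere.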
 