Compression Conundrum: Why Does WinRAR Reach 84% at Best?

In summary, digits of a transcendental number stored in BCD compress to about 84% in WinRAR only because BCD itself is redundant; stored as plain binary, the same kind of data, like truly random data, does not compress at all.
  • #1
martix
I have discovered something weird that I want someone to explain to me.

The program in question is WinRAR; I haven't tested others. Here's the thing.

That program at its best is able to compress random data (a piece of some transcendental number) to exactly 84 percent.
However, should you choose to compress certain kinds of patterned data (like program executables), it rarely manages to go lower than 95%.
Why is that?

I'm also not sure where this topic would be more at home - Computer Science or Number Theory.

P.S. I forgot to specify the format of the data - it's in BCD. Don't know how much that is relevant.
 
  • #2
martix said:
I have discovered something weird that I want someone to explain to me.

The program in question is WinRAR; I haven't tested others. Here's the thing.

That program at its best is able to compress random data (a piece of some transcendental number) to exactly 84 percent.
However, should you choose to compress certain kinds of patterned data (like program executables), it rarely manages to go lower than 95%.
Why is that?

I must question how random the digits of a transcendental number are. Not only is truly random data incompressible, it will actually expand slightly in WinRAR and any other lossless compression scheme you can think of...

martix said:
I'm also not sure where this topic would be more at home - Computer Science or Number Theory.

I would say that this is the right forum...

martix said:
P.S. I forgot to specify the format of the data - it's in BCD. Don't know how much that is relevant.

It's very relevant. Would you be surprised to find that an ASCII string of a number compressed really well? If you read the Wikipedia page on BCD, you'll see that it's not a particularly efficient encoding scheme, so a compressor will find opportunities to remove redundancies...
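To put a rough number on that redundancy, here is a small sketch. It assumes the digits were stored as packed BCD (two decimal digits per byte, which the thread does not confirm), uses Python's zlib as a stand-in for WinRAR's own algorithm, and uses pseudo-random digits in place of the digits of the constant. Each 4-bit nibble can carry at most log2(10) ≈ 3.32 bits of information, so no lossless compressor should get such a file much below about 83% of its size.

```python
import math
import random
import zlib


def pack_bcd(digits):
    """Pack decimal digits two per byte, high nibble first (packed BCD)."""
    if len(digits) % 2:
        digits = digits + [0]  # pad to a whole number of bytes
    return bytes((hi << 4) | lo for hi, lo in zip(digits[0::2], digits[1::2]))


def compression_ratio(data, level=9):
    """Compressed size divided by original size (1.0 means no gain)."""
    return len(zlib.compress(data, level)) / len(data)


# Hypothetical stand-in for the digits of a transcendental number.
rng = random.Random(12345)
digits = [rng.randrange(10) for _ in range(400_000)]

bcd = pack_bcd(digits)
# Information per nibble divided by the bits a nibble occupies.
entropy_bound = math.log2(10) / 4

print(f"entropy bound for BCD: {entropy_bound:.3f}")
print(f"zlib ratio on BCD:     {compression_ratio(bcd):.3f}")
```

The measured ratio should land a little above the bound, in the same neighbourhood as the 84% reported in the first post.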
 
  • #3
I see...

Well I can see that there are loopholes in my scheme. :)
Did some more experiments. Properly this time. For one, I scrapped BCD. Turns out it was the culprit. :)
In the meantime, I tried a few other methods:
1. I got a bunch of data (64K) from random.org.
2. A 2^28 decimal digit PI (in hex).
3. A 2^24 decimal digit CATALAN (in hex).

Turns out none of these compressed to any degree :) All it did was add a few headers; the underlying data remained utterly untouched. I guess it's impossible to say how random 2 and 3 are, but they seem to work just as well in this case.
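That result is easy to reproduce. The sketch below is only an illustration: Python's standard zlib, bz2 and lzma modules stand in for WinRAR, and bytes from os.urandom stand in for the random.org sample (both are assumptions, not what was used above). Each compressor should hand back slightly more than it was given, since the payload cannot be shortened and only framing is added.

```python
import bz2
import lzma
import os
import zlib

SIZE = 64 * 1024  # same order as the 64K sample mentioned above

data = os.urandom(SIZE)  # stand-in for truly random input

# Compressed size in bytes for each standard-library compressor.
results = {
    "zlib": len(zlib.compress(data, 9)),
    "bz2": len(bz2.compress(data, 9)),
    "lzma": len(lzma.compress(data)),
}

for name, size in results.items():
    print(f"{name:5s} {size:7d} bytes ({size / SIZE:.4f} of original)")
```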
 
  • #4
In case you're interested, pi in hex is 3.243f6a8885a308d3 ... Here is a link to an 8MB file starting with the fractional part 243f6a88... (equivalent to about 19.265 million decimal digits).

http://rcgldr.net/misc/pi.bin
 
  • #5
Well, you have 19.265 million digits, I have 2^28 digits. Do the math :)
 
  • #6
martix said:
Well, you have 19.265 million digits, I have 2^28 digits.
Sorry, I misread your post. I wasn't sure what you meant by "in hex". So yes, a 111.465MB binary string is longer than an 8MB binary string. Update - corrected my post.
 
  • #7
It wasn't 32GB, it was 111465411 bytes (or 106.3MB) of binary data. :)
 
  • #8
martix said:
It wasn't 32GB, it was 111465411 bytes (or 106.3MB) of binary data.
I wasn't paying attention; I was thinking of 2^28 hex digits or 2^32 bits of data (32 giga-bits, not giga-bytes), which would be 536 mega-bytes. As you mentioned, 2^28 decimal digits would take 111,465,411 bytes (1 / log10(256) ~= 0.415241011861 bytes per decimal digit).
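The size arithmetic in this exchange can be checked in a few lines. This is just a sketch of the conversion quoted above: a decimal digit carries log2(10) bits, so it occupies 1 / log10(256) bytes when the number is stored as plain binary. The function name is made up for illustration.

```python
import math

BYTES_PER_DECIMAL_DIGIT = 1 / math.log10(256)  # ~0.415241


def bytes_for_decimal_digits(n_digits):
    """Bytes needed to store n_digits decimal digits as plain binary."""
    return math.ceil(n_digits * BYTES_PER_DECIMAL_DIGIT)


size = bytes_for_decimal_digits(2**28)
print(size)                    # 111465411
print(round(size / 2**20, 1))  # 106.3 (in units of 2^20 bytes)
```

Both figures agree with the 111,465,411 bytes (106.3MB) quoted in posts #7 and #8.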
 

FAQ: Compression Conundrum: Why Does WinRAR Reach 84% at Best?

What is WinRAR?

WinRAR is a file compression program that is used to reduce the size of files and folders, making them easier to store and transfer.

Why does WinRAR only reach 84% compression at best?

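WinRAR, like any lossless compressor, can only remove redundancy; it never discards data. In this thread the 84% figure came from the BCD encoding, which uses only 10 of the 16 possible values of each 4-bit digit. Once the same digits were stored as plain binary there was no redundancy left to remove, and the file did not shrink at all.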

Is WinRAR the best compression software available?

It is one of the most popular and widely used compression programs, but there are other software options available that may offer different features or levels of compression.

What types of files can be compressed using WinRAR?

WinRAR can compress a variety of file types, including documents, images, videos, and audio files. However, some file types may not compress as much as others due to their already compressed nature.

Can compressed files be opened and used without decompressing them?

No, compressed files need to be decompressed before they can be used. WinRAR and other compression programs have the ability to decompress files and restore them to their original size and format.
