Are Strings Kolmogorov-Random Actually Random?

AI Thread Summary
The discussion centers on the nature of Kolmogorov-random strings and their relationship to traditional randomness. It argues that while incompressible strings likely won't contain specific patterns, random strings can, suggesting that almost all strings will include any given post when sufficiently long. The concept of Kolmogorov randomness implies incompressibility, contrasting with the idea that a truly random string could be compressible. The conversation also highlights the ambiguity in defining "randomness" and how intuitive notions of probability may not align with formal definitions. Ultimately, the complexities of compression and randomness raise questions about the true nature of Kolmogorov-random strings.
Tiiba
Messages
53
Reaction score
0
Are strings that are Kolmogorov-random actually random in the usual sense?

An incompressible string probably won't contain this post anywhere in it, but a random one can.
 
Physics news on Phys.org
If you have a long string in which your post does not appear, you could compress it (very slightly) by using that fact. If your post has N characters, and the string (and post) are drawn from an alphabet of length A, it could be compressed by (100/A^N)%.* So if your post was "11" out of a binary alphabet, a string not containing it could be compressed by 25% by encoding in trinary: 0 = 00, 1 = 01, 2 = 10.

So I would actually expect your post to appear in a long-enough Kolmogorov-random string.

* Actually, depending on the post, probably a fair bit more, maybe as much as (100N/A^N)%.
 
Suppose your post was 171 characters long out of a 7-bit (128-character) alphabet. Then a Kolmogorov-random string of length 128171 = 2.15... x 10360 would be expected to have one copy of your post. :cool:
 
Tiiba said:
Are strings that are Kolmogorov-random actually random in the usual sense?
What is the "usual sense"? A Kolmogorov-random string certainly isn't a probability measure on a topological space -- and most peoples' intuitive notion of probability is notoriously ill-defined.


An incompressible string probably won't contain this post anywhere in it, but a random one can.
On the contrary, "almost all" strings contain a copy of your post -- therefore there exists a length L such that every incompressible string longer than L contains a copy of your post.
 
Last edited:
By "random in the usual sense" I mean that any substring - including a trillion repetitions of the letter "g" - can appear in the string.

As I understand, Kolmogorov randomness is produced by simply compressing a non-random string. But if I have a random string generator, I could create a program that works as follows:

1 Read a string from the generator
2 Compress the chunk
3 If the chunk gets bigger due to the compression algorithm, place it in the output as is and prepend it with a 0; If it proves compressible, place the compressed version in the output and prepend it with a 1
4 If the amount of space saved through compression exceeds the size of the decompressor, halt.

So a truly random string would be compressible. But a Kolmogorov-random string is incompressible by definition. So is my logic wrong, or did I misunderstand a definition somewhere?
 
Tiiba said:
1 Read a string from the generator
2 Compress the chunk
3 If the chunk gets bigger due to the compression algorithm, place it in the output as is and prepend it with a 0; If it proves compressible, place the compressed version in the output and prepend it with a 1
4 If the amount of space saved through compression exceeds the size of the decompressor, halt.

So a truly random string would be compressible.

I think you'll find that the average of min(size(chunk), size(compress(chunk))) + 1 bit is actually greater than size(chunk) for uniformly random bitstrings and any compression algorithm, even though min(size(chunk), size(compress(chunk))) will of course be no larger than size(chunk).
 
Okay... Then I do this:

1) Place the whole random string in a file F.
2) Define a header of length A that instructs the decompressor to decompress the following string and insert the result at its original position.
3) Cut out all detected compressible chunks from F and place them in another file (as long as the compression saves more space than the header uses).
4) If the amount of space saved through compression exceeds the size of the decompressor, halt.

This way, there is no overhead for the incompressible chunks.
 
Back
Top