A question on the interpretation of randomness in algorithmic information theory

In summary: Entropy is a measure of how much disorder or unpredictability a system exhibits. In the case of information, entropy is a measure of how much information a message carries. So if you compress an encyclopedia with a compression algorithm, you pack the same information into fewer bits, raising the entropy per bit, but the bits that make up the compressed encyclopedia would still not be truly random.
  • #1
technodai
Hi everyone, I'd like to start by saying that I have no background or knowledge of maths other than high school. As a consequence, I have no idea if this question is stupid, trivial, nonsensical or whatever. I also don't even know if it's in the correct thread (I made the best guess I could). So I should probably apologise straight away!

My interest in maths has been unexpectedly rekindled, for reasons that are difficult to explain. The nature of random data has come up in my reading, and the question below is just one of the many I have on the issue. It's also quite long, so again, I apologise. Thanks for your time.

As I understand it, algorithmic information theory tells us that:

The 3000 page encyclopaedia contains less information than 3000 pages of random letters

This is illustrated by the fact that a description of the string of random characters would need to be at least as long as the string itself (random data cannot be compressed). The encyclopaedia, however, since it is composed of words from a language with its own rules and structure, could be described more briefly. As an example of one way this might be achieved, we can imagine replacing every instance of the word 'the' with a single symbol, thereby removing the need for two of the three characters required to express the word itself.
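To make that concrete, here is a minimal sketch (my own, not from the thread) using Python's standard zlib module. It compresses a repetitive English-like sample and an equal-length string of pseudo-random letters; the sample strings are invented purely for illustration. The structured text typically shrinks dramatically, while the random letters barely compress at all.

```python
# Minimal sketch: compressibility of structured text vs random letters.
import random
import string
import zlib

# Structured text: English-like, with lots of repeated words such as 'the'.
structured = ("the quick brown fox jumps over the lazy dog and the dog "
              "sleeps under the tree while the fox watches the road ") * 200

# 'Random' text: letters drawn uniformly at (pseudo-)random, same length.
random.seed(0)
unstructured = "".join(random.choice(string.ascii_lowercase + " ")
                       for _ in range(len(structured)))

for label, text in [("structured", structured), ("random", unstructured)]:
    raw = text.encode("utf-8")
    packed = zlib.compress(raw, 9)
    print(f"{label:10s}  original: {len(raw):6d} bytes  "
          f"compressed: {len(packed):6d} bytes  "
          f"ratio: {len(packed) / len(raw):.2f}")

# Typical outcome: the structured text compresses to a small fraction of its
# size, while the random letters stay close to a ratio of 1.
```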

This does of course seem counterintuitive, since the encyclopaedia contains more 'useful' data. Of course, how 'useful' data is depends upon human values and cannot be quantified objectively here. In addition to this objection, though, there is our normal understanding of what it means for information to be truly random. To be truly random, information must contain no information (?). If this is the case, then we can assume that no matter how we choose to rearrange the information as a whole, it will make no discernible difference to the meaning of the information. It is said to be in a state of maximum entropy: the information is in a state of total disorder.

Black holes are also an example of maximum entropy, since any rearrangement of the matter behind the event horizon makes absolutely no difference to the appearance of the black hole. In fact, since it is impossible to ever receive information from inside a black hole, we say that the black hole contains no information. A black hole therefore represents the maximum amount of entropy that can exist in a given region of space. This is further illustrated by the fact that if we add more entropy to the system, the black hole must get larger.

Now, if we compare this to our above description of the 3000 pages of random letters, we find a similarity. If we add more random letters, the description of those random letters must grow accordingly. And yet the characters still contain no information! So then, if the shortest possible description of the characters is the characters themselves, then the description itself contains no information. Would a better description of the random characters in fact be an algorithm that generates a string of random digits, regardless of whether they are the same digits as in the first random string?
 
  • #2
Consider a text file containing the encyclopedia. Now compress that file with, e.g., gzip. The resulting file has the whole encyclopedia but in fewer bits, so it has more information per bit. If you look at the compressed bits you'll see that they look essentially random.

For all you know, that large 'random' string is just the compressed form of a larger encyclopedia... right?
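One rough way to see this point numerically is to estimate the byte-level Shannon entropy of some plain text before and after zlib compression (a sketch of my own, using only the Python standard library; the sample text is made up). The compressed bytes come out close to the 8 bits/byte of uniformly random data, so by this crude statistical test they are hard to tell apart from random, even though they decode back to the original text.

```python
# Rough sketch: byte-frequency (order-0 Shannon) entropy before and after
# compression. Compressed output looks statistically close to random bytes.
import math
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte, estimated from byte frequencies."""
    counts = Counter(data)
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Any reasonably long piece of English-like text will do; this one is made up.
text = ("an encyclopaedia entry repeats many of the same words, letters "
        "and phrases, which is exactly what a compressor exploits ") * 500
raw = text.encode("utf-8")
packed = zlib.compress(raw, 9)

print(f"plain text : {byte_entropy(raw):.2f} bits/byte")    # roughly 4
print(f"compressed : {byte_entropy(packed):.2f} bits/byte") # close to 8
print("maximum possible: 8.00 bits/byte (uniformly random bytes)")
```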
 
  • #3
Okay, I think I get that so far. However, the bits that make up the compressed encyclopaedia wouldn't actually be random, even if they gave the appearance of being so. Similarly, the random string, if it were the compressed form of a larger encyclopaedia, would therefore cease to be random!

Does this say more about our inability to tell if a string is truly random than it does about whether any random string is equivalent to any other (of the same length)? Could the random string be effectively described by the random string generating algorithm?
 
  • #4
technodai said:
Okay I think I get that so far. However, the bits that make up the compressed encyclopaedia wouldn't actually be random, even if they gave the appearance of being so.

How could you possibly tell? I mean, if you try to unzip it with a gzip decompressor (gunzip) and it fails, you know it's not gzipped. But what if it was CRGreathouse-encoded?

But all this is a bit afield from the original question. What's being measured is entropy, not usefulness, and a random string has maximal entropy. This follows fairly quickly from the definition of information entropy.
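For reference, the step being alluded to can be written out (standard textbook material, added here rather than taken from the thread): with Shannon's definition, a source whose symbols are all equally likely attains the largest possible entropy.

```latex
% Shannon entropy of a source emitting symbol i with probability p_i:
H = -\sum_{i=1}^{n} p_i \log_2 p_i
% By Gibbs' inequality, H \le \log_2 n, with equality exactly when every
% p_i = 1/n, i.e. when each symbol is equally likely -- the maximally
% random case.
```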
 
  • #5


I can understand your curiosity and interest in algorithmic information theory. It is a complex and fascinating field that deals with the fundamental aspects of information and randomness.

To answer your question, it is important to first clarify the concept of randomness in algorithmic information theory. In this theory, randomness does not mean the absence of information, but rather the lack of patterns or regularities in the data. This means that a string of random characters can still contain a lot of information, but it cannot be compressed or described by any algorithm shorter than the string itself.

In the example of the 3000 page encyclopaedia and 3000 pages of random letters, the encyclopaedia contains less algorithmic information precisely because it follows a specific structure and rules. That structure allows a shorter algorithm to describe it, whereas the random letters would require an algorithm at least as long, or simply the characters themselves, to be described.

You are correct in stating that the perception of usefulness of information is subjective and cannot be quantified objectively. However, in algorithmic information theory, the focus is on the objective measure of information, not its perceived usefulness.

As for your comparison to black holes and maximum entropy, it is important to note that algorithmic information theory assigns a complexity to individual discrete strings, whereas the entropy of a black hole is a thermodynamic quantity of a physical system. The two notions are related in spirit, but the analogy is loose rather than exact, so the comparison should be made with care.

In summary, the interpretation of randomness in algorithmic information theory is not based on the absence of information, but rather the lack of patterns or regularities. Random data can still contain a lot of information, but it cannot be compressed or described by a shorter algorithm. I hope this helps clarify your question.
 

1. What is algorithmic information theory?

Algorithmic information theory is a branch of theoretical computer science and mathematics that studies the amount of information in a given object or system. It focuses on the concept of randomness and how it relates to the length of the shortest algorithmic description of an object.

2. What is the interpretation of randomness in algorithmic information theory?

The interpretation of randomness in algorithmic information theory is that randomness is a measure of the unpredictability or lack of pattern in a given object or system. It is often quantified using measures such as Kolmogorov complexity or algorithmic entropy.
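In symbols (the standard definition, added here for reference), the Kolmogorov complexity of a string x with respect to a fixed universal machine U is the length of the shortest program that prints x:

```latex
K_U(x) = \min \{\, |p| : U(p) = x \,\}
% A string x of length n is called algorithmically random when
% K_U(x) is close to n, i.e. no program appreciably shorter than x
% produces it.
```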

3. How is randomness related to complexity in algorithmic information theory?

In algorithmic information theory, randomness is closely related to the complexity of objects or systems. A higher level of randomness typically corresponds to a higher level of complexity, as it is more difficult to describe or predict a random object or system using a finite set of instructions or rules.
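A small illustrative sketch of that relationship (my own example, not from the thread): a highly regular million-character string has a tiny "description" in the form of a short program, while for a string of random letters the best description we can exhibit is, in general, the string itself.

```python
# Illustration of the regular-vs-random contrast behind Kolmogorov complexity.
import random
import string

# A very regular string: the short expression "'ab' * 500_000" describes it fully.
regular = "ab" * 500_000
short_description = "'ab' * 500_000"

# A 'random' string: we know of no description shorter than the data itself.
random.seed(42)
messy = "".join(random.choice(string.ascii_lowercase) for _ in range(1_000_000))

print(len(regular), "characters, describable in", len(short_description), "characters")
print(len(messy), "characters, shortest known description ~", len(messy), "characters")

# Caveat: because 'messy' came from a seeded pseudo-random generator, it too
# has a short description (this very program!). Genuinely incompressible
# strings exist, but no algorithm can certify a given string as one.
```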

4. What are some applications of algorithmic information theory?

Algorithmic information theory has applications in various fields such as computer science, mathematics, and physics. It is used in data compression, machine learning, cryptography, and understanding the concept of randomness in natural systems.

5. How is algorithmic information theory different from other theories of randomness?

Algorithmic information theory differs from other theories of randomness in that it focuses on the informational content and complexity of objects or systems, rather than just their probability or statistical properties. It also takes into account the role of algorithms and the ability to compress or describe an object in a finite way.
