Definition of randomness (sidebar: truth is random)

0rthodontist
I think that's a reasonable definition of compressability for infinite strings--meaning, "infinitely compressible," or compressible to a finite amount of information. It doesn't work in other cases. For example, let j be an infinite bit string that anyone would agree is totally random (like your bit string from atom decay). Let k be an infinite binary string formed by replacing each 0 in j with 000, and replacing each 1 in j by 111. No program of finite length can print k. But k does not seem as random as j--it seems only about one third as random.

Another definition I can think of is that an infinite string is compressible if every finite prefix above a certain minimum length is compressible. Even though the summary of an infinite string might also be infinite, you can read the first n digits of the summary and derive more than the first n digits of the infinite string. So one compression of the first 3 * n digits of k is the first n digits of j, and reading the first n digits of the summary allows you to extrapolate the first 3 * n digits of the original string k, so k is "compressible" and not fully random.

I think that a compression scheme of randomness needs to take into account three things when deciding the randomness of some given data:
1. The data itself
2. Any context for the data (as in the dice example, where the context is the starting velocities and physical properties of the dice). Data + context = the "real" data set that the machine will try to compress, and statements about randomness can be made about data + context considered together, not about data alone.
3. The machine itself. What one compression machine is able to compress is not necessarily compressible by another compression machine. For example, let's say we have two compression algorithms, one designed to run on images and the other designed to run on English text. Data for an image seems mostly random to the English text compressor, and data for English text seems mostly random to the image compressor. I don't think this can be turned into a question of "does there exist a machine that can compress this" because the answer for finite data is always yes, there is always a machine that can compress all the data into one bit as you have mentioned earlier. I also don't think it can be profitably turned into a question of the existence of a machine to distinguish between many different data sets of the same type, because of the question of how you define what that type is. It has to be relative to the machine. Making it relative to the type that the data sets are considered part of just exchanges one relativeness for another, and it seems more natural (to me) to talk about the machine.

Statements about randomness of data are then really statements about randomness of (data + context) for a particular machine.

Furthermore, I don't think your definition is adequate. Maybe it's impossible to summarize that data in shorter form if you live in a vacuum... but it might become possible if you're allowed to refer to the initial conditions. If so, then I don't think the data could qualify as being "random".
I'm not sure what data you are talking about--the dice? Anyway, I may have addressed this point depending on what you mean.

Hurkyl
Staff Emeritus
Gold Member
Hrm, that suggests two ideas to me.

The first would be some sort of asymptotic descriptive complexity -- you would define the asymptotic descriptive complexity of an infinite string to be the sequence whose n-th term is the descriptive complexity of its n-long initial segment.

Then, if the original sequence had complexity ~ f(n)
Let k be an infinite binary string formed by replacing each 0 in j with 000, and replacing each 1 in j by 111.
this sequence ought to have asymptotic descriptive complexity ~ f(n)/3.

I suspect this is immune to the usual ambiguity of descriptive complexity for finite strings.

3. The machine itself. What one compression machine is able to compress is not necessarily compressible by another compression machine. For example, let's say we have two compression algorithms, one designed to run on images and the other designed to run on English text.
I don't think we have to worry about that one. For finite strings, we know the "best" compression scheme: you encode your string s as <M, w> where M is a turing machine, and M(w) = s. The decompressor is simply a universal turing machine that reads <M, w> and runs M on w.

Any other scheme (for which decompression is computable) can only ever be better by some fixed additive constant... but fixed additive constants are irrelevant for our purposes!

The idea of the proof is that running your decompressor M on the compressed data w is the same as running the universal turing machine on <M, w>.

The other idea is to define a pre-ordering on strings. We define s <= t if and only if there exists a turing machine M that can print the first n symbols of s given the first n symbols of t as input. (In other words, there is enough information in the first n symbols of t to compute the first n symbols of s) Note that the same machine has to work for all n. Alternatively, s <= t "means" that t is "more random" than s. I wonder if this pre-order has any maximal elements? I wonder under what conditions two strings can compare equal!

0rthodontist
Hurkyl said:
Hrm, that suggests two ideas to me.

The first would be some sort of asymptotic descriptive complexity -- you would define the asymptotic descriptive complexity of an infinite string to be the sequence whose n-th term is the descriptive complexity of its n-long initial segment.

Then, if the original sequence had complexity ~ f(n)

this sequence ought to have asymptotic descriptive complexity ~ f(n)/3.

I suspect this is immune to the usual ambiguity of descriptive complexity for finite strings.
I don't know about descriptive complexity. I was thinking more about compression, the ratio of the compressed form of the string to the length of the string, generalized to the asymptotic case. For j that ratio would be 1, for k that ratio would be 1/3.

I don't think we have to worry about that one. For finite strings, we know the "best" compression scheme: you encode your string s as <M, w> where M is a turing machine, and M(w) = s. The decompressor is simply a universal turing machine that reads <M, w> and runs M on w.

Any other scheme (for which decompression is computable) can only ever be better by some fixed additive constant... but fixed additive constants are irrelevant for our purposes!
For infinite strings a fixed constant (I assume you mean the length of M) is irrelevant, but I am also concerned with finite strings where a difference of 1000 bits for your encoder could mean the difference between compression and no compression (or "anti compression"). So for that reason it makes more sense to me to take M as a premise and then say the encoding is w. Because the length of M really should not matter for the randomness of the string. Besides, might not an unconventional program have infinite size?

The idea of the proof is that running your decompressor M on the compressed data w is the same as running the universal turing machine on <M, w>.

The other idea is to define a pre-ordering on strings. We define s <= t if and only if there exists a turing machine M that can print the first n symbols of s given the first n symbols of t as input. (In other words, there is enough information in the first n symbols of t to compute the first n symbols of s) Note that the same machine has to work for all n. Alternatively, s <= t "means" that t is "more random" than s. I wonder if this pre-order has any maximal elements? I wonder under what conditions two strings can compare equal!
I don't think that is the best ordering because let s be a string like j (totally random) and let t be s except that every block of 2 bits is reversed. So if s started (spaces used only for clarity)
01 01 01 01 11 01 00 10 01 then t would start
10 10 10 10 11 10 00 01 10

I think these strings should be considered to have equal randomness. But given any odd number of bits n in s, you can only compute the first n - 1 bits in t, and vice versa. I think a better definition is that s <= t iff the first n bits of t allow you to compute the first n - k bits of s, where k is some fixed constant and n is sufficiently large.

Though also, here's one to ponder. Let s be as before and let t be the string produced by first reversing the first two bits of s, then reversing the next 6 bits of s and appending them to the first two, then reversing the next 18 bits of s and appending them to the first 8, and so on, always multiplying by 3 the number of bits to reverse. Then it is not true that s<=t or t<=s even by my definition, yet they both seem equally random. And let u be the string produced by replacing every 0 in t by 00, and every 1 in t by 11. Then it is not true that s<=u or u<=s even by my definition, though obviously u is less random than s. So maybe we need another definition of <=.

You could say that t <= s iff comp(t) <= comp(s) where comp is the compression ratio. That is, let w(t, n) be the shortest encoding of the first n bits of t on a universal Turing machine. If lim(n->inf) (w(t, n)/n) exists and equals k, then comp(t) = k.

I guess I agree that for infinite strings it's better to ignore the machine and consider it part of the encoding so long as the machine is finite, or in other words to consider randomness only with respect to a universal machine.

Intuitively j would be a maximal element but maybe it is not possible for a string like j to exist in mathematics. In fact it seems like perhaps all infinite strings that can be defined are infinitely compressible. Letting j have noncomputable bits does not strike me as a reasonable approach. Noncomputable bits should be excluded from strings under consideration.

Anyway I am really more concerned with finite strings, since all data in the real world is finite.

But for infinite strings, here is how I imagine an infinite random string. A box is discovered with two lights on it, red and green. All attempts to break into the box fail; it is impervious to everything. The lights alternately flash, but no pattern can be detected in which light flashes at a given time. Every computer designed to compress any finite substring of the flashing lights pattern finds that for every substring other than the one it was designed for, its compression is on average nil. The nth light in the flashing lights sequence can be determined either by consulting the records or waiting for flash #n, but no computer without access to either the future or the past that contains that particular flash does better than chance at predicting what it is. And the same goes for any substring of the flashing lights sequence--every finite substring either is already known or can be known by waiting a while, but without access to that substring nothing can do better than chance at predicting what it is.

This box does not seem compatible with noncomputable bits, because each future bit _can be known_ just by waiting long enough, it just _can't be computed_ before then. Randomness must be knowable, but not computable.

Last edited:
0rthodontist
Well, that looks relevant but unfortunately I'm still clueless with that type of math. Maybe once I finish my course on language theory this semester I'll be able to discuss that more intelligently.

Here's a definition that I think captures your earlier idea of relative randomness, without having to compare compression ratios. (by the way I was extensively editing my post last night so I may have added stuff after you last read it--the last edit was 10:00 PM US Eastern time)

Given two infinite strings s and t, s <= t iff there exists some infinite sequence K = {n1, n2, ... } of natural numbers and a constant C, such if n $$\in$$ K then given the prefix of t of length n, it is possible to compute n - C elements of s along with their positions in s, though these elements need not be consecutive.

By this definition, I can't think of any examples where two strings s, t are derived from each other with apparent "equal randomness" without having s <= t and t <= s.

Last edited:
Hurkyl
Staff Emeritus
Gold Member
Given two infinite strings s and t, s <= t iff there exists some infinite sequence K = {n1, n2, ... } of natural numbers and a constant C, such if n in K then given the prefix of t of length n, it is possible to compute n - C elements of s along with their positions in s, though these elements need not be consecutive.
There's a problem with this one, though.

Let s = {0, 0, 0, 0, ...} and t = {0, r1, 0, r2, 0, r3, ... } where the ri are a "random" sequence.

s and t would compare equal according to this definition, since I can always compute n bit & position pairs of one string given the first n of the other.

What brought this on was another thread on another board. I'm looking for a definition of random I can actually apply to individual sequences. He's arguing that the actual definition of random is "unpredictable" or "uncaused" and that the two are equivalent. I pointed out that you can have unpredictable with cause, so they're not equivalent. Uncaused, I went on to say, is hard (for me at least) to define mathematically so I will stick to "unpredictable." Predictable to me means compressibility in your Kolmogorov sense. What I'm ultimately trying to figure out is whether quantum events are truly random, and maybe uncaused. To get anywhere I have to have a definition of random. I'm trying to decide if "God plays dice." Of course one wonders where the "dice" (i.e., quarks and leptons and such) came from...

Hurkyl
Staff Emeritus
Gold Member
There's an entirely different approach you can take to quantum randomness -- take modern probability theory seriously. Stop trying to think of experiments having outcomes: they're merely random variables. The dice never actually took on a value, although we would have facts such as

P((I saw a 1) or (I saw a 2) or ... or (I saw a 6)) = 1

and

P(you saw a 3 | I saw a 3) = 1

This is related to the many worlds interpretation (MWI), but I get the impression that the two aren't synonymous.

0rthodontist
Given two infinite strings s and t, s <= t iff there exists some infinite sequence K = {n1, n2, ... } of natural numbers and a constant C, such if n $$\in$$ K then given the prefix of t of length n, it is possible to compute n - C elements of s along with their positions in s, though these elements need not be consecutive.
Hurkyl said:
There's a problem with this one, though.

Let s = {0, 0, 0, 0, ...} and t = {0, r1, 0, r2, 0, r3, ... } where the ri are a "random" sequence.

s and t would compare equal according to this definition, since I can always compute n bit & position pairs of one string given the first n of the other.
You're right... it can be patched by declaring the additional condition that for a given m, it is possible to choose an n large enough from K so that you can compute the first m digits of s given the first n digits of k. But I have no good reason for making that change.

The situation I made that definition to try to deal with was this: let s = {r1, r2, r3, ...} and let u be defined by {r1, r2, r4, r8, r16, ...}. Let v = the sequence of bits in s other than those in u. Now define t by the following:
i is even => ti = u(i/2) (edit from ui)
i is odd => ti = v((i+1)/2) (edit from vi)

t and s, it seems to me, should compare equal. But how to define it so it works that way?

Here is a definition of a totally random string. s is a totally random string iff for any n there exists a Turing machine M with an infinite output tape that can compute the first n bits of s on input 0, but for any such machine M there exists a number m (and m > n) such that the m'th letter on the output tape of M on input 0 does not agree with the m'th letter in s.

Last edited:
Hurkyl
Staff Emeritus
Gold Member
Maybe they shouldn't compare equal! You've rearranged the terms, and maybe that rearrangement conveys some information? (or destroys some?)

Maybe it's similar to the problem of the information in concatenated strings: for example, for any scheme K and integer c, there exists (finite) strings x and y such that K(xy) > K(x) + K(y) + c.

Maybe we should try postulating the properties we would like our notion of complexity to have, and then trying to work out if there's any way to attain it!

0rthodontist
Ah, I actually didn't say what I meant there--I edited so it now reads how I intended it.

I don't think the rearrangement conveys or destroys any information because it doesn't depend on the bits in s--if I rearranged the terms to, for example, order them, then I would be reducing the amount of information, but since the rearrangement here is fixed from the start and you CAN directly derive n bits in s from n bits in t and vice versa, I think they should be looked at as the same. The bits are totally random, and s and t both have compression ratios 1.

Hurkyl
Staff Emeritus
Gold Member
Here is a definition of a totally random string. s is a totally random string iff for any n there exists a Turing machine M with an infinite output tape that can compute the first n bits of s on input 0, but for any such machine M there exists a number m (and m > n) such that the m'th letter on the output tape of M on input 0 does not agree with the m'th letter in s.
This is equivalent to computability!

Theorem: the following are equivalent:
(1) s is not a totally random string.
(2) There exists a turing machine M with an auxiliary output tape that correctly prints the letters of s in order. (on any input)
(3) There exists a turing machine M such that M(n) is the n-th letter of s.

I imagine you're right about rearrangement -- as long as the rearrangement is computable. But to be honest, I would have thought K(xy) = O(K(x) + K(y)) at first too.

Last edited:
0rthodontist
(by the way after thinking about this for a bit I think that a better name for what I defined is a "random" rather than "totally random" string because the compression ratio for what I defined can be less than 1).

No, randomness as I defined it is not equivalent to noncomputability, though. Let t be an infinite binary string and let s = kt, where k is the number, printed in binary, that is the longest string that a terminating turing machine of size 20 can print. (did I get that right? I mean for k to be a finite noncomputable number). Then s is noncomputable but it is also not random as I defined it because there does not exist any machine that can compute the first |k| symbols of s.

Last edited:
Hurkyl
Staff Emeritus
Gold Member
There is no such thing as a noncomputable integer. For every integer n, there exists a turing machine M such that M(w) = n for any input w.

It's the function that's noncomputable: there does not exist a turing machine M such that, for every integer n:

M(n) = the longest string that a terminating turing machine of size n can print

Last edited:
0rthodontist
Is there already a formal concept of a "knowable" infinite string, by which I mean a string for which any prefix can be listed explicitly and a proof exists that the listing is the prefix? I guess that such an idea would both have to refer to definitions of strings, rather than strings themselves.

Last edited:
AKG
Homework Helper
One thing worth noting is that you've not really shown that the set of tautologies is undefineable. You've given a 1-1 correspondence between tautologies and a set that is undefinable. Actually, I'm not sure that you've shown this set to be undefinable, you just say that it is Tarski's theorem. What exactly does Tarski's theorem say, and what more do you say? Note that since there exists an undefinable subset of the naturals, and since any finite set is definable, there exists an uncountable undefinable subset of the naturals. The naturals themselves form a definable uncountable subset. So, the naturals can be put into a 1-1 correspondence with some undefinable set, but this says nothing about the definability of the naturals.

Also, you don't want something like f(n) = 1. A set is definable if there is some wff f(n) of one free variable (n) such that f(a) holds true in the standard model iff a is in the set. So the set {1} can be defined by the wff n=1, or more precisely, n=S0, where S is the successor function in the standard model.

0rthodontist