# What's the meaning of "random" in Mathematics?


#### FactChecker

Gold Member
2018 Award
Another example of a generator that will (eventually, and very very very slowly) not repeat is this one, for say 8-bit integers (or whatever size integers we choose):

Code:
initialize a[i] = 0 for i in 1..N
repeat
    calculate randomness of a[]
    for i = 1..N, for j = 1..8:
        invert bit j of a[i]
        recalculate randomness of a[]
        if the new sequence has lower randomness, revert the bit
until randomness of a[] > desired-target
That algorithm (a brute-force search, really) will eventually generate a non-repeating sequence of whatever length we like that passes whatever randomness criteria we set... at the cost of taking ages to run... hmm... actually this is probably the slowest random number generator ever... I should get a prize or something for that.
It is notoriously difficult to design an algorithm for a random number generator. Anything that one thinks will be random is very likely to display patterns that are clear to see when looked at the right way. Your program looks like an example of that. A classic book which contains a lot on the subject is Knuth, "Art of Computer Programming, Volume 2: Seminumerical Algorithms" .
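For concreteness, the quoted brute-force search can be sketched as runnable code. The original doesn't say how "randomness of a[]" is measured, so zlib's compressed output length stands in as a crude score here (an assumption, not part of the original algorithm):

```python
import zlib

def randomness_score(buf: bytes) -> int:
    # compressed length as a crude stand-in for "randomness":
    # patterned data compresses well, pattern-free data does not
    return len(zlib.compress(buf, 9))

def hill_climb(n_bytes: int = 16) -> bytearray:
    a = bytearray(n_bytes)                # start from all zeros
    score = randomness_score(bytes(a))
    improved = True
    while improved:                       # keep sweeping until no flip helps
        improved = False
        for i in range(n_bytes):
            for j in range(8):
                a[i] ^= 1 << j            # tentatively invert bit j of byte i
                new = randomness_score(bytes(a))
                if new > score:
                    score, improved = new, True
                else:
                    a[i] ^= 1 << j        # revert: the flip did not help
    return a
```

Each bit flip is kept only if it makes the buffer compress worse, i.e. look "more random"; the search stops when no single flip improves the score, which also shows just how slow this generator is.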

#### PAllen

It is notoriously difficult to design an algorithm for a random number generator. Anything that one thinks will be random is very likely to display patterns that are clear to see when looked at the right way. Your program looks like an example of that. A classic book which contains a lot on the subject is Knuth, "Art of Computer Programming, Volume 2: Seminumerical Algorithms" .
As an example of the notorious difficulty, all of the touted best algorithms in the 1981 edition of this book are now considered inferior, obsolete algorithms as randomness testing improved, and uses became more sensitive to multidimensional correlations (which plagued the most touted algorithms of the 1981 edition). I do not know if there is a more updated edition; the 1981 is the only one I have.

However I still strongly recommend it as the only book I know of to develop a robust theory of pseudo-randomness in a sequence.

#### FactChecker

Gold Member
2018 Award
There are entire texts now on random number generators that, I assume, are more modern and comprehensive. Unfortunately, I am not up to date on any new developments.

#### Auto-Didact

I can, tho, think of genuinely random variables when, e.g., using a pendulum oscillating between (more than 1) magnets and seeing where it settles.
This screams deterministic chaos to me, i.e. completely non-random but merely unpredictable for all practical purposes (FAPP) due to limited measurement precision, firstly for practical reasons and secondly because of the uncertainty principle in QM.

Another popular example of deterministic chaos in which the pattern seems random is the Lorenz water wheel, where it quickly becomes impossible to predict which side the wheel will turn or how it began turning, yet its complete dynamics is fully described by an attractor in state space.

#### FactChecker

Gold Member
2018 Award
As an example of the notorious difficulty, all of the touted best algorithms in the 1981 edition of this book are now considered inferior, obsolete algorithms as randomness testing improved, and uses became more sensitive to multidimensional correlations (which plagued the most touted algorithms of the 1981 edition). I do not know if there is a more updated edition; the 1981 is the only one I have.

However I still strongly recommend it as the only book I know of to develop a robust theory of pseudo-randomness in a sequence.
The algorithms of that time would show high autocorrelations at particular high lag amounts. The use of Box-Jenkins time series analysis implemented on computers easily identified weaknesses that would otherwise not be noticed. Those algorithms were perfectly adequate for most uses in simulation, but not if very sophisticated statistical analysis of the results was to be done. I felt that the weaknesses were noticed in the academic world but usually did not matter in real-world applications.

#### PAllen

The algorithms of that time would show high autocorrelations at particular high lag amounts. The use of Box-Jenkins time series analysis implemented on computers easily identified weaknesses that would otherwise not be noticed. Those algorithms were perfectly adequate for most uses in simulation, but not if very sophisticated statistical analysis of the results was to be done. I felt that the weaknesses were noticed in the academic world but usually did not matter in real-world applications.
For most applications, yes. However, a former physics professor of mine bumped hard into the fundamental limits of linear congruential generators doing Monte Carlo simulation in a high-dimensional phase space.

#### jedishrfu

Mentor
Years ago when I was a newbie programmer, a senior programmer told me the story of one such pseudo random generator program.

It was a new and better algorithm that looked totally random according to initial tests.

However, it wasn't until the random sequence elements were grouped into triplets and plotted in 3D that it was discovered that they all fell on a small number of parallel planes.
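The generator in this story is plausibly IBM's RANDU (an assumption; the post doesn't name it), whose defect can be verified directly: because the multiplier is 65539 = 2^16 + 3, consecutive triples satisfy an exact linear relation modulo 2^31, which is why 3D plots of triplets collapse onto a few parallel planes.

```python
def randu(seed: int, n: int) -> list:
    """IBM's RANDU generator: x <- 65539 * x mod 2**31 (use an odd seed)."""
    x, out = seed, []
    for _ in range(n):
        x = (65539 * x) % 2**31
        out.append(x)
    return out

# (2**16 + 3)**2 = 2**32 + 6*2**16 + 9 ≡ 6*(2**16 + 3) - 9 (mod 2**31),
# hence x[k+2] = 6*x[k+1] - 9*x[k] (mod 2**31) for every k:
xs = randu(1, 1000)
assert all((xs[k+2] - 6*xs[k+1] + 9*xs[k]) % 2**31 == 0 for k in range(998))
```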

#### StatGuy2000

I'm surprised that no one on this forum has mentioned that there are in fact different definitions of randomness, depending on the context in which you are working.

For example, there is a notion of statistical randomness. A numeric sequence is said to be "statistically random" when it contains no recognizable patterns or regularities. Note that a statistically random sequence is not necessarily "truly" random in the sense of objective unpredictability, since a deterministic process (such as the calculation of the digits of π) can generate a statistically random sequence (hence the utility of pseudorandom number generators).

There is another way in which randomness is defined in mathematics. The field of algorithmic information theory studies, among other topics, what constitutes a random sequence. The central idea is that a string of bits is random if and only if every computer program that can produce it is at least as long as the string itself (called Kolmogorov randomness, after the Russian mathematician Andrey Kolmogorov, who developed this definition; it was also developed independently by the American computer scientist Ray Solomonoff and the American mathematician Gregory Chaitin).
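Kolmogorov complexity itself is uncomputable, but an off-the-shelf compressor gives a crude, upper-bound-only illustration of the idea (zlib here is merely a stand-in for "a short program that reproduces the string", not part of the definition):

```python
import hashlib
import zlib

def patternless_bytes(n: int) -> bytes:
    """Deterministic but pattern-free bytes, built by chaining SHA-256."""
    out, block = b"", b"seed"
    while len(out) < n:
        block = hashlib.sha256(block).digest()
        out += block
    return out[:n]

patterned = b"ab" * 500              # 1000 bytes with an obvious pattern
unpatterned = patternless_bytes(1000)

print(len(zlib.compress(patterned, 9)))    # tiny: the pattern compresses away
print(len(zlib.compress(unpatterned, 9)))  # near 1000: nothing to exploit
```

The patterned string compresses to a few bytes (a short "program" reproduces it), while the pattern-free bytes barely compress at all, mirroring the compressible/incompressible distinction in the definition above.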

Last edited:

#### fbs7

Wow!!!! That is awesome!!! Information theory at its finest!

If you cannot compress a string (by calculating it with a program) then the string is random. Mind-blowing!!! So "abababababababababababababab" is not random, and neither is "abcdefghijklmnopqrstuvwxyz". I'm starting to like this Kolmogorov guy a little more!

Or not... hmm... say I give this string "azi843ad8$@$%po#$cd2904". Then if I could write a tiny program that calculated it, then I know it's not random. But if I can't write that program, how could I prove it's a random string by proving that nobody ever anywhere in the history of the Universe could write a small program that by doing adds/multiplications/etc... would calculate that string. Not liking Kolmogorov any more... he seems to have the habit of frying other people's brains!

#### FactChecker

Science Advisor
Gold Member
2018 Award
Wow!!!! That is awesome!!! Information theory at its finest!

If you cannot compress a string (by calculating it with a program) then the string is random. Mind-blowing!!! So "abababababababababababababab" is not random, and neither is "abcdefghijklmnopqrstuvwxyz". I'm starting to like this Kolmogorov guy a little more!

Or not... hmm... say I give this string "azi843ad8$@$%po#$cd2904". Then if I could write a tiny program that calculated it, then I know it's not random. But if I can't write that program, how could I prove it's a random string by proving that nobody ever anywhere in the history of the Universe could write a small program that by doing adds/multiplications/etc... would calculate that string. Not liking Kolmogorov any more... he seems to have the habit of frying other people's brains!
It seems that the rule that the program must be at least as large as the result works well as a definition, but proving such a thing for a given sequence is hard.

#### fbs7

Also how does one prove an infinite string is random through that rule? It seems to come down to computability... if that string can be calculated by a program (and I suspect programs cannot be infinitely big), then it won't be random...

Hmm... lots of room for esoteric thought, I suspect... say a sequence is defined through a formula, but that formula cannot be calculated exactly by a program, only approximately... say the irrational roots of a generic 5th-degree polynomial; I can't write a program that computes such a root exactly, so would that mean that the digits of that root are random? Ay-ay-ay... I'm liking Kolmogorov less and less every day!

#### SSequence

Hmm... lots of room for esoteric thought, I suspect... say a sequence is defined through a formula, but that formula cannot be calculated exactly by a program, only approximately... say the irrational roots of a generic 5th-degree polynomial; I can't write a program that computes such a root exactly, so would that mean that the digits of that root are random?
I think roots of 5th-degree polynomials would fall into the category of algebraic numbers (at least if the coefficients are rational) and hence into the domain of computable numbers (assuming the "decimal expansion" definition here). I suppose one could use any of the approximation methods (which are in fact usually implemented on actual computers, I think) that can calculate the decimal positions to arbitrarily high accuracy.
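For instance, the real root of x^5 - x - 1 = 0 (a standard example of a quintic not solvable in radicals) is still computable digit by digit. A minimal Newton-iteration sketch using Python's decimal module:

```python
from decimal import Decimal, getcontext

def quintic_root(digits: int) -> Decimal:
    """Real root of x**5 - x - 1 = 0 via Newton's method, to ~digits places."""
    getcontext().prec = digits + 10       # a few guard digits
    x = Decimal(1)
    for _ in range(200):                  # Newton converges quadratically
        x_new = x - (x**5 - x - 1) / (5 * x**4 - 1)   # x - f(x)/f'(x)
        if x_new == x:                    # converged at this precision
            break
        x = x_new
    return x

print(quintic_root(30))  # ≈ 1.16730...
```

The code is short and fixed in size, yet it can emit as many correct digits as requested, which is exactly why such a number is compressible (hence non-random) in the Kolmogorov sense.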

Last edited:

#### jbriggs444

Also how does one prove an infinite string is random through that rule? It seems to come down to computability... if that string can be calculated by a program (and I suspect programs cannot be infinitely big), then it won't be random...
You would fix the program (for instance, use a particular Universal Turing Machine) and cast the problem in terms of how much input would be required to produce a given amount of output. Then take the limit of the ratio of the input size to the output size as output size increases without bound.

If the output string is chosen from a uniform distribution (i.e. all prefixes of a given length are equiprobable) then one has all of the tools in Shannon's bag to demonstrate that the input size must be, on average, at least as large as the output size.

If one has an infinite string produced from a loaded die then, with probability 1, the limiting ratio of input size to output size will reflect the Shannon information content of the string.
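The Shannon content of a loaded die is easy to compute directly; a small sketch (the loading probabilities below are made up for illustration):

```python
from math import log2

def entropy_bits(probs) -> float:
    """Shannon entropy H = -sum p*log2(p), in bits per symbol."""
    assert abs(sum(probs) - 1) < 1e-12
    return -sum(p * log2(p) for p in probs if p > 0)

fair   = [1/6] * 6
loaded = [0.5, 0.1, 0.1, 0.1, 0.1, 0.1]  # hypothetical loading

print(entropy_bits(fair))    # log2(6) ≈ 2.585 bits per roll
print(entropy_bits(loaded))  # ≈ 2.161 bits: the loaded die carries less information
```

The loaded die's lower entropy is exactly the limiting input-to-output ratio described above: its rolls can, on average, be encoded in fewer bits than a fair die's.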

Last edited:

#### StatGuy2000

Also how does one prove an infinite string is random through that rule? It seems to come down to computability... if that string can be calculated by a program (and I suspect programs cannot be infinitely big), then it won't be random...

Hmm... lots of room for esoteric thought, I suspect... say a sequence is defined through a formula, but that formula cannot be calculated exactly by a program, only approximately... say the irrational roots of a generic 5th-degree polynomial; I can't write a program that computes such a root exactly, so would that mean that the digits of that root are random? Ay-ay-ay... I'm liking Kolmogorov less and less every day!
@jbriggs444 has suggested one way to prove an infinite string is random in post #74. Another way is due to the Swedish mathematician Per Martin-Löf, who proposed the following definition:

An infinite sequence is random if and only if it is not contained in any recursively enumerable null set (equivalently, it passes every Martin-Löf test).

If you want a more technical definition, please see the following Wikipedia article:

https://en.wikipedia.org/wiki/Algorithmically_random_sequence#Three_equivalent_definitions

#### SSequence

You would fix the program (for instance, use a particular Universal Turing Machine) and cast the problem in terms of how much input would be required to produce a given amount of output. Then take the limit of the ratio of the input size to the output size as output size increases without bound.
Yes, I think a definition based on incompressibility (for a sequence) is an interesting one. Though I am a bit fuzzy about the translation of terms between numbers and strings (also because I haven't really studied computability-based randomness)... since people almost always seem to cast information complexity in terms of strings, but it seems it could also be re-cast in terms of numbers (with small modifications).

Something else that seems interesting (unless I am missing something due to unfamiliarity with the topic at hand) is that, if we are talking about numbers, an appropriate version of the Busy Beaver function will apparently be compressible (in a rather pronounced way).

P.S.
What is interesting is that what you wrote is essentially what I posed as a question in the other thread (with the necessary modification, due to the nature of the construction, of replacing a universal program with an ordinal universal program). I should have seen the problem anyway (even without information complexity) for the reason I mentioned in that thread (since applying such a method to a very small ordinal like $\omega_{CK}$ should be a sure sign of a problem with the construction).

#### Mark44

Mentor
Though I am a bit fuzzy about translation of terms between numbers and strings
The strings that are mentioned by most of the people responding are strings of bits -- 0s and 1s.

#### SSequence

Yes, I understand this. In the same way we could basically take any "positive number" to be a regular expression of the form:
(1+2+3+4+5+6+7+8+9)(0+1+2+3+4+5+6+7+8+9)*
where the alphabet is obviously the set {0,1,2,3,4,5,6,7,8,9}.

What I meant was the usage of the same terms (compressibility, Kolmogorov complexity) across different computational models (the models normally assumed are ones that manipulate strings), involving "strings" or "numbers". Since I haven't studied this topic, I was just saying I am unsure how these (presumably robust) concepts translate between the different languages (those using strings or those using numbers, so to speak).

P.S. To be a bit more concrete: imagine the difference between "concatenation" and "add 1". Both can of course be seen as operations on strings. But "add 1" is a bit complicated when translated into a purely string-manipulating language.
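The "add 1" point can be made concrete: here is a sketch of increment implemented purely as string manipulation on binary numerals, where carry propagation is what makes it less trivial than concatenation:

```python
def successor(bits: str) -> str:
    """Add 1 to a binary numeral, using only string operations."""
    result = []
    carry = True
    for b in reversed(bits):
        if carry and b == "1":
            result.append("0")        # 1 + carry -> 0, keep carrying
        elif carry and b == "0":
            result.append("1")        # 0 + carry -> 1, carry absorbed
            carry = False
        else:
            result.append(b)          # no carry left: copy the bit
    if carry:
        result.append("1")            # carry out of the top bit
    return "".join(reversed(result))

print(successor("1011"))  # 1100
print(successor("111"))   # 1000
```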

Last edited:

#### fbs7

I think 5-th degree polynomials would fall into category of algebraic numbers (at least if the coefficients are rational) and hence into domain of computable numbers (assuming the "decimal expansion" def. here). I suppose one could use any "approximation methods" of solution (that are in fact usually implemented on actual computers I think) that can calculate the decimal positions to arbitrary high accuracy.
Correct, so the approximation method can calculate any digit of a root of a polynomial, and the program that calculates it is indeed limited in code size, but it will be unbounded in memory, because the numbers involved in the calculations get bigger and bigger. So, if we want to calculate a root of a generic polynomial to 10^100 digits, we'd need calculations involving several numbers with 10^100 digits, and the memory for such a computer would be bigger than the Universe (and the variables involved would be bigger than the output string).

I have no idea if the Kolmogorov size criterion includes both code and memory, but if it includes memory then such a calculation would seem suspiciously close to meeting Kolmogorov's criterion for accepting as random something that we know is deterministic.

And I don't even want to start wondering about programs that we don't know whether they halt. Say we write a program that calculates X and then prints L = "Hilbert Conjecture is " + X. We have no idea if that will print "is true" or "is false". So, I know that the string "Bananas are yellow" is not random, but we don't know the value of the sequence L, and we don't know if we can write a program (whatever the size) that calculates it. So if we can't write that program, then that string is random per Kolmogorov; but then if we ever find a way to calculate it, it won't be random anymore, and meanwhile it seems absurd to say the truth value of a mathematical hypothesis is random just because we don't know it. Holy moshes, this gives me headaches - darn you Kolmogorov, taking away my sleep!!!

#### SSequence

Correct, so the approximation method can calculate any digit of a root of a polynomial, and the program that calculates it is indeed limited in code size, but it will be unbounded in memory, because the numbers involved in the calculations get bigger and bigger. So, if we want to calculate a root of a generic polynomial to 10^100 digits, we'd need calculations involving several numbers with 10^100 digits, and the memory for such a computer would be bigger than the Universe (and the variables involved would be bigger than the output string).
The abstract definition of a program always assumes "infinite memory". All "real world" programs are of course really finite automata in a sense (unless we somehow assume the ability to attach bigger and bigger memories as required). But thinking about them in an abstract way (as idealised programs) is often quite helpful.

#### Mark44

Mentor
Yes, I understand this. In the same way basically we could take any "positive number" to be a regular expression of form:
(1+2+3+4+5+6+7+8+9)(0+1+2+3+4+5+6+7+8+9)*
where the alphabet is obviously the set {0,1,2,3,4,5,6,7,8,9}.
We could, but we don't need to. Any positive number (or, for that matter, any negative number or floating-point number) can be represented by a string of 0s and 1s. At heart, any data in a computer is composed of binary strings.
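For instance (a trivial sketch; struct's big-endian single-precision packing is just one common encoding):

```python
import struct

def int_bits(n: int) -> str:
    """Binary numeral of a non-negative integer."""
    return format(n, "b")

def float_bits(x: float) -> str:
    """IEEE-754 single-precision bit pattern, via struct."""
    (packed,) = struct.unpack(">I", struct.pack(">f", x))
    return format(packed, "032b")

print(int_bits(10))     # 1010
print(float_bits(1.0))  # 00111111100000000000000000000000
```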

"What's the meaning of "random" in Mathematics?"

### Physics Forums Values

We Value Quality
• Topics based on mainstream science
• Proper English grammar and spelling
We Value Civility
• Positive and compassionate attitudes
• Patience while debating
We Value Productivity
• Disciplined to remain on-topic
• Recognition of own weaknesses
• Solo and co-op problem solving