"Randomness Through Entropy" Paradox

rjbeery · Dec 28, 2015

In Information Theory, entropy is defined as the unpredictability of information content and, as such, the entropy of the output from so-called pseudo random number generators (PRNG) is often measured as a test of their "randomness". An interesting paradox arises with this definition...

Start with a suitable seed, [itex]S_n[/itex], consisting of an array of n previously generated pseudo-random numbers from the function [itex]PRNG(seed)[/itex]. In order to maximize the randomness of [itex]PRNG[/itex] we want [itex]PRNG(S_n)[/itex] to return a result such that the entropy of [itex]S_{n+1}[/itex] is also maximized. Such a result can be distinct, given a suitable seed. However, in an effort to maximize the randomness of [itex]PRNG[/itex] we have now created a process which is deterministic and completely predictable, which directly contradicts the proclaimed unpredictability of information content through high entropy.

Dale · Dec 28, 2015

Hence the P in PRNG.

rjbeery · Dec 28, 2015

Agreed, I just find it philosophically interesting that a "perfect" RNG is not the "best" RNG.

WWGD · Dec 28, 2015

rjbeery said:

In Information Theory, entropy is defined as the unpredictability of information content and, as such, the entropy of the output from so-called pseudo random number generators (PRNG) is often measured as a test of their "randomness". An interesting paradox arises with this definition...

Start with a suitable seed, [itex]S_n[/itex], consisting of an array of n previously generated pseudo-random numbers from the function [itex]PRNG(seed)[/itex]. In order to maximize the randomness of [itex]PRNG[/itex] we want [itex]PRNG(S_n)[/itex] to return a result such that the entropy of [itex]S_{n+1}[/itex] is also maximized. Such a result can be distinct, given a suitable seed. However, in an effort to maximize the randomness of [itex]PRNG[/itex] we have now created a process which is deterministic and completely predictable, which directly contradicts the proclaimed unpredictability of information content through high entropy.

Sorry, I am not clear, what is PRNG(S_n) , what does PRNG do to the sequence S_n?

Dale · Dec 28, 2015

rjbeery said:

Agreed, I just find it philosophically interesting that a "perfect" RNG is not the "best" RNG.

What do you mean? The P stands for "pseudo", not "perfect".

_PJ_ · Dec 29, 2015

Yes, computer algorithms including RNG's by their nature perform a sequence of logical steps (often given an input) to arrive at a given result which can then be output.
This precise orderly sequence necessitates a level of predictability.

Randomnisity is defined as non-predictable.

Therefore, there is no means (excluding perhaps some clever use of qubits?) for a computer to generate random numbers.

This isn't really any paradox, just 'bad' terminology of t the word Random - although that's why there is a prefix of "PSEUDO".
_

PRNGs are, just as any computer models, merely simulations. Whilst they may not actually really be providing random numbers per definition, the results (at least, generally for practical purpose) are suitable and indistinguishable from expected actual random results, therefore, in that respect, they are indeed perfect.

rjbeery · Dec 29, 2015

WWGD said:

Sorry, I am not clear, what is PRNG(S_n) , what does PRNG do to the sequence S_n?

PRNGs in general take an input and return a value; it's just a function whose output is to be used when we want unpredictability. In this case, PRNG is taking some seed consisting of an array of n numbers (e.g. 3, 4, 2, 7, 7, 6...0, 1, 9) and, in order to produce the "theoretically least predictable value possible", chooses the one which would produce the highest entropy for [itex]S_{n+1}[/itex] (e.g. 3, 4, 2, 7, 7, 6...0, 1, 9, 8).

The paradox is that by choosing the theoretically least predictable value (by maximizing the entropy of the resulting array) we have come to a single solution, which makes it perfectly predictable.

rjbeery · Dec 29, 2015

_PJ_ said:

Randomnisity is defined as non-predictable.

I'm not making some profound statement here, it's more of a playful philosophical note. In order to have "randomnisity" you must be able to quantify it, and that is generally done by measuring entropy. By that specific measure, a system which reliably produces the least predictable value possible actually becomes perfectly predictable.

This doesn't really need to apply to PRNGs, it could be any source whatsoever. We notice that some background EM noise is consistently producing data at maximum entropy* and all of a sudden we could predict it perfectly.

* This entire premise is assuming that entropy can be reliably quantified and maximized by a distinct output. I don't know about the former but the latter is not always the case, as there may be multiple outputs resulting in an equal entropy.

Dale · Dec 29, 2015

I see what you are getting at now, but it is impossible by definition. What you are suggesting is a self contradiction.

Suppose we have a 1 bit RNG. Suppose further that P(0)=0.9 and P(1)=0.1. So a 0 provides 0.1 nats of entropy and a 1 provides 2.3 nats of entropy. So your proposal would be to always return a 1, but this contradicts the assumption that P(1)=0.1.

Consider an average 10 digit message with this RNG. The expected entropy would be 3.2 nats. In contrast, an unbiased RNG provides 0.69 nats for either a 0 or a 1 so the expected entropy for a 10 digit message is about 6.9 nats.

So the total entropy is maximized only when all outcomes are equally likely and therefore there is no entropy maximizing outcome.

rjbeery · Dec 29, 2015

DaleSpam said:

I see what you are getting at now, but it is impossible by definition. What you are suggesting is a self contradiction.
Suppose we have a 1 bit RNG. Suppose further that P(0)=0.9 and P(1)=0.1. So a 0 provides 0.1 nats of entropy and a 1 provides 2.3 nats of entropy. So your proposal would be to always return a 1, but this contradicts the assumption that P(1)=0.1.

But you're only measuring the entropy of a single result, not the seed plus the new result. I'm not familiar with how to measure the entropy of data from a biased source but I would imagine that the entropy of [itex]S_{n+1}[/itex] = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1] would be higher than [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] in your example.

DaleSpam said:

Consider an average 10 digit message with this RNG. The expected entropy would be 3.2 nats. In contrast, an unbiased RNG provides 0.69 nats for either a 0 or a 1 so the expected entropy for a 10 digit message is about 6.9 nats.

So the total entropy is maximized only when all outcomes are equally likely and therefore there is no entropy maximizing outcome.

Agreed on this point, but you specifically limited it to a 10 digit message which is why I mentioned a "sufficient seed". Imagine a million digit message in which 8's are grossly underrepresented.

rjbeery · Dec 29, 2015

After typing that it occurred to me that the proposed RNG would be predictable right up to the point that the "most recent seed's entropy" could no longer be increased (such as in your example). This is probably actually proving the link between entropy and unpredictability but I'm currently too hungry to think about it, time for lunch.

Dale · Dec 29, 2015

rjbeery said:

But you're only measuring the entropy of a single result, not the seed plus the new result.

I was talking about a RNG, not a PRNG. A RNG doesn't have a seed.

For a PRNG the best possibility is that the entropy of the seed is evenly distributed over the entropy of the output. If your seed has no entropy then the output is fully deterministic and has no entropy either. A PRNG cannot manufacture entropy, all it can do is spread it out evenly.

rjbeery said:

I would imagine that the entropy of ##S_{n+1}## = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1] would be higher than [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] in your example.

Yes. The first would be about 3.2 nats and the second would be 1 nat.

Stephen Tashi · Dec 29, 2015

rjbeery said:

In order to maximize the randomness of PRNG we want PRNG(Sn) to return a result such that the entropy of Sn+1 is also maximized.

How do you define entropy without having a situation where there is probability? If you have a particular sample of a random process you could define an estimate of the entropy of the process as a calculation done on the data, but an estimate of a statistical property of a population isn't the same as the property itself.

Hornbein · Dec 29, 2015

_PJ_ said:

Therefore, there is no means (excluding perhaps some clever use of qubits?) for a computer to generate random numbers.

Simply hook it up to a white noise generator. This was tried in the past, but PRNGs gave better results. Besides, it is very useful to be able to repeat the run exactly. Truly random numbers are a hassle.

It is possible to buy a random tone generator for $50 or so. It gets boring quickly. It has a quite distinctive sound, which you would recognize from soundtracks. It is used to signify hi-tech situations.

rjbeery · Dec 30, 2015

Stephen Tashi said:

How do you define entropy without having a situation where there is probability? If you have a particular sample of a random process you could define an estimate of the entropy of the process as a calculation done on the data, but an estimate of a statistical property of a population isn't the same as the property itself.

I'm straddling disciplines here between Info Theory and Computer Science. The probability comes from the string of digits in the seed under the presumption that there is no bias in their appearance.

chiro · Dec 30, 2015

You also have to understand that entropy - like any representation in any language has a basis.

In one basis the entropy can be quite low.

For example - if you know an algorithm (like a pseudo-random number generator) then you can construct a basis where the entropy of that system is very tiny.

But under another basis it can be extremely large and be analyzed as completely random.

You should look at Kolmogorov Complexity and understand that even that (and the computational paradigm it uses) also has a basis.

atyy · Jan 1, 2016

rjbeery said:

Agreed, I just find it philosophically interesting that a "perfect" RNG is not the "best" RNG.

That is well known, and a fundamental assumption in physics. It arises because the fundamental laws of classical physics are deterministic, yet we believe in the second law of thermodynamics. The situation in quantum mechanics is less certain, but at least up to QED, one can say that there is conceivably no true randomness in quantum mechanics either. However, as long as we cannot signal faster than light, then we cannot access the deterministic variables that conceivably underlie QED, and quantum mechanics can certify randomness (via a Bell inequality violation as in http://arxiv.org/abs/0911.2504).

_PJ_ · Jan 15, 2016

Hornbein said:

Simply hook it up to a white noise generator.

Yes but this is NOT the computer generating random numbers, it's merely translating them.

Besides, it is very useful to be able to repeat the run exactly.

To obtain the same sequence? If this is the case then this is just a substitute for any other pseudo generator, and there is an algorithm inputting the seed and outputting results.

I'm not trying to be a smartarsed pedant about this, only that given the nature of the topic I feel it really is a critical distinction.

Hornbein · Jan 15, 2016

_PJ_ said:

Yes but this is NOT the computer generating random numbers, it's merely translating them.

To me a computer is a pile of hardware and software. IMO there is no requirement that the random number be generated digitally or algorithmically. You are free to use a differing definition of a computer.

_PJ_ said:

To obtain the same sequence? If this is the case then this is just a substitute for any other pseudo generator, and there is an algorithm inputting the seed and outputting results.

I was pointing out an advantage of pseudorandom numbers.

_PJ_ · Jan 15, 2016

Hornbein said:

To me a computer is a pile of hardware and software. IMO there is no requirement that the random number be generated digitally or algorithmically. You are free to use a differing definition of a computer.

What do you mean "digitally"?
My point had nothing to do with the definition of a computer, only that the random numbers were not generated by a computer. Computers cannot generate random numbers.
I had assumed the WNG perhaps acquired some form of random input in order to produce the results which your computer then translated to numbers.

If I am correct now in that you meant the WNG actually generates the data then again, these are not random.

[/quote]I was pointing out an advantage of pseudorandom numbers.[/QUOTE]
And yet the whole topic is with regards to

Hornbein · Jan 15, 2016

_PJ_ said:

Computers cannot generate random numbers.

Oh. Well I guess that settles it.

"Randomness Through Entropy" Paradox

Discussion Overview

Discussion Character

Main Points Raised

Areas of Agreement / Disagreement

Contextual Notes

Similar threads

Graduate Hypothesis testing: Defining H0, HA hypotheses so that ( H_A)_A' makes sense

Undergrad My basic understanding of set theory

Undergrad How do E[X] and E[|X|] relate?

Graduate Expected numbers of cards of a last color remaining

Undergrad Understanding permutations and combinations in a coin toss experiment

Insights Remote Operated Gate Control System

Insights AI Enriched Problem Solving

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight