How to distinguish the decimal expansions of irrational numbers from random numbers?

sysprog · May 4, 2018

How do we distinguish the decimal expansions of irrational numbers, and products thereof, from random sequences?

Is
an arbitrarily specified (not claimed to be perfectly randomly selected) numeric string,
e.g.
the 10^10th to 10^19th digits of the decimal extraction of the square root of 2.2, IFFd against (modulo 2 binarily subtracted from) or XORd against (modulo 2 binarily added to) a same-length sequence of some other cut from some similarly generated decimal expansion,
just as random-appearing
as is,
e.g.
an XOR of a digitization of 2 microwave background radiation streams taken from 2 different radio telescope arrays
?

The expansion products can be algorithmically specified briefly, while the observation products cannot be definitively expressed in brief.

If I don't tell anyone how I generated a long number sequence, and it isn't psuedorandom in the sense of using a short random seed to generate a long PRN that after a long period repeats, unless my generation routine has some predictability, how could it be discerned given only its output?

Might that idea have an applicability to strong cryptography with manageably short keys?

Could I transmit a brief specification of an arbitrarily long irrational number product decimal expansion sequence, and thereby produce a key for a symmetric encryption/decryption surjective function, that was hard-to-guess based on how hard to guess my generation method was, or might my pattern render the generation procedure specification, based on its short length, more easy to infer than the generated sequence would be to guess?

.Scott · May 4, 2018

Of course, there would be a significant technology problem in generating "the 10^10th to 10^19th digits of the decimal extraction of the square root of 2.2".
Enough so, that it would defeat its use as a cryptographic one-time pad - or other key.

Other than that, there are many useful definitions of "random" and among them is whether the random sequence can be reproduced with an algorithm that is smaller than the sequence. This makes the definition of the algorithm a kind of password - but for many of us, it is easier to remember a 100-character algorithm than a 10-character password.

sysprog · May 4, 2018

.Scott said:

Of course, there would be a significant technology problem in generating "the 10^10th to 10^19th digits of the decimal extraction of the square root of 2.2".
Enough so, that it would defeat its use as a cryptographic one-time pad - or other key.

Other than that, there are many useful definitions of "random" and among them is whether the random sequence can be reproduced with an algorithm that is smaller than the sequence. This makes the definition of the algorithm a kind of password - but for many of us, it is easier to remember a 100-character algorithm than a 10-character password.

Questions I'm pondering here include whether transformed and combined sequences of decimal expansions of irrational numbers might, due to their computational generability, have a recognizability that distinguishes them from similarly transformed and combined recordings of outputs of natural random processes, and if so, how?

FactChecker · May 4, 2018

For many applications, you would need to be sure that the pseudorandom sequence behaves like a true random sequence in all reasonable statistical tests. That is not something that you can be sure of with a randomly selected irrational number. There has been a great deal of work done to make pseudorandom number generators with known properties. The more sophisticated statistical tests are hard to beat. Replacing a good pseudorandom number generator with the digits of a "luck of the draw" irrational number is a mistake.

sysprog · May 4, 2018

FactChecker said:

For many applications, you would need to be sure that the pseudorandom sequence behaves like a true random sequence in all reasonable statistical tests. That is not something that you can be sure of with a randomly selected irrational number. There has been a great deal of work done to make pseudorandom number generators with known properties. The more sophisticated statistical tests are hard to beat. Replacing a good pseudorandom number generator with the digits of a "luck of the draw" irrational number is a mistake.

If a number sequence fails randomness tests, that may or may not increase the predictability of its successor sequences. A tongue-in-cheek remark by Douglas Hofstadter (a version of this is attributed to Feynman):

I myself once learned 380 digits of π, when I was a crazy high-school kid. My never-attained ambition was to reach the spot, 762 digits out in the decimal expansion, where it goes "999999", so that I could recite it out loud, come to those six 9's, and then impishly say, "and so on!"

— Douglas Hofstadter, Metamagical Themas

FactChecker · May 4, 2018

sysprog said:

If a number sequence fails randomness tests, that may or may not increase the predictability of its successor sequences.

"may or may not" is not always good enough. If you use a sequence that has a pattern, it will be broken.

sysprog · May 4, 2018

FactChecker said:

"may or may not" is not always good enough. If you use a sequence that has a pattern, it will be broken.

I'm not sure that all deterministic processes produce outputs in which patterns are discernible.

FactChecker · May 4, 2018

sysprog said:

I'm not sure that all deterministic processes produce outputs in which patterns are discernible.

Right. Some do and some don't. That is why it has been studied so much. There are many books and research papers on the subject. If one advocates a method, he should be prepared to show that there will be no detectable pattern in the numbers that will be used in an appication. That is more difficult than showing that any pattern will be broken in the long run. It must also be shown for the short run of a typical application.

DrZoidberg · May 4, 2018

sysprog said:

Questions I'm pondering here include whether transformed and combined sequences of decimal expansions of irrational numbers might, due to their computational generability, have a recognizability that distinguishes them from similarly transformed and combined recordings of outputs of natural random processes, and if so, how?

A truly random natural process could produce any arbitrary sequence of digits. So no matter which irrational number you pick, if you select x digits from that number they could by pure coincidence look exactly the same as some subsequence of the digits produced by said natural process.
And the other way around any sequence produced by that natural process could also be called an irrational number and it's always possible to find an algotithm that will produce that sequence.
You could however ask how complex that algorithm has to be. How complex is the simplest possible algorithm that will produce those digits. That could serve as a measure for the cryptographic strength of that sequence.

sysprog · May 4, 2018

DrZoidberg said:

A truly random natural process could produce any arbitrary sequence of digits. So no matter which irrational number you pick, if you select x digits from that number they could by pure coincidence look exactly the same as some subsequence of the digits produced by said natural process.

Yes.

And the other way around any sequence produced by that natural process could also be called an irrational number and it's always possible to find an algotithm that will produce that sequence.

Irrational numbers are not random, and if a given number sequence is originally of random derivation, any afterthought algorithm used to definitely produce exactly that given number would have to be specific to that number, and therefore non-random.

You could however ask how complex that algorithm has to be. How complex is the simplest possible algorithm that will produce those digits. That could serve as a measure for the cryptographic strength of that sequence.

I'm seeking to get at how discernible a generation function might be by analysis of its output, given that the output is non-repeating, and satisfies at least the usual battery of randomness tests.

Baluncore · May 6, 2018

Any limited sequence you use will presumably be present in every irrational number, if you search for long enough.

A cipher weakness is never obvious, until you find it. Patching a weakness leads to a step-wise evolution of the system, that can be climbed by a cryptanalyst.

As an example; With the exception of the perfect squares, the square roots of integers are all irrational. But when those roots are written as continued fractions, the terms always recur, in a predictable way.

Once you are convinced that the Pseudo Random Binary Sequence from a Linear Feedback Shift Register is secure, because it will not repeat during use, you should study the Berlekamp–Massey algorithm. That should quickly change your conviction when it generates a minimum equivalent polynomial that correctly extrapolates your sequence.
https://en.wikipedia.org/wiki/Berlekamp–Massey_algorithm
https://en.wikipedia.org/wiki/Linear-feedback_shift_register

FactChecker · May 6, 2018

Baluncore said:

Any limited sequence you use will presumably be present in every irrational number, if you search for long enough.

That sounds wrong. What about 0.101001000100001000001000000100000001...? I think that must be irrational. (Right now, I don't see how to prove that so maybe I am deceived.) I'm not very familiar with trancendental numbers; maybe they have the property that you asserted.

Vanadium 50 · May 6, 2018

I don't think "random" and "non-random" are sufficiently well defined. I believe the property that you are trying to get at is normality.

sysprog · May 6, 2018

Vanadium 50 said:

I don't think "random" and "non-random" are sufficiently well defined. I believe the property that you are trying to get at is normality.

It's important, I think, to distinguish between a deterministic process that produces sequences that satisfy a standard battery of randomness tests, and a non-deterministic process that also does that.

Regarding normality, direct bias and other disproportionate probabilities can be removed by post-conditioning a stream from either kind of process.

From a cryptanalytic point of view, how do we recognize that a string was generated by an obscured deterministic function, e.g. by hashing irrationals against other irrationals, and not by sampling of noise from a raw entropy source?

How, for example, might a cryptanalyst detect that some deterministic generative function, perhaps one as simple to specify as √2 ⊕ √3 , had been used to produce a string, given only the string?

If there is an always present and apprehensible commonality among deterministically produced strings that would allow reliably discerning whether some deterministic procedure had been used or not, without necessarily allowing full exposition of the generative function, but at least allowing reliable distinguishing between deterministic and non-deterministic generative function, that would be a useful addition to a battery of randomness tests.

Baluncore · May 6, 2018

Vanadium 50 said:

I believe the property that you are trying to get at is normality.

Baluncore said:

Any limited sequence you use will presumably be present in every irrational number, if you search for long enough.

I am alluding to the Infinite Monkey Theorem. I should change my first statement to read;
“Any limited sequence you use will presumably be present in every normal irrational number, if you search for long enough.”

sysprog said:

How, for example, might a cryptanalyst detect that some deterministic generative function, perhaps one as simple to specify as √2 ⊕ √3 , had been used to produce a string, given only the string?

That would be very difficult given only a short historical sequence. But when needed, that difficulty is circumvented by the "practical cryptanalysis" process, which gathers additional information by employing traitors, safe crackers and hackers to identify the systems employed.

sysprog · May 8, 2018

Baluncore said:

I am alluding to the Infinite Monkey Theorem. I should change my first statement to read;
“Any limited sequence you use will presumably be present in every normal irrational number, if you search for long enough.”

The quoted statement is equivalent to a statement that normal numbers are disjunctive, which is true, but there's no proof or disproof that any algebraic irrational number is normal.

In searching around, I found this expository paper with some insights of Borel and others regarding normal numbers.

It includes the provocative (to me at least) statement: "Finite-state, finite-time random number generators do not exist.".

Baluncore · May 8, 2018

sysprog said:

It includes the provocative (to me at least) statement: "Finite-state, finite-time random number generators do not exist.".

If two identical “finite-state, finite-time” machines are constructed then, given the same initial conditions, they should be expected to produce two identical output streams. They are 'pseudo-random', not 'truly-random' number generators. The word "deterministic" comes to mind. Philosophically, is it is the presence of small immeasurable influences that makes the difference between truly-random and pseudo-random?

Pseudo-random generators can be used to greatly extend the length of a short initial key distributed over a private channel, but the greater the extension, the greater the weakness. Cryptographically, the use of PRNGs comes down to a trade between security and the cost of key distribution.

Quantum entanglement may lead to an improved distribution system, but who is to say that the strings that entangle two quantum states do not also pass through the enemies domain.

The earlier example of the bitstream generated by √2 ⊕ √3 may be unfathomable at first sight. But there is a weakness in that ⊕ is a bitwise multiply of two bit streams. But the irrational √2 * √3 = √6, which is also irrational, but the square is exactly the composite integer 6. Investigating bit serial algorithms that generate √n as bit streams, then blending them with an ⊕ may well lead to a total collapse of the complexity and security.

I fear simplicity and complexity. If it is simple, it will be easier to crack. If it is complex, the key distribution and cipher clerks and will make mistakes, then while sorting out their mess, will give more of the game away.

sysprog · May 9, 2018

⊕ (XOR) is modulo 2 binary addition.

Baluncore · May 9, 2018

sysprog said:

⊕ (XOR) is modulo 2 binary addition.

I agree that XOR can make one half of a binary half adder, but cryptographically it is better to think of XOR as a simple multiply because it is a non-linear conditional inverter. It generates frequency components in the output bit stream not present in the input streams. That is a characteristic of multiplication rather than addition.

sysprog · May 9, 2018

Baluncore said:

I agree that XOR can make one half of a binary half adder, but cryptographically it is better to think of XOR as a simple multiply because it is a non-linear conditional inverter. It generates frequency components in the output bit stream not present in the input streams. That is a characteristic of multiplication rather than addition.

XOR means one or the other but not both, so:

P ⊕ K ⇒ E

Code:

  P          K         E
b'0000' ⊕ b'1111' ⇒ b'1111' (encrypted text plus knowledge of the operation tells us that the key is the complement of the plaintext)
b'1010' ⊕ b'0101' ⇒ b'1111' (again all we can tell from E is that K is the complement of P)
b'1111' ⊕ b'1111' ⇒ b'0000' (b'1' + b'1' mod 2 = b'0' -- the output still tells us nothing about either K or P except that they're complementary)

and unlike multiplication, it's a symmetrically reversible operation, so you don't have to tell the routine whether it's encrypting or decrypting:

E ⊕ K ⇒ P

Code:

  E          K         P
b'1111' ⊕ b'1111' ⇒ b'0000'
b'1111' ⊕ b'0101' ⇒ b'1010'
b'0000' ⊕ b'1111' ⇒ b'1111'

and if you have P and E, you can get K:

P ⊕ E ⇒ K

Code:

  P          E         K
b'0000' ⊕ b'1111' ⇒ b'1111'
b'1010' ⊕ b'1111' ⇒ b'0101'
b'1111' ⊕ b'0000' ⇒ b'1111'

If you use a truly (non-deterministically generated normal etc.) random number string for one of the inputs, the output string will not have more predictability, even if the other input string is non-random. This is easily seen if you XOR such a random number string against a simply non-random number string such as all 1s or all 0s.

XNOR (IFF) is the inverse of XOR, and exhibits the same properties inversely, i.e. it produces the 1s complement of what XOR produces.

Baluncore · May 9, 2018

sysprog said:

XNOR (IFF) is the inverse of XOR, and exhibits the same properties inversely, and is mod 2 binary subtraction; not division.

I did not come down with the last shower. I think you have a lot to learn.

sysprog · May 9, 2018

Please excuse me. That sentence in my previous post is now corrected. Thanks for your careful reading.

sysprog · May 11, 2018

I'm past the time limit for editing my prior posts in this thread, and I wanted to add a reference link to one of them, so I'll reiterate the post and add the reference link:

sysprog said:

⊕ (XOR) is modulo 2 binary addition.

https://en.wikipedia.org/wiki/XOR_cipher

Baluncore · May 11, 2018

Given two signals or data streams, there are two distinct ways of merging those signals into one stream. Firstly, there are audio mixers where the two signals are added in time to produce a signal that contains the sum of the signals and the simple sum of the frequency spectra. Secondly, there are frequency mixers where the two signals are multiplied together to produce sum and difference frequencies in addition to the original signal frequencies.

It is obvious that confusion of the spectral characteristics, by employing multiplication to spread the energy over the whole channel spectrum, will make cryptanalysis significantly more difficult than in the much simpler case of addition.

Direct sequence spread spectrum employs multipliers or XOR gates to scatter many signals into one signal channel. Knowing the spreading sequence of each signal makes it possible to reverse the process and extract the signals at the other end. It is difficult to read or jam a spread spectrum signal because the jammer sequence or carrier gets spread, while the signal is unspread by the correct key sequence.

To sum it all up, multiplication spreads the spectrum, addition does not.

Zero and One are only symbols that sometimes represent true and false. They take the meaning or numeric values they are assigned for any particular application. An analogue signal can be digitised to a 1 bit stream using a single comparator to produce the sign bit of two's complement data; Let 0 represent all positive values, assign it the numeric value +1; and let 1 represent all negative values, assign it the numeric value –1.
The XOR function then performs a perfect reversible multiplication, without any carry to ignore.

Code:

0 ⊕ 0 = 0, likewise +1 * +1 = +1
0 ⊕ 1 = 1, likewise +1 * –1 = –1
1 ⊕ 0 = 1, likewise –1 * +1 = –1
1 ⊕ 1 = 0, likewise –1 * –1 = +1

The XOR used like that is properly seen as a multiplication because, unlike addition, it performs a one bit wide multiplication that spreads the spectrum.

You can insist that XOR looks like modulo 2 addition to you, but it certainly does not look like addition in signal systems when I do the statistics and spectrum analysis. Just because the XOR gate can be used to make the least significant bit of a half-adder, does not make an XOR gate a linear adder in a cryptosystem. There it behaves like a multiplier, as a non-linear frequency spreading mixer, or as a direct sequence modulator.

It can get interesting when investigating the statistical analysis of data streams merged by XOR gates. We can consider the probability of input 'a' being true as the value a, being a real number between zero and one. Likewise the probability of an independent input 'b' being true takes the value b.
If c = a XOR b, then what is the probability of 'c' being true ?

From the XOR logic you can generate; c = b * ( 1 – a ) + a * ( 1 – b ),
which, being the sum of two multiplies, can be written as c = a+b – 2ab.

The a+b term looks like addition, while the –2ab term looks like multiplication.
Notice that if either input is quite unknown, with p(0.5), then c = 0.5 just as expected.
Here is a simple probability table.

Code:

XOR  b 0.00 0.25 0.50 0.75 1.00
  a
0.00   0.00 0.25 0.50 0.75 1.00
0.25   0.25 0.38 0.50 0.63 0.75
0.50   0.50 0.50 0.50 0.50 0.50
0.75   0.75 0.63 0.50 0.38 0.25
1.00   1.00 0.75 0.50 0.25 0.00

The strength of XOR in cryptography comes from the spectral confusion generated by multiplication. That is derived from the non-linear frequency mixing 2ab component of the probability, which is simply DC offset by the a+b term.

If like a Mumpsimus, you continue to argue that the XOR gate in a cryptosystem is actually an adder then you might as well shoot yourself in the foot, because by denying the multiplication required to spread the spectrum, you are also crippling your understanding and reasoning of the XOR process in cryptosystems.

sysprog · May 11, 2018

Excerpted from https://en.wikipedia.org/wiki/GF(2):

I recognize that a thing doesn't have to be easy to see for you to be able to see it; however, it's easy, from the Definition section in that excerpt, to see why I would say that XOR is modulo 2 binary addition, and that bitwise multiplication is AND -- I'm in agreement with the abstract algebra mainstream on that.

Even so, that doesn't negate the value of your observation to the effect that we can (by, in your example, using a comparator to obtain the sign bit of twos complement data) take the high and low thirds of a value spectrum and call them +1 and -1 respectively, so we can use multiplication instead of modular addition and produce the same logical result.

However, that requires a three-valued system with the middle value ignored, which is functionally equivalent to using binary modular addition -- i.e. ignoring the zero range in between the +1 and -1 thresholds is viewable as informationally equivalent to discarding the carry.

So you an' me now, we got to go down to the beach an' box, ' cause y'all called me a Mumpsimus.

I liked your post and its approach. I gave it a couple of re-reads. Thanks.

.Scott · May 29, 2018

FactChecker said:

That sounds wrong. What about 0.101001000100001000001000000100000001...? I think that must be irrational. (Right now, I don't see how to prove that so maybe I am deceived.) I'm not very familiar with transcendental numbers; maybe they have the property that you asserted.

It is easy to prove that it is not rational. If I give you a bit string of N bits, the first and last are 1 and all the others are zero, then you will always be able to tell me the one place in your number where this pattern occurs. This is true no matter how large N is and no matter how deep into the number you go. Thus, the digits do not ever begin repeating. Thus it is irrational.

Thus your point was well-made, since clearly that decimal number does not contain many patterns.

As for whether it is transcendental, all I can say is that it doesn't look it: ##10^{-1} + 10^{-3} + 10^{-6} + 10^{-10} + ... + 10^{-(N(N+1)/2)} +...##
I'm not going to try to make that a solution to a finite polynomial - but I'd bet there is one.

How to distinguish the decimal expansions of irrational numbers from random numbers?

Attachments

1. How can I tell if a decimal expansion is irrational or random?

2. Can irrational numbers have repeating decimals?

3. Are there any shortcuts or tricks to quickly identify irrational numbers?

4. Are all non-repeating decimals irrational?

5. Can irrational numbers be written in any other forms besides decimal expansions?

Similar threads

Hot Threads

Recent Insights