# Is random.randrange() really random?

1. Dec 24, 2015

### ChrisVer

I was wondering how I could check how random the random.randrange() method of Python's random package (or class?) is...
How could I do that by building a macro?
Maybe by testing two hypotheses: it is a good random generator vs. it is not?

2. Dec 24, 2015

### Staff: Mentor

There is the random module (Lib/random.py), which contains the Random class.
The documentation (v3.4.2) contains this warning in its description of the random module:
Regarding randrange(), the docs say this:
I don't know what you mean by "building a macro".

Last edited by a moderator: May 7, 2017
3. Dec 24, 2015

### Staff: Mentor

4. Dec 24, 2015

### ChrisVer

I am thinking something like the following:
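Code (Python):

import random

event = 0
number = 0
lista = []     # the generated numbers
occuranc = []  # occuranc[n] = how many times n appears in lista

# Draw 10000 integers with randrange(0, 100), i.e. from 0 to 99 inclusive
while event < 10000:
    x = random.randrange(0, 100)
    lista.append(x)
    event += 1

# Count how often each value 0..99 occurs
while number < 100:
    y = lista.count(number)
    occuranc.append(y)
    number += 1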

Here I used randrange to generate 10000 integers from 0 to 99 (randrange(0,100) excludes the upper bound), saved in the list "lista". In the second while loop I count how many times each of 0, 1, 2, ..., 99 appears among those 10000 numbers (the counts are saved in my "occuranc" list).
If the generator draws the numbers from a uniform distribution, the probability of occurrence for each number should be 1/100, so each number should appear 10000/100 = 100 times on average.
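As a rough guide to how much scatter around 100 is normal, each count is binomially distributed if the draws are independent and uniform. The sketch below only works out that expected count and its standard deviation; the variable names are made up for illustration.

```python
import math

n_draws = 10000   # how many numbers were generated
n_values = 100    # possible outcomes 0..99
p = 1 / n_values  # probability of each outcome under the uniform hypothesis

# Each count follows Binomial(n_draws, p) under that hypothesis
expected_count = n_draws * p
std_count = math.sqrt(n_draws * p * (1 - p))

print("expected count per value:", expected_count)
print("standard deviation of a count:", round(std_count, 2))
```

So counts that differ from 100 by ten or so are nothing unusual on their own.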

So I am not sure how to extend the above code to perform a chi-squared test of this hypothesis. In other words, I would like to test for myself at what confidence level I can say that randrange is a uniform-distribution generator. The test should check whether a plot of occurrence vs. number is described by a $y=\text{const}$ relation or by something else (uniform pdf or not).

Last edited: Dec 24, 2015
5. Dec 24, 2015

### Staff: Mentor

You have your list of frequencies. Take the squared difference between each frequency and the theoretical value it should have (100), divide it by that theoretical value, and sum the results over all 100 numbers. The process is laid out here - https://en.wikipedia.org/wiki/Pearson's_chi-squared_test - in the definition section. (Apologies if you already know this...)

If you need help implementing this process in Python, give a holler.
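For anyone following along, here is a minimal sketch of that calculation using only the standard library. It is not the code from the thread: the function names, the fixed seed, and the use of the Wilson-Hilferty normal approximation for the p-value are all assumptions made for illustration. With SciPy installed, scipy.stats.chisquare should give an exact p-value instead.

```python
import math
import random


def chi_squared_statistic(observed, expected):
    """Pearson's statistic: sum over bins of (O_i - E_i)^2 / E_i."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def approx_p_value(chi2, dof):
    """Approximate upper-tail p-value (Wilson-Hilferty approximation).

    This is an approximation, reasonable for large dof such as 99.
    """
    spread = 2 / (9 * dof)
    z = ((chi2 / dof) ** (1 / 3) - (1 - spread)) / math.sqrt(spread)
    return 0.5 * math.erfc(z / math.sqrt(2))


n_draws, n_bins = 10000, 100
rng = random.Random(12345)  # arbitrary seed, only to make reruns repeatable

# Tally how often each value 0..99 comes up
counts = [0] * n_bins
for _ in range(n_draws):
    counts[rng.randrange(0, n_bins)] += 1

expected = [n_draws / n_bins] * n_bins
chi2 = chi_squared_statistic(counts, expected)
p_value = approx_p_value(chi2, n_bins - 1)  # 100 bins -> 99 degrees of freedom

print("chi-squared:", round(chi2, 2))
print("approximate p-value:", round(p_value, 3))
```

A small p-value (say below 0.05) would be evidence against uniformity; a single larger value only means this particular test found nothing wrong.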

6. Dec 24, 2015

### FactChecker

I would be surprised if any of the standard random number generators would not pass a Chi-squared goodness of fit test.
That being said, none of the "pseudo-random" random number generators are truly random. There is always a test that is sophisticated enough to determine that they are not random. Unless you are going to apply statistical methods like Box-Jenkins time series analysis, the standard random number generators will probably work fine for you. But if you are going to use those methods, you should test your random number generator with them first. Verify that the generator will look random when those methods are applied.

Last edited: Dec 24, 2015
7. Dec 24, 2015

### ChrisVer

I think I made it... and it passed the test with a p-value of ~0.26 in one run [the other runs give almost the same]. So yes, it passed the test. I was a bit stuck on how to implement the chi2 calculation in Python because I didn't have my coffee on the desk (a good excuse to avoid saying I was thinking dumb at that moment).

Ha, quite intriguing... I will probably try to make that pseudorandom number generator fail next.

Yup, that book contains the same test I am trying to apply, but what it actually tests there is whether, starting from a uniform distribution, the transformation to a non-uniform one works correctly (or so I understood). It was interesting reading, however, and the method is almost the same as mine (the "bucket method" with 100 bins is what I tried).

8. Dec 25, 2015

### FactChecker

If your work involves a number generator and the work requires you to use more sensitive statistical tests, you should make sure that the number generator is good enough. I think problems would be rare. Most real-world applications of techniques like the Box-Jenkins time series analysis are on real data, not generated data. Of course, examples for a class might be artificially generated. That is where I encountered a problem once, long ago. I generated artificial data for classwork and the number generator contained bad autocorrelations.
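To illustrate why a frequency test alone cannot catch autocorrelation problems, here is a small sketch (my own toy example, not the generator from that class). It compares the lag-1 sample autocorrelation of randrange output with that of a deliberately bad sequence that just cycles 0, 1, ..., 99. The function name and the seed are assumptions for illustration.

```python
import random


def lag_autocorrelation(values, lag=1):
    """Sample autocorrelation of `values` at the given lag."""
    n = len(values)
    mean = sum(values) / n
    variance_sum = sum((v - mean) ** 2 for v in values)
    covariance_sum = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    )
    return covariance_sum / variance_sum


rng = random.Random(2015)  # arbitrary seed, only to make reruns repeatable
draws = [rng.randrange(0, 100) for _ in range(10000)]

# A deliberately bad "generator": 0, 1, ..., 99 repeated over and over.
# Every value appears exactly 100 times, so its frequency chi-squared is 0.
cycle = [i % 100 for i in range(10000)]

r_draws = lag_autocorrelation(draws)
r_cycle = lag_autocorrelation(cycle)

print("lag-1 autocorrelation, randrange:", round(r_draws, 4))
print("lag-1 autocorrelation, cycling sequence:", round(r_cycle, 4))
```

The cycling sequence has perfectly flat frequencies, yet each value is almost fully determined by the previous one, which is what a serial test is meant to expose.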
