Testing how random my sample is

  • Context: Undergrad 
  • Thread starter: fluidistic
  • Tags: Random, Testing

Discussion Overview

The discussion revolves around testing the randomness of a sample of natural numbers, which is expected to follow a Gaussian distribution. Participants explore various statistical tests and methods to assess the randomness of the sample, including both visual and quantitative approaches.

Discussion Character

  • Exploratory
  • Technical explanation
  • Debate/contested

Main Points Raised

  • One participant suggests performing a Gaussian fit and comparing residuals with true random numbers, but later questions the utility of this approach for assessing randomness.
  • Another participant recommends plotting the data on Gaussian probability paper to visually assess how closely it follows a Gaussian distribution.
  • Quantitative tests are proposed, including computing skew and kurtosis and checking whether the mean, mode, and median coincide, as they do for a normal distribution.
  • A participant mentions using the Shapiro-Wilk test, reporting a W statistic above 0.99 and a p-value near 0.27, so normality is not rejected, but expresses uncertainty about its implications for randomness.
  • One participant asks for clarification on whether the original distribution is known to be normal and suggests the runs test as a non-parametric method for assessing randomness.
  • A later reply reports that a similar test run in R gave a p-value above 0.4, so the null hypothesis of randomness was not rejected.

Areas of Agreement / Disagreement

Participants present multiple competing views on how to assess randomness, with no consensus on a single method. Some methods are proposed, but the effectiveness and relevance of these methods remain debated.

Contextual Notes

Participants express uncertainty about the implications of certain tests for randomness, and there are limitations in the assumptions made regarding the original distribution and the nature of the sample.

Who May Find This Useful

Individuals interested in statistical methods for assessing randomness, particularly in the context of Gaussian distributions, may find this discussion relevant.

fluidistic
Hello guys,
I have a sample of about 400 natural numbers though I can get more numbers. To give you an idea the mean and the standard deviation are 29038031 and 1842882 respectively and I expect the numbers to follow a Gaussian distribution. I'd like to perform a test to tell me the probability that my sample is truly random. I just don't know which test to perform. I've read about diehard tests but I don't see how I could apply them.

So I'd like to hear some suggestions. Thanks!

Edit: 1st idea that I have: get more numbers. Then perform a Gaussian fit and calculate the residuals. Do the same for true random numbers following a Gaussian with the same mean and standard deviation and compare the residuals. I expect lower residuals with the true random numbers.

Edit2: Never mind, this idea would be useless. It would tell me how far from a Gaussian my distribution of numbers is, not how random they are...
 
A simple and intuitive way is to plot your histogrammed data on Gaussian "probability paper," so named from the days when plots were made on actual graph paper. You'll see visually how close to a Gaussian you are.
http://en.wikipedia.org/wiki/Normal_probability_plot

Other simple, but quantitative, tests include:
a) Compute the skew and kurtosis and see how close to Gaussian they are (normal values are 0 and 3).
b) The mean, mode and median are all equal for a normal distribution.

More sophisticated tests abound. See, e.g.,
http://en.wikipedia.org/wiki/Normality_test
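
The checks above can be sketched in a few lines. This is only an illustration using Python's standard library: the data are simulated (the seed and the helper names `moment_summary` and `probability_plot_corr` are made up for this example), and the plotting positions (i - 0.5)/n are one common convention, not the only one. A correlation close to 1 between the sorted data and the Gaussian quantiles corresponds to a nearly straight line on probability paper.

```python
import math
import random
from statistics import NormalDist, mean, median, pstdev


def moment_summary(xs):
    """Return (skewness, kurtosis) computed from population moments."""
    m = mean(xs)
    s = pstdev(xs)
    n = len(xs)
    skew = sum(((x - m) / s) ** 3 for x in xs) / n  # Gaussian value: 0
    kurt = sum(((x - m) / s) ** 4 for x in xs) / n  # Gaussian value: 3
    return skew, kurt


def probability_plot_corr(xs):
    """Correlation between the sorted data and standard Gaussian quantiles."""
    n = len(xs)
    ordered = sorted(xs)
    # Plotting positions (i - 0.5) / n are an assumed convention.
    quantiles = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    mx, mq = mean(ordered), mean(quantiles)
    num = sum((a - mx) * (b - mq) for a, b in zip(ordered, quantiles))
    den = math.sqrt(
        sum((a - mx) ** 2 for a in ordered)
        * sum((b - mq) ** 2 for b in quantiles)
    )
    return num / den


# Simulated stand-in for the real sample (same mean and standard deviation
# as quoted in the first post; the actual data are not available here).
rng = random.Random(0)
sample = [rng.gauss(29038031, 1842882) for _ in range(400)]

skew, kurt = moment_summary(sample)
r = probability_plot_corr(sample)
print(f"skew = {skew:.3f}, kurtosis = {kurt:.3f}")
print(f"mean - median = {mean(sample) - median(sample):.1f}")
print(f"probability-plot correlation = {r:.4f}")
```

Note that all of these quantities describe the shape of the distribution only; none of them looks at the order in which the numbers were produced.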
 
marcusl said:
A simple and intuitive way is to plot your histogrammed data on Gaussian "probability paper" [...] More sophisticated tests abound.
I've just used the software Maxima to run a Shapiro-Wilk test to check whether my data follow a Gaussian, and the result is consistent with that: it returned a W statistic of over 0.99 with a p-value near 0.27, so normality is not rejected.
The thing is that I am not sure that this is telling me anything about the randomness of my numbers which is what I'm looking for.
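
A small sketch of why that doubt is justified, assuming simulated data and a hypothetical helper `lag1_autocorr`: a normality test such as Shapiro-Wilk depends only on the set of values, not on their order, so the same 400 numbers sorted in increasing order would give exactly the same W and p-value even though their sequence is obviously not random. A statistic that looks at neighbouring values, such as the lag-1 autocorrelation, does see the difference.

```python
import random
from statistics import mean


def lag1_autocorr(xs):
    """Lag-1 sample autocorrelation: near 0 when the order carries no structure."""
    m = mean(xs)
    num = sum((a - m) * (b - m) for a, b in zip(xs, xs[1:]))
    den = sum((x - m) ** 2 for x in xs)
    return num / den


# Simulated stand-in for the real sample (assumption: the real data are not shown).
rng = random.Random(1)
shuffled = [rng.gauss(29038031, 1842882) for _ in range(400)]
ordered = sorted(shuffled)  # same values, deliberately non-random order

# Both sequences have the same histogram, so any order-blind normality
# test would treat them identically.
print(sorted(shuffled) == sorted(ordered))
print(f"lag-1 autocorrelation, original order: {lag1_autocorr(shuffled):.3f}")
print(f"lag-1 autocorrelation, sorted order:   {lag1_autocorr(ordered):.3f}")
```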
 
WWGD said:
I am not sure I understood; do you know that the original distribution is normal and then you want to know if the sample is random? Have you tried the runs test?

https://home.ubalt.edu/ntsbarsh/business-stat/opre504.htm#rrunstest

And a good thing is that the test is non-parametric.
Thanks a lot! That's exactly what I was looking for, I'll try tomorrow.
Meanwhile I tried a very similar test in R, and it did not reject randomness: the p-value was over 0.4, where the null hypothesis is that the data are random and the alternative is non-randomness.
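
For reference, a minimal sketch of a runs test about the median (the Wald-Wolfowitz form), written with the standard library only. The function name `runs_test` and the simulated data are assumptions made for this example, and the p-value uses the large-sample normal approximation, which seems reasonable for about 400 points. A large p-value means only that the test fails to reject a random ordering, not that the data are random with that probability.

```python
import math
import random
from statistics import NormalDist, median


def runs_test(xs):
    """Runs test about the median; returns (runs, z, two-sided p-value)."""
    med = median(xs)
    # Classify each value as above/below the median, dropping exact ties.
    signs = [x > med for x in xs if x != med]
    n1 = sum(signs)            # number of values above the median
    n2 = len(signs) - n1       # number of values below the median
    n = n1 + n2
    # A new run starts whenever consecutive values switch sides.
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    # Mean and variance of the run count under the null of random order.
    mu = 2 * n1 * n2 / n + 1
    var = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))
    z = (runs - mu) / math.sqrt(var)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return runs, z, p


# Simulated stand-in for the real sample (the actual data are not available here).
rng = random.Random(3)
sample = [rng.gauss(29038031, 1842882) for _ in range(400)]

print("random order:", runs_test(sample))
print("sorted order:", runs_test(sorted(sample)))  # only 2 runs, p-value near 0
```

Too few runs points to trends or clustering, while too many points to systematic alternation; both show up as a small two-sided p-value.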
 
Glad I could help.
 
