How to compute the probability that a given set of numbers is random

In summary, Randomness can be checked by evaluating the output of a process and comparing it to a uniform distribution. However, there is always a small probability that the process may not be truly random or that the chosen distribution is incorrect. One way to determine the probability of a random process is to consider an alternate hypothesis and use statistical tests to gather evidence for or against it. It is also important to consider prior probabilities in making this determination.
  • #1
mrspeedybob
869
65
Suppose you have a set of digits, for the sake of simplicity we'll make them binary, how would you determine the probability that the set is random? For example, given the following 3 strings of numbers...

1111111111111111111111111111111111111111111111111111111111111111
0010010000111111011010101000100010000101101000110000100011010011
1101011111001000000000011101111011111010010101011111101010110110



The first appears non random while the second and third appear random. In reality, only the third actually is random. Given a truly random string of 1s and 0s all 3 are equally likely. So If I didn't know the origin of the numbers how would I compute a probability that they were generated randomly?

For the sake of simplicity assume that if they are random then the probability of 1 is the same as the probability of 0 and that each digit selected without consideration of any other digit.
 
Physics news on Phys.org
  • #2
mrspeedybob said:
Suppose you have a set of digits, for the sake of simplicity we'll make them binary, how would you determine the probability that the set is random? For example, given the following 3 strings of numbers...

1111111111111111111111111111111111111111111111111111111111111111
0010010000111111011010101000100010000101101000110000100011010011
1101011111001000000000011101111011111010010101011111101010110110
The first appears non random while the second and third appear random. In reality, only the third actually is random. Given a truly random string of 1s and 0s all 3 are equally likely. So If I didn't know the origin of the numbers how would I compute a probability that they were generated randomly?

For the sake of simplicity assume that if they are random then the probability of 1 is the same as the probability of 0 and that each digit selected without consideration of any other digit.

You can't always know. It really depends on how the sequence is generated. The decimal expansion of an irrational number appears random, but is fully determined by an algorithm. A truly random sequence cannot generated by an algorithm unless it includes a randomizing process. It is currently accepted that only quantum processes are truly random, but other processes are effectively random in that no algorithm can be specified.

Randomness can be checked by evaluating the output of a process. Random processes can often be described by their distributions. So if for a string of digits such that every digit is equally likely, then the process should converge to a uniform distribution. However there is always a vanishingly small probability that it may not or that your choice of a distribution is wrong.
 
Last edited:
  • #3
The "probability that the sequence came from some distribution" actually isn't well-defined. In some cases the "probability that distribution D *would* generate the sequence" is a good proxy, but that won't help in this case. This is because the distribution you're interested in is, by definition, the distribution for which all sequences have the same probability.

If you are willing to assume that the dependencies are the same for each digit (if digit 3 has some dependency on digit 1, then digit n+3 has that same dependency on digit n+1), then each subsequence of n digits (for all n) is identically distributed (but not necessarily independent). You can check whether the proportion of 1s is close to 1/2, and you can furthermore check whether the proportion of each 2-character sequence is close to 1/4.

In general, (subject to the assumption above) if the proportion of each n-subsequence is close to 2-n, (with additional weight given to smaller n, since we have a larger sample of them) then it is likely to be random.

This is not a formal procedure, but could form the basis of a formal hypothesis test (and if indeed it works, this has probably been done before).
 
  • #4
mrspeedybob said:
So If I didn't know the origin of the numbers how would I compute a probability that they were generated randomly?
You can't.

What you can do is consider an alternate hypothesis, and do a statistical test to obtain evidence for or against the random generation hypothesis vs the alternate hypothesis.

Ideally, you'd want to involve prior probabilities too -- e.g. if you have very high confidence that a particular source is trustworthy and claims a sequence is random, that should affect whatever inference you make after gaining evidence from the statistical test.


Another thing you can do is to decide upon a statistical test before looking at the sequences (e.g. these tests), and then use the test to obtain a level of "confidence". In general this isn't so great, but it's a decent heuristic if you have nothing else to do.
 
  • #5



There are several methods that can be used to compute the probability that a given set of numbers is random. One approach is to use the concept of entropy, which measures the randomness or uncertainty in a system. In this case, we can calculate the Shannon entropy of the set of numbers, which is a measure of the average amount of information contained in each digit. A high entropy value indicates a high level of randomness, while a low entropy value suggests a more predictable pattern.

Another method is to use statistical tests such as the chi-square test or the Kolmogorov-Smirnov test. These tests can determine the likelihood that the observed pattern in the set of numbers is due to chance or randomness. If the p-value obtained from the test is below a predetermined threshold (usually 0.05), then we can reject the null hypothesis that the set of numbers is random.

Additionally, we can also use algorithms such as the Monobit test or the Runs test, which analyze the frequency of 1s and 0s in the set of numbers and compare it to the expected frequency in a random sequence. If the observed frequency is significantly different from the expected frequency, then the set of numbers is likely not random.

In the case of binary numbers, where the probability of 1 and 0 is assumed to be equal, we can also compute the probability of obtaining a specific sequence of numbers by chance using the binomial distribution. This approach requires knowing the length of the sequence and the probability of success (1 or 0) for each digit. The resulting probability can then be compared to a predetermined threshold to determine if the set of numbers is random or not.

In summary, there are various methods that can be used to compute the probability that a given set of numbers is random. These methods rely on different statistical techniques and assumptions, and the most appropriate one to use may depend on the specific characteristics of the set of numbers being analyzed.
 

1. What is the definition of a random set of numbers?

A random set of numbers is a sequence of numbers that has no predictable pattern or order. Each number in the set has an equal chance of occurring and the order in which they appear is determined by chance.

2. How can I determine the probability that a set of numbers is random?

The probability of a set of numbers being random can be computed by dividing the number of possible random outcomes by the total number of possible outcomes. For example, if we have a set of 10 numbers and each number can range from 1 to 10, the total number of possible outcomes is 10 to the power of 10 (10^10). The number of possible random outcomes would be 10! (10 factorial), which is the number of ways the numbers can be arranged in a random order. Therefore, the probability of the set being random is 10!/10^10 = 1/10^9 or 0.000000001.

3. Can a set of numbers be partially random?

No, a set of numbers is either completely random or not random at all. If there is any predictable pattern or order to the numbers, it cannot be considered random.

4. Is there a formula for computing the probability of a set of numbers being random?

Yes, the formula for computing the probability of a set of numbers being random is P = n!/N^n, where P is the probability, n is the number of numbers in the set, and N is the range of possible values for each number.

5. Are there any tools or software available for computing the probability of a set of numbers being random?

Yes, there are various statistical software programs and online calculators that can compute the probability of a set of numbers being random. Some popular options include R, MATLAB, and Wolfram Alpha. These tools use advanced algorithms and statistical methods to determine the probability based on the given set of numbers.

Similar threads

  • Set Theory, Logic, Probability, Statistics
Replies
11
Views
478
  • Set Theory, Logic, Probability, Statistics
Replies
6
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
15
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
7
Views
2K
  • Set Theory, Logic, Probability, Statistics
Replies
7
Views
309
  • Set Theory, Logic, Probability, Statistics
Replies
2
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
6
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
2
Views
922
  • Set Theory, Logic, Probability, Statistics
Replies
20
Views
1K
  • Set Theory, Logic, Probability, Statistics
Replies
5
Views
1K
Back
Top