Coto said:
This thread piqued my interest because of DrRocket's reply. I was wondering if you might go into more depth about the fallacy of applying probability theory to the digits of pi?
In the pseudo-random sense, can we still not say something about statistical distributions of the numbers? Wouldn't this be the surest way to test whether something is pseudo-random? If this is the case, why can't we say something about the probability of pi attaining the same digit multiple times?
If we make the assumption that pi is a normal number can we say something about whether multiple instances of a digit will appear? Is there a correct way to apply a probability to this? If not, perhaps you could give an analogy to clear up why it is not possible.
Thanks,
Coto
The problem with testing for "pseudo-randomness" is that there is no definition for "pseudo-random" and hence no test. Pseudo-random is a term that means roughly "we would like to call it random, but we have no definition for random, so we'll use pseudo-random instead".
Here is the basic problem. The term "random" comes up in probability theory. Probability theory starts with the hypothesis that one has a set, a sigma algebra of subsets, and a positive measure defined on the sigma algebra that assigns the entire space measure 1 -- this entire thing being called a probability space. A random variable is then just a measurable function on the probability space. Thus, following this formulation of Kolmogorov, probability theory is just a branch of the general theory of measure and integration.
So, probability theory starts with a probability space. It assumes that you have the probability measure defined, and it anoints certain functions as "random" by virtue of their measurability (in the sense of abstract measure theory, which has nothing to do with being measurable in any physical sense using any sort of instruments).
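For reference, the setup just described can be written out in symbols. The notation (Omega for the set, F for the sigma algebra, P for the measure, X for a random variable) is the usual textbook choice and is my assumption here, not something fixed by the discussion above; P is also required to be countably additive, as any measure is.

```latex
\[
(\Omega, \mathcal{F}, P), \qquad P : \mathcal{F} \to [0,1], \qquad P(\Omega) = 1,
\]
\[
X : \Omega \to \mathbb{R} \ \text{is a random variable}
\iff
X^{-1}(B) \in \mathcal{F} \ \text{for every Borel set } B \subseteq \mathbb{R}.
\]
```

Nothing in these two lines refers to unpredictability; "random" here is only a name for measurability.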
Pseudo-random is even more loosely defined. It is essentially meaningless mathematically and means generally "unpredictable" in some not-very-well-defined sense. A fairly typical pseudo-random generator starts calculating the digits of pi and assumes that this generates a sample path for a random variable of integers uniformly distributed on the interval [0, 9]. There is no theoretical basis for this, but it more or less works in practice (since there is no way to actually check it, there is no way to really dispute it either). What you get, while it looks "random", whatever that means, is also completely deterministic, and the algorithm will generate exactly the same "random sequence" of integers every time you run it.
Probability theory has some fairly deep theorems, notably the central limit theorem and the law of large numbers. But to apply those theorems you need to first fulfill the hypotheses, and that requires that one have a probability space. The digits of pi do not provide any sort of probability space on which to apply the basic theory. Thus you cannot reach conclusions by applying theorems whose hypotheses are unfulfilled. It is just that simple.
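To make the determinism concrete, here is a minimal Python sketch. It produces the decimal digits of pi with a streaming spigot algorithm (the one usually attributed to Gibbons; that choice is mine, not something specified above), runs it twice, and tallies the digits. The names `pi_digits` and `first_digits` are made up for this illustration.

```python
from collections import Counter
from itertools import islice


def pi_digits():
    """Yield the decimal digits of pi (3, 1, 4, 1, 5, ...) one at a time.

    Streaming spigot algorithm using exact integer arithmetic only.
    """
    q, r, t, k, n, l = 1, 0, 1, 1, 3, 3
    while True:
        if 4 * q + r - t < n * t:
            # The next digit is settled; emit it and rescale the state.
            yield n
            q, r, t, k, n, l = (
                10 * q, 10 * (r - n * t), t, k,
                (10 * (3 * q + r)) // t - 10 * n, l,
            )
        else:
            # Not enough information yet; absorb one more term of the series.
            q, r, t, k, n, l = (
                q * k, (2 * q + r) * l, t * l, k + 1,
                (q * (7 * k + 2) + r * l) // (t * l), l + 2,
            )


def first_digits(count):
    """Return the first `count` digits of pi as a list of ints."""
    return list(islice(pi_digits(), count))


run_a = first_digits(1000)
run_b = first_digits(1000)

print(run_a[:10])                     # [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
print(run_a == run_b)                 # True: the same "random sequence" every run
print(sorted(Counter(run_a).items())) # how often each digit occurs in this prefix
```

The tally in the last line is a fact about the first 1000 digits and nothing more. It is not a sample from any probability space, so it neither establishes nor refutes a claim about a "distribution" of the digits.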
There are no theorems that apply to pseudo-random variables, simply because there is no useful definition for the critters. You have something that is not quite a random variable so the best that you can do is apply something that is not quite a theorem -- in short you are whistling in the dark.
Now, you might be able to formulate some sort of probability question that intuitively relates to some similar question about pi. But that will not directly answer the questions one would like to answer about pi specifically.
Sometimes a suggestive probabilistic problem is formulated and answered along the road to answering the question that everyone would like to have answered. That is sometimes useful and sometimes not. Such things have been tried with the Riemann Hypothesis for instance, but don't receive a lot of press because they really don't answer the basic question.
The problem is that just because something is not known does not imply that one can apply the methods of probability to answering it. Some questions are simply not probabilistic in nature. It may be that 10% of the people in the U.S. are named "Bob" but that does not mean that there is a 10% probability that your name is Bob just because I don't know your name. Your name is either Bob or it is not. There is nothing random about it. What is true is that if I select a person at random from the population then I can expect to find a Bob about 1 time in 10 (on average), but that has nothing whatever to do with what your specific name is.
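The distinction can be sketched in a few lines of Python. Everything here is hypothetical: a made-up population of 1000 people with exactly 100 Bobs, a made-up function name `bob_frequency`, and Python's `random` module standing in for "select a person at random" (it is itself a deterministic generator, so take this as an illustration only, not as a demonstration of anything).

```python
import random


def bob_frequency(num_draws, seed=0):
    """Fraction of draws that land on a Bob when sampling the population."""
    # Hypothetical population: 1000 people, exactly 100 of them named Bob (10%).
    population = ["Bob"] * 100 + ["not Bob"] * 900
    rng = random.Random(seed)  # seeded, so the run is reproducible
    hits = sum(1 for _ in range(num_draws) if rng.choice(population) == "Bob")
    return hits / num_draws


# The sampling experiment: repeated selection from the population.
print(bob_frequency(10_000))   # a number near 0.1

# One specific person: a fixed fact, no sampling involved.
your_name = "Alice"            # hypothetical; whatever it is, it is already settled
print(your_name == "Bob")      # False
```

The first number describes the selection procedure. The second line is simply true or false, and no amount of sampling changes it.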
Similarly the digits of pi are whatever the digits of pi are and probability has nothing whatever to do with it.
A normal number has certain properties, by definition, concerning the occurrence of digits as the number of digits becomes large. So, if you assume that pi is normal then, by assumption, the conclusions that would apply to any normal number will apply to pi. The rub is that you have no idea if your assumption is correct or not, and hence no idea whether the conclusions are valid or not.
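For concreteness, the usual definition of normality in base 10 can be written as follows. The symbols are my own choice of notation: x is the number, w is a finite string of k digits, and N_w(x, n) is the number of times w occurs among the first n digits of x.

```latex
\[
\lim_{n \to \infty} \frac{N_w(x, n)}{n} = 10^{-k}
\qquad \text{for every } k \ge 1 \text{ and every digit string } w \in \{0, 1, \dots, 9\}^k .
\]
```

Taking k = 1 says each digit has limiting frequency 1/10, so under the assumption every digit, and indeed every finite block of digits, occurs infinitely often. Note that this is a statement about limits of counts, not about probabilities.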