# The statistics of 'psychic challenge'

1. Nov 21, 2016

### lavoisier

This is a problem that I thought I 'solved' many years ago.
In actual fact there are many things about it that are not clear to me, and I would like to hear your opinion, please.

Very briefly, there was this TV programme where a (supposedly psychic) guy had to match 5 (husband-wife) couples, obviously without knowing anything about them.
My interpretation was: if we have N distinct objects (say, the letters A, B, C, D, E) that must be placed in one specific order, how likely are we to place k = 0, 1, 2 ... N of them correctly by choosing the order randomly?

I was pretty sure the problem had long been studied, but I wanted to have some fun figuring it out for myself.
I found quite easily the probability to guess at least k specific couples correctly, regardless of what happens to the other couples (that's (N-k)!/N!, I believe, in this case with N=5).
However, the original problem was to find the probability P(k,N) to guess k (no matter which) couples and not the remaining N-k. Things got a bit tougher then, at least for my limited maths skills, and sums with factorials and alternating signs started appearing, but in the end I seemed to find a formula that accounted for the explicitly enumerated cases:

$P(k,N) = \frac{1}{k!} \sum_{i=0}^{N-k}\frac{(-1)^{i}}{i!}$

One nice feature of this result is that it correctly tells you that P(N-1,N)=0 (it's impossible to guess 4 couples right and not the 5th one). And it also tells you that for odd values of N, it is more likely to guess 1 couple right than 0 right, meaning perhaps you're more of a 'psychic' if you get them all wrong than if you guess 1 right(!).
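The formula can be checked against explicit enumeration. Below is a minimal sketch, assuming N is small enough to list all N! orderings; the function names `p_formula` and `p_bruteforce` are made up for this illustration. Orderings with no item in its correct place are usually called derangements, and the counts for exactly k correct items are known as rencontres numbers.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def p_formula(k, n):
    """P(k, N) from the formula above, as an exact fraction."""
    return Fraction(1, factorial(k)) * sum(
        Fraction((-1) ** i, factorial(i)) for i in range(n - k + 1)
    )


def p_bruteforce(k, n):
    """Fraction of the n! orderings with exactly k items in their correct place."""
    hits = sum(
        1
        for perm in permutations(range(n))
        if sum(pos == item for pos, item in enumerate(perm)) == k
    )
    return Fraction(hits, factorial(n))


# Compare formula and enumeration for the 5-couple case.
for k in range(6):
    print(k, p_formula(k, 5), p_bruteforce(k, 5))
```

For N = 5 the enumeration gives 44, 45, 20, 10, 0 and 1 orderings (out of 120) with k = 0, 1, 2, 3, 4 and 5 correct couples, which matches the formula, including P(4,5) = 0 and P(1,5) > P(0,5).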

Another nice thing about it (which I can't prove, but so far has worked numerically) is that it sums up to 1:

$\sum_{k=0}^{N}P(k,N) = 1$

As required, I believe, because the events for k = 0, 1, 2 ... N are mutually exclusive and, taken together, they constitute the whole set of possible outcomes of this experiment, so the probability that one of them happens must be 1.

What I wanted to know next (and I did this only a few days ago, years after finding the above formula) was the expected value, i.e. if we did a large number of trials, what would be the average number of correctly guessed couples?
Somewhat to my surprise, I found that, at least for the values of N I tried (2, 3, 4, 5, 10, 20), it's always 1:

$\sum_{k=0}^{N}k \cdot P(k,N) = 1$

as in: regardless of how many couples there are, on average you're likely to guess only 1 right if you choose at random.
I plotted the probabilities for the above numerical cases, and it seems that, after some oscillations for N=2, 3 and 4, already from N=5 the curve gets very close to a limiting case where the probabilities of k = 0 or 1 are always the highest and both very close to 0.36788 (if anything is special about this number, I don't know), and the rest of the curve .
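Both sums can be checked exactly rather than numerically. This is only a sketch using Python's `fractions` module to avoid rounding; the helper names `p` and `total_and_mean` are invented here, and a check for a list of N values is of course not a proof for all N.

```python
from fractions import Fraction
from math import factorial


def p(k, n):
    """P(k, N) from the formula above, as an exact fraction."""
    return Fraction(1, factorial(k)) * sum(
        Fraction((-1) ** i, factorial(i)) for i in range(n - k + 1)
    )


def total_and_mean(n):
    """Return (sum of P(k, n), sum of k * P(k, n)) over k = 0..n."""
    probs = [p(k, n) for k in range(n + 1)]
    return sum(probs), sum(k * pk for k, pk in enumerate(probs))


# The values of N tried in the post.
for n in (2, 3, 4, 5, 10, 20):
    print(n, total_and_mean(n))
```

Each line should show both the total probability and the expected value as exactly 1.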

Now, I'm no mathematician - it may be that all of the above is particularly obvious to an expert. But it's not to me.
So here are my questions/doubts:

1. when I found the formula I wanted to check if it was correct, but I could find no website describing it - does anybody know if this kind of problem has a particular name?
2. can the formula be reduced to a closed form?
3. is there a way to prove that the probabilities for k = 0 ... N sum up to 1, and that the expected value is independent of N and is also 1?
4. if we wanted to test whether a guy is a psychic (OK, nobody is, but bear with me), what statistical method would we use? Chi squared, based on the expected and observed number of successes in NT trials?
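On question 4, one possible approach (an assumption, not something settled in this thread) is an exact one-sided test: under the null hypothesis of random guessing, the total number of correct couples over several independent attempts has a distribution that can be computed by repeatedly convolving P(k,N) with itself. The sketch below does that; `match_distribution`, `total_distribution` and `p_value` are hypothetical names.

```python
from fractions import Fraction
from math import factorial


def match_distribution(n):
    """[P(0, n), ..., P(n, n)] under purely random guessing."""
    return [
        Fraction(1, factorial(k))
        * sum(Fraction((-1) ** i, factorial(i)) for i in range(n - k + 1))
        for k in range(n + 1)
    ]


def total_distribution(n, trials):
    """Distribution of the total number of matches over independent trials."""
    single = match_distribution(n)
    total = [Fraction(1)]  # zero trials: the total is 0 with certainty
    for _ in range(trials):
        new = [Fraction(0)] * (len(total) + n)
        for s, ps in enumerate(total):
            for k, pk in enumerate(single):
                new[s + k] += ps * pk
        total = new
    return total


def p_value(n, trials, observed_total):
    """Chance of at least observed_total matches in total by luck alone."""
    return sum(total_distribution(n, trials)[observed_total:])


# Hypothetical example: 10 attempts with 5 couples, 20 correct matches in total.
print(float(p_value(5, 10, 20)))
```

For example, `p_value(5, 1, 5)` is 1/120, the chance of matching all five couples in a single attempt by luck. Working with the exact distribution avoids relying on a chi-squared approximation when the expected counts for large k are very small.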

Thank you
L

2. Nov 21, 2016

### lavoisier

Just noticed, 0.36788 is close to 1/e !
Now I'm even more intrigued...

3. Nov 23, 2016

### jk22

Interesting problem. I tried it with Mathematica: it gives a closed-form expression for the sum of (-1)^n/n!, but it's a recursively defined function.

However $$\exp(-1)=\sum_{i=0}^\infty \frac{(-1)^i}{i!}$$ and it's a rapidly converging series.
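How quickly the series converges can be seen with a few lines of Python. This is just an illustration in floating point; `partial_sum` is a name chosen here.

```python
from math import exp, factorial


def partial_sum(m):
    """Sum of (-1)^i / i! for i = 0..m."""
    return sum((-1) ** i / factorial(i) for i in range(m + 1))


# Print each partial sum and its distance from exp(-1).
for m in (2, 3, 4, 5, 10):
    print(m, partial_sum(m), abs(partial_sum(m) - exp(-1)))
```

Since this is an alternating series with shrinking terms, the error after the term i = m is at most 1/(m+1)!, so already at m = 5 the partial sum is within 1/720 of 1/e.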

Last edited: Nov 23, 2016

4. Nov 24, 2016

### lavoisier

Thank you @jk22.
That does explain why the curves all look very similar for high values of N: the first two probabilities, k=0 and k=1, are close to 1/e, and the terms for k>=2 approach a value dominated by 1/k! and the first few terms of the sum.
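Written out, assuming k is held fixed while N grows, the limiting curve would be

$$\lim_{N\to\infty} P(k,N) = \frac{1}{k!}\sum_{i=0}^{\infty}\frac{(-1)^{i}}{i!} = \frac{e^{-1}}{k!}$$

which is the Poisson distribution with mean 1. For k = 0 and k = 1 this gives 1/e ≈ 0.36788, and for k = 2 it gives 1/(2e) ≈ 0.18394.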

P(k,10) is there but it's overlapping so well with P(k,20) that we don't see it!

Does Mathematica say anything about the expected value being 1?

5. Nov 24, 2016

### jk22

Mathematica can only check it numerically for a given value of N; it does not prove it for all N.
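For what it's worth, a proof for all N does not seem to need the formula at all. A sketch, where the indicator variables $X_j$ are notation introduced here: let $X_j = 1$ if couple j is matched correctly and $X_j = 0$ otherwise. A given couple is matched correctly with probability (N-1)!/N! = 1/N (the 'k specific couples' expression from the first post with k = 1), so

$$E\left[\sum_{j=1}^{N} X_j\right] = \sum_{j=1}^{N} E[X_j] = N \cdot \frac{1}{N} = 1$$

by linearity of expectation, which holds even though the $X_j$ are not independent of each other.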

6. Nov 25, 2016

### lavoisier

OK, thank you.
So it's not as obvious as it sounds.
Have a good weekend!