# Number of guesses in a game

1. Oct 15, 2014

### chingel

One number is chosen from 1 to n, n is large, and I want to guess that number. If I get to choose one number and then I'm told if the number is bigger or smaller, then the best strategy is to try to halve the range at every step and then the average number of guesses would be roughly $log_2(n)$.

But what if instead of cutting the set in half at each step, I would always try to cut it in pieces of 1/4 and 3/4, how many steps on average would I take? And furthermore, if I cut it into m pieces of some fixed ratios at each step, how many steps on average would I take?

If I cut them into m equal pieces at every step, the average guesses is about $log_m(n)$, but what about unequal pieces?

If it makes it easier to analyze, we can say that I don't pick numbers, I just split the sets and I get told in which set the number is and we can assume n is very large.

2. Oct 15, 2014

### jbriggs444

Information Theory deals nicely with this sort of question. http://en.wikipedia.org/wiki/Information_theory

If you get lucky with your 1/4 guess, that yields as much information as two 50/50 questions would have yielded. So with 0.25 probability you get 2 bits of information. But if you get unlucky, you have only reduced the search space by 25 percent. That is less than what one 50/50 question would have yielded -- less than one bit of information.

How much less? Well, if you guessed unlucky twice in a row, that would reduce the search space to 0.75 * 0.75 = 0.5625 That's still not as much as a single bit of information. If you guessed badly three times, that would be reduce it by 0.75 * 0.75 * 0.75 = 0.421875. That's more than a single bit of information. So it takes somewhere between 2 and 3 unlucky guesses to equal one bit of information.

You can extend this and reason that the reduction in search space by a series of n unlucky guesses is $0.75^n$ while the reduction in search space by a series of m 50/50 guesses is $0.50^m$. For the reductions to be equal then there would have to be a specific ratio between m and n -- each unlucky guess is worth $-log_2 0.75$ bits. Approximately 0.415. As expected, it takes between 2 and 3 such guesses to get a single bit of information.

The same formula works out for a lucky guess as well. The number of bits of information conveyed by a lucky guess is $-log_2 0.25$. That's 2 bits.

This is the basis of Claude Shannon's measure of information. The information conveyed, on average, by each guess is taken as the sum over all possible outcomes of the probability of the outcome multiplied by the negative log of that same probability. In the 25/75 case, that is $-0.25 log_2 0.25 + -0.75 log_2 0.75$ which comes to about 0.915 bits per guess.

The information in the answer is the $\frac{1}{n} log_2 \frac{1}{n}$ added up over all n different possibilities for a total that is simply $log_2 n$.

So the average number of guesses is $\frac{log_2 n}{0.915}$ with the 25/75 approach.

3. Oct 17, 2014

### chingel

Thanks for the reply, the information theory seems to address exactly this sort of question. I haven't read about the theory before and it definitely is new and interesting, and I'll be reading more about it to get a better sense of why is information content the logarithm and why can we just take the average of the logarithms. There seems to be plenty of information on the internet about it, I'll start digging in.

I take it that for splicing it into m sets of proportions $a_i$ at every step, the average number of guesses will be $\frac{-log(n)}{\sum a_i log(a_i)}$

4. Oct 17, 2014

### jbriggs444

I agree with that formula. Average number of guesses = total information gained divided by average information gained per guess.

5. Oct 25, 2014

### chingel

A similar way is to notice that the logarithm of the length of the set decreases linearly. So we want to get a total of $\log(n)$, step sizes are $-\log(a_i)$ with probability $a_i$. It does make intuitive sense that the average number of steps for large n would be the total divided by the average gained on each step, but is there a proof for that?