# Recreational Maths Problem.

• uart

#### uart

Heres the basic problem definition.You have a list of n items indexed 1..n, and you can navigate through the list in two different ways. One in a linear fashion by stepping "previous" or "next" or alternatively in a random fashion where the next item is chosen as a uniformly distributed (1..n) random number. Now starting from some random position you need to get to position number one (top) of the list. If the starting point is not already close to position-one then you're best off to make random steps until you're somewhere close to the top and then switch to the linear stepping to finish off. So the question is, where exactly is the optimal point to change strategy. That is, how close item number-one should you attempt to get using "random" steps before it's optimal to change to linear steps.

BTW. Assume that the approach to unity is unidirectional. That is, that the list does not "wrap around".

This problem actually came up a fews days ago when I was lying on the lounge listening to mp3's on my stereo. I was listening at about song 700 in a huge playlist of about 1400 songs when I suddenly decided I wanted to seek to the top of the list (where I'd recently added some new stuff that I now wanted to listen to). Now here's the embarrassing part, I was too lazy to get up and use the keyboard/mouse so I decide to do it using just the remote controller, which can do only "prev", "next" or "random". I thought here's an interesting maths problem. :)

I've solved it but my solution is a little bit heuristic, meaning that I think it's probably near optimal but my derivation is not exactly rigorous enough to be sure it really is 100% optimal. I was wondering what solutions others might make of this. The solution should be in the form of a function (possibly implicit, as in the solution to a given equation) giving the optimal threshold (position in the list) at which to change strategy as a function of the list size n.

Any takers?

Last edited:
Cool problem (love the inspiration :tongue2:); it actually seems pushing the boundaries of recreational math -- doesn't it feel like that's close to something useful?

"Clearly", the longer the list the worse a jump is at a given position from the target. If you won't jump at position 3 with length 3, you won't jump there if the list is longer than 3 (it only gets worse).

If you have a list of length 2, either you're done or a step will make you done; jumping won't help. If you have a list of length 3, and you're at position 3, random jumps are no better (on average) than steps, so you might as well step. So you never jump at less than position 4.

If you have a list of length 4, and you're at position 4, a random jump can put you 0, 1, or 2, steps away (since positions 1-3 never jump), or it can leave you in the same position. Expected number of turns:
$$t_4=\frac14(t_1+t_2+t_3+t_4)+1=\frac14(0+1+2+t_4)+1=\frac74+\frac14t_4$$
so $\frac34t_4=\frac74$ and thus $t_4=\frac73$. This is better than taking 3 steps to finish, so we jump at position 4 and step below that.

As you increase the length of the list, the position at which jumping is the best option grows, though probably fairly slowly. This shouldn't be too hard for a computer to solve optimally up to the millions, since the strategy is just a single (small) integer, below which stepping is better than jumping. The jumping positions don't need to be considered individually, since they are all treated identically.

That's interesting CRG. You approached the problem from the opposite angle that I did. I was looking more for an asymptotic solution using a "large n" assumption.

My very first instinct was to look at the expected value of the next position in each case (linear versus random) but it immediately became obvious that considering just one step at a time was way too restrictive for the random seek case. I then decided to look at it from the following perspective. Given some initial starting point, does there exist some number of steps for which the random seeking method has a good chance of, at some point during one of those steps, producing a lower number than the final seek value of the linear method.

That is, given an initial starting position of "m" in the list of size "n" we know that the final position after some number (x) of linear steps is m-x. So how good is the probability of producing a result of less then "m-x" at some point during "x' random seeks?

The probability of not getting any values less then m-x during "x" random seeks is (1-(m-x)/n)^x so it seemed reasonable (though a little heuristic) to set the threshold value for m (the point beyond which we change strategy) to be where the minimum value of (1-(m-x)/n)^x is less than 1/2. Asymptotically this function is minimised when x = m/2 so in my first attempt at this problem I decided that the best point to switch was the value of m that solved (1-x/n)^x = 0.5, with m=2x. This equation is transcendental but I was able to get a good asymptotic solution (using the usual approximations for small numbers like 1/(1-a) ~= 1+a and log(1+a) ~= a etc) of,

$$m \simeq 2 \sqrt{\log(2)} \, \sqrt{x} \simeq 1.67 \, \sqrt{n}$$.

I've since found a better solution that is less heuristic and actually more straight forward. You can show that the expected value of the minimum in a set of k independent uniform(0..1) continuous random variables is1/(k+1). So for large n the expected value of the minimum value of k discrete uniform(1..n) random variables will be approx n/(k+1). From this we can form an inequality that relates the expected best value during a set of "x" random steps versus the final value after "x" linear steps. That is,

n/(x+1) < m-x

x^2 - (m-1)x + n-m < 0

The smallest value of m for which we have a solution is when 4(n-m) = (m-1)^2 which has the asymptotic solution of :

$$m \simeq 2 \sqrt{n}$$

Last edited:
Ok I feel a bit dumb about that long winded derivation now that I've finally thought about it and got a nice clean result.

The thing I was missing was the expected length of a sequence of uniform (discrete 1..n) random variables starting at a random point and ending at "m" or less. Initially I wasn't sure about this but I've checked and the expected value of this sequence length is just n/m.

So if we set the threshold for swapping to linear steps at some number "m" then the total expected sequence length is just m-1 + n/m. (That is, n/m for the random part and then m-1 linear steps to get from m back to one.)

So all that's needed is to minimize f(m) = m - 1 + n/m which is pretty trivial and results in :

$$m= \sqrt{n}$$

Finally this is a result that I trust as being correct.

EDIT : ARRRRRHHHH GRRRRRRR that's not correct either. The expected sequence length to get to "m" or less is n/m but we don't necessarily end right on "m" which I seem to have assumed in the above. It seems a better assumption would be that if we set the threshold to stop the random seeks at "m" then we would actually expect the final value to be more like m/2 (because the random stopping point is anything m or less). So now I'm thinking that I should be minimizing something like f(m)=m/2 + n/m which would result in :

$$m = \sqrt{2} \, \sqrt{n}$$

Last edited:
I pressed my "small value" approach further, and I have a winner. (My 'proof' is scrawled across one piece and several scraps of paper; I'll just outline here.)

Suppose there are n different values and that at x or below, you simply step toward the solution; above x, you jump. The expected number of turns needed at one of the 'jump' points is
$$V(n,x)=\frac{x-1}{2}+\frac nx$$

Some basic calculations give the expected number of jumps for a given n and x as
$$E(n,x)=\frac{x-3}{2}+\frac nx$$
(unusually nice cancellation!)

Thus
$$\frac{\del}{\del x}E(n,x)=\frac12-\frac{n}{x^2}=0$$
and so the 'analytic' solution is
$$x*=\sqrt{2n}$$

Some annoying calculations show that, as expected,
$$x=\lfloor\sqrt{2n}+1/2\rfloor$$
is generally the best integer choice. The 'chance' that this is wrong (for large n) is bounded above by $1/\sqrt{32n}$; if it is wrong, then the best discrete choice for x is instead $$x=\lfloor\sqrt{2n}+1\rfloor.$$

If I was a little more awake, I'd try to check this solution numerically and justify my steps. But I'm exhausted, so I shan't. ;) I do encourage you to double-check me; I may have made some silly mistakes. The examples I worked out (2, 3, 4, and 5) matched.

Last edited:
ARRRRRHHHH GRRRRRRR that's not correct either. The expected sequence length to get to "m" or less is n/m but we don't necessarily end right on "m" which I seem to have assumed in the above. It seems a better assumption would be that if we set the threshold to stop the random seeks at "m" then we would actually expect the final value to be more like m/2 (because the random stopping point is anything m or less). So now I'm thinking that I should be minimizing something like f(m)=m/2 + n/m which would result in :

$$m = \sqrt{2} \, \sqrt{n}$$

Heh, bingo. Now our answers match.

I haven't read over CRGreathouse's proof, so I can't comment on that, but:

uart: I'm really tired right now, so I can't explain my logic, but you're trying to find for what value of m
$$\lim_{x \to \infty}(1-\frac{m-x}{n})^x < 1/2$$ is true
But shouldn't you instead be trying to find m for this:
$$\lim_{x \to \infty}(x-\frac{m-x}{n})^x < 1/2$$

umm maybe I've misunderstood something but you initially stated that the jump distance is randomly chosen from (1,2,...,n). in which case the jump distance is always greater than or equal to 1 which makes it always better to jump randomly.

Heh, bingo. Now our answers match.
Yes I finally cracked it (after an embarrassingly large number of failed attempts ).

I'm pretty confident that the sqrt(2n) solution is correct. I just wrote a really quick and dirty simulation to test it. It shows, for the case of list size n=1600, the mean total seek length versus the threshold "m", for m = 20 to 100 with the mean taken over 1000000 runs for each value of m. (See attachment).

#### Attachments

• seek.png
10.6 KB · Views: 385
umm maybe I've misunderstood something but you initially stated that the jump distance is randomly chosen from (1,2,...,n). in which case the jump distance is always greater than or equal to 1 which makes it always better to jump randomly.

Not if the random jump is likely to take you farther away from the desired point.

I love these "real world" math problems, and enjoyed skimming through this thread even if I didn't read it in every detail.

Yes I finally cracked it (after an embarrassingly large number of failed attempts ).

I'm pretty confident that the sqrt(2n) solution is correct. I just wrote a really quick and dirty simulation to test it. It shows, for the case of list size n=1600, the mean total seek length versus the threshold "m", for m = 20 to 100 with the mean taken over 1000000 runs for each value of m. (See attachment).

Yes, as in my post #5 the solution is almost always
$$x=\lfloor\sqrt{2n}+0.5\rfloor$$
with unusual exceptions of
$$x=\lfloor\sqrt{2n}+1\rfloor$$

Here is my (arguably easier solution):

Let the breaking poing be $$x*$$. The list has $$n$$ terms. Let the expected number of steps for a certain position $$x$$ be $$E(x)$$.

For $$x\leq x*$$, $$E(x)=x-1$$.
For every $$x>x*$$, $$E(x)$$ is constant (=$$e$$) (new position independent of old).

So for $$x>x*$$,
$$E(x) = e = \frac{1}{n}\sum_{x=1}^{x*}(x-1) + \frac{1}{n}\sum_{x*+1}^ne + 1= \frac{1}{2n}(x*+1)x* + \frac{1}{n}e(n-x*) + 1$$.

Solving for $$e$$ gives us:
$$e = \frac{x*(x*-1)}{2*x*} + \frac{n}{x*}$$.

This function has a minimum for $$x* = \sqrt{2n}$$.

Here is my (arguably easier solution):

Let the breaking poing be x*. The list has n terms. Let the expected number of steps for a certain position x be E(x).

For $$x\leq x^*$$, $$E(x)=x-1$$.
For every $$x>x*$$, $$E(x)$$ is constant (=$$e$$) (new position independent of old).

So for x>x*,
$$E(x) = e = \frac{1}{n}\sum_{x=1}^{x^*}(x-1) + \frac{1}{n}\sum_{x^*+1}^ne + 1= \frac{1}{2n}(x^*+1)x^* + \frac{1}{n}e(n-x^*) + 1$$.

Solving for e gives us:
$$e = \frac{x^*(x^*-1)}{2x^*} + \frac{n}{x^*}$$.

This function has a minimum for $$x^* = \sqrt{2n}$$.

Yes, that's the proof I used (almost down to the same symbols!).