An Algorithm Challenge (finding the five fastest horses out of a large group)

Vanadium 50 · Jun 18, 2023

PeroK said:

That's because you can find one light item from 7 statistically faster than one light item from 6.

Not sure why 6 and 7 needed to be in LaTex, but...

At this point, you have a pile of coins you know not to be the one. So if 7 is faster than 6 (I haeven't checked) you can speed up a 6 by tossing in one of the normal coints into the pile. Now you have 7.

PeroK · Jun 19, 2023

Vanadium 50 said:

Not sure why 6 and 7 needed to be in LaTex, but...

At this point, you have a pile of coins you know not to be the one. So if 7 is faster than 6 (I haeven't checked) you can speed up a 6 by tossing in one of the normal coints into the pile. Now you have 7.

If there is an additional coin that we know is normal, then the algorithm can be improved for six coins. We put the known normal coin along with two others into one side of the balance. This gives a slighty greater probability (##\frac 5 6##) of the light coin being left out and identified in one measurement.

The expected number of measurements are: ##2##, ##\frac {13} 7## and ##\frac {11} 6## for six, seven and six plus one known normal coin respectively.

This technique, however, does not quite improve the algorithm enough for the case of 18 coins:

Using the basic 6-6 + 6 idea gives an expected ##3## measurements

Using the extra coin when we have 6 gives an expected ##\frac {17} 6## measurements.

But, the 7-7 + 4 technique is still better with an expected ##\frac {50}{18}## measurements.

pinball1970 · Jun 19, 2023

I'm looking at the horse one.

For the horse puzzle.

5 x groups of 5, A-E. Group A, horses 1-5, Group B, 6-10 and so on.

All 5 groups race, produce a winner who then race in a winner of winners (WOW) and produce the fastest horse (Group A, horse number 1 say), who is now eliminated.

I then want to put the number twos from the heats against the runners up of the WOW.

Problem I see is if I brute force it that way, I will end up with a lot of races where horses have already raced each other.

I have to go three deep in case heat A had the three fastest.
I would not know that till I put that horse in with WOW runners up.

Vanadium 50 · Jun 19, 2023

pinball1970 said:

5 x groups of 5, A-E. Group A, horses 1-5, Group B, 6-10 and so on.

Let's label it a little differently. In the Race of Winners, label the fastest horse A1, the next fastest B1, and so on. Now look at the prelininaries: label the second fastest horse in the race with A in it A2, the third fastest A3, and so on. Label the second fastest horse in the race with B in it B2, the third fastest B3, and so on.

There is nothing magical about this labeling; it's just convenient.

Can A1 be the 2nd fastest? No...it's the fastest. Can E5?

pinball1970 · Jun 19, 2023

I didnt read it or did not take it in. I am trying from scratch, its on the net too (I checcked)

pinball1970 · Jun 19, 2023

PeroK said:

That's the idea. You can reduce the number by (approx) a factor of 3 each time.

It's then a question of what criteria you use to assess your strategy. The maximum number of measurements can be calculated using powers of three (or the inverse process: log to the base ##3##).

If, instead, we look at the expected (average) number of measurements, then it gets suprisingly tricky. The example I gave above is that for ##18## items, you can do better than measuring ##6## against ##6##. That's because you can find one light item from ##7## statistically faster than one light item from ##6##. To be honest, I didn't spot that until I wrote a Python script to check all the possibilities!

A quick up date, I am looking at the horse one but I had an idea with this.

from 96 to 3 x32 then 2x10 and 12 followed by 3x4 and I am left with 4.

I then just weight two, I have a one in 4 chance of getting it 4 weighs and definitely can do it in 5.

Ibix · Jun 19, 2023

pinball1970 said:

Problem I see is if I brute force it that way, I will end up with a lot of races where horses have already raced each other.

Yup. Is there a horse in the WOW race that could be second fastest? If so, which horses do you already know are slower than it?

Frabjous · Jun 19, 2023

pinball1970 said:

I would not know that till I put that horse in with WOW runners up.

@Ibix solved it in post #2. The observation is that the WOW race ranking eliminates additional horses from the the initial races of the horses that did not win the WOW race. You are left with 5 horses for the final 2 slots in the 7th race.

pinball1970 · Jun 19, 2023

Vanadium 50 said:

Let's label it a little differently. In the Race of Winners, label the fastest horse A1, the next fastest B1, and so on. Now look at the prelininaries: label the second fastest horse in the race with A in it A2, the third fastest A3, and so on. Label the second fastest horse in the race with B in it B2, the third fastest B3, and so on.

There is nothing magical about this labeling; it's just convenient.

Can A1 be the 2nd fastest? No...it's the fastest. Can E5?

Ibix said:

Yup. Is there a horse in the WOW race that could be second fastest? If so, which horses do you already know are slower than it?

and @Vanadium 50

Only A2 and B1 can be number two. A1 cannot as it is fastest

Assume A2 wins that.

Now only B1, C1 or A3 can be 3rd.

Vanadium 50 · Jun 19, 2023

OK, you;re almost there. You have five lanes, and how many horses after identifying A1 as the fastest?

pinball1970 · Jun 19, 2023

Vanadium 50 said:

OK, you;re almost there. You have five lanes, and how many horses after identifying A1 as the fastest?

Damn it!

A2/A3 can 2 or 3, B1 also, C1, B2 or A3 can be 3rd.

One race of A2, A3, B1, B2 and C1 should give final positions.

I think!

Vanadium 50 · Jun 19, 2023

For extra credit, repeat this many times. Is E5 on average slower than A5? If so, how is it possible that the speed of two other horses in the race can possibly influence the speed of these two horses?

For extra extra credit, is B4 on average faster than A5?

pinball1970 · Jun 19, 2023

Vanadium 50 said:

For extra credit, repeat this many times. Is E5 on average slower than A5? If so, how is it possible that the speed of two other horses in the race can possibly influence the speed of these two horses?

For extra extra credit, is B4 on average faster than A5?

Do you mean once I have ran the heats and winners of those to get the fastest?

Vanadium 50 · Jun 19, 2023

I mean you run this with 25 million horses. A1 is always fastest, by construction. If you compare the average speed of the million horses that end up in E5 with the million in A5 which is faster?

If one is faster, how is this possible?

PeroK · Jun 19, 2023

Vanadium 50 said:

I mean you run this with 25 million horses. A1 is always fastest, by construction. If you compare the average speed of the million horses that end up in E5 with the million in A5 which is faster?

If one is faster, how is this possible?

If I understand the question, you can model this by considering the numbers 1-25 put at random into five groups of five. The numbers in each group are then sorted in ascending sequence. We then choose group A as the group with the number 1 in it. A1 is the number 1 and A5 is the highest number in that group. Group B is the group with the next lowest first number. E.g., if the number 2 is not in group A, then group B will be the group with the number 2 in it. Group E is the group with the highest first number.

It's clear than the distribution for E5 has a higher mean than the distribution for A5.

This could be simulated using a Python (or C++) script, or possibly calculated - if it wasn't so late in the day.

PS the lower the number, the faster the horse it represents.

Vanadium 50 · Jun 19, 2023

PeroK said:

It's clear than the distribution for E5 has a higher mean than the distribution for A5.

i.e. a slower horse.

I agree it is true, but why is it clear? How can the relative speeds of two other horses make a difference here?

The direction I am going here is that, while 7 races is the answer, it looks like we have no information on the 4th fastest horse,m or the 11th fastest horse, or the slowest horse, when it fact we do.

PeroK · Jun 19, 2023

Vanadium 50 said:

i.e. a slower horse.

I agree it is true, but why is it clear?

Note that the 16th fastest horse, for example, could equally be in any group; but, it's more likely to be the slowest horse in group A than the slowest horse in group E. It might even be the fastest horse in group E.

By definition A1 is lower than B1 is lower than C1 etc. Statistically, this ordering must persist for the set A2-E2 etc. It's not possible for E2 to have the same distribution as A2. E2 must be at least the number 6; whereas, A2 could be in the range 2-5. And, so on to the set A5-E5. E5 must be higher on average than A5.

PS as a very rough estimate, I'd say that E1 has an average value of 10-12. I'll do a simulation tomorrow if I get the chance.

Frabjous · Jun 19, 2023

I’ll be the idiot.
After the first five races, there is equal probability (1/5) that that slowest horse is A5,B5,C5, D5 or E5.
The sixth and seventh race only involve horses with zero probability of being the slowest. Given that we have learned nothing new about the slowest, the probabilities do not change.
What am I doing wrong?

Ibix · Jun 19, 2023

Frabjous said:

After the first five races, there is equal probability (1/5) that that slowest horse is A5,B5,C5, D5 or E5.

You can't label them A-E at this point, because that labelling is only available after the sixth race. At that point, though A5 can be faster than all of B1-E1, but E5 must be slower than all of them. So the expectation speed of A5 seems likely to be higher than that of E5.

Frabjous · Jun 19, 2023

Ibix said:

You can't label them A-E at this point, because that labelling is only available after the sixth race. At that point, though A5 can be faster than all of B1-E1, but E5 must be slower than all of them. So the expectation speed of A5 seems likely to be higher than that of E5.

Got it. Thanks.

Vanadium 50 · Jun 19, 2023

@PeroK I don't follow your reasoning, but we might be saying the same thing.

I know there are four horses faster than A5: A1, A2, A3 and A4.
I know that there are eight horses faster than E5: A1, B1, C1, D1, E1, E2, E3 and E4.
So on average, A5 is faster than E5.

Ready for the extra extra credit?

Running a similation on a computer may be harder than it appears. There are a lot of ways to place 25 horses in 25 slots (liike 25!) so you need to worry about uniformity of sampling.

Instead I have been thinking about average ranking. If you consider a two lane track and 4 horses, B1 is slightly faster (n average) than A2. The reason is that being the slowest horse is a possibility for A2 but not for B1. Another way to look at it is that, going back to the original problem, B1 is known to be faster than 4 other horses, but A2 is only known to be faster than 3.

Frabjous · Jun 19, 2023

Edited: Wrong answer

Vanadium 50 · Jun 19, 2023

I think you have to be verty careful about "equal probability". Certain cases have been eliminatedl

For example, E5 cannot be 9 if E4 is 10.

Similarly, B1 is likely to be faster than A2. To be B1 you need to beat 19 horses. To be A2 you only need to beat 3.

Frabjous · Jun 19, 2023

Vanadium 50 said:

I think you have to be verty careful about "equal probability". Certain cases have been sliminated.

For example, E5 cannot be 9 if E4 is 10.

Similarly, B1 is likely to be faster than A2. To be B1 you need to beat 19 horses. To be A2 you only need to beat 3.

Agreed. My answer is wrong.

PeroK · Jun 20, 2023

Frabjous said:

I’ll be the idiot.
After the first five races, there is equal probability (1/5) that that slowest horse is A5,B5,C5, D5 or E5.
The sixth and seventh race only involve horses with zero probability of being the slowest. Given that we have learned nothing new about the slowest, the probabilities do not change.
What am I doing wrong?

One way to look at this is to place the fastest horse in a group at random, then place the slowest horse in a group at random. There is, therefore, a probability of ##\frac 4 {24} = \frac 1 6## that the fastest and slowest horses are in the same group. This is less than the naive ##\frac 1 5##. That means that the probability that A5 is the slowest horse is ##\frac 1 6##.

Note that if you try the same trick by placing the second fastest horse first, then there is also a probability of ##\frac 1 6## that the second fastest and slowest horses are in the same group. But, there is also a probability that the fastest horse in is that group, so that the slowest horse is not necessarily B5 in this case. Also, that's not the only way that B5 will be the slowest horse. If the two fastest horses are in the same group, then B1 might be the third fastest horse etc.

PeroK · Jun 20, 2023

Vanadium 50 said:

@PeroK I don't follow your reasoning, but we might be saying the same thing.

I know there are four horses faster than A5: A1, A2, A3 and A4.
I know that there are eight horses faster than E5: A1, B1, C1, D1, E1, E2, E3 and E4.
So on average, A5 is faster than E5.

Ready for the extra extra credit?

Running a similation on a computer may be harder than it appears. There are a lot of ways to place 25 horses in 25 slots (liike 25!) so you need to worry about uniformity of sampling.

The idea is that the original draw into five groups is uniformly random. We can then calculate various probabilities, such as the probability that the fastest and slowest horses are in the same race etc.

The initial set of five races effectively sorts the horses in each group. That gives us no new information about the distribution. Likewise the WOW race simply sorts the groups. We can't yet tell anything about the initial random distribution of horses from this. Each initial random distribution is still equally likely. All this has done is label the horses A1-E5. I was planning to calculate the average rank for each horse from this. That's a straightforward exercise. I could probably do a sample size of 1-10 million, but not all ##25!##.

The result of the playoff race between A2, A3, B1, B2, C1 will eliminate some of the initial possible distributions. But, there are many possibilities for the result of this race. If we pick a single possible outcome, such as B1-A2-C1-A3-B2, then we can recalculate the various probabilities, based on this knowledge of the distribution. I wasn't intending to do this.

PeroK · Jun 20, 2023

PS to show the idea, let's assume only four horses and two races with two horses in each. There are only three equally likely possibilities for the initial distribution. Let's label the horses from fastest to slowest F1-F4:

Group A: F1, F2
Group B: F3, F4

Group A: F1, F3
Group B: F2, F4

Group A: F1, F4
Group B: F2, F3

We can see immediately why the slowest horse (F4) is more likely to be in group B than group A. And, we can calculate the average rank of horses A1-B2:

A1 = 1, A2 = 3, B1 = 7/3, B2 = 11/3

This shows that, indeed, B2 is on average slower than A2. And, also, on average B1 is faster than A2.

It also raises the question of whether I should simulate a head-to-head race between A5 and E5; or, as I was intending, calculate the average rank of each horse in the range 1-25. I might even be able to do the former analytically.

PeroK · Jun 20, 2023

PPS I did a quick calculation for nine horses in three races. Horse A3 is faster than C3 with probability ##\frac {181}{280}##. In the case of four horses above, the probability that A2 is faster than B2 is ##\frac 2 3##.

It doesn't look like there is a nice formula to extend to twenty-five horses, so I'll have to resort to Python.

PeroK · Jun 20, 2023

Here's the result of a simulation of 10 million random draws of 25 horses.

A5 is faster than E5 0.61
Average rank for group A [1.0, 6.0, 11.0, 16.0, 21.0]
Average rank for group B [2.19, 6.95, 11.72, 16.48, 21.24]
Average rank for group C [3.68, 8.14, 12.61, 17.07, 21.53]
Average rank for group D [5.71, 9.76, 13.82, 17.88, 21.94]
Average rank for group E [9.09, 12.47, 15.85, 19.24, 22.62]

As predicted, A5 is faster than E5 most of the time (about 61%). That said, there's not a lot to choose between the horses that finish last in each race!

Before running the play-off race, the most likely top three horses are the first three in the WOW race: A1, B1 and C1. It's quite easy to work out that B1 is usually the second fastest horse (with probability ##\frac 5 6##).

Very much as predicted, although it's good to see the numbers come out!

PeroK · Jun 20, 2023

Here's the Python code:

Python:

import random
#
# Choose the number of times to run the simulation
num_sim = 10_000_000
#
# Count the number of times A5 beats E5
a5_win = 0
#
# g_list keeps a tally of relative rank of horses A1-A5, B1-B5 etc.
g_list=[[0,0,0,0,0], [0,0,0,0,0], [0,0,0,0,0], [0,0,0,0,0],[0,0,0,0,0]]
#
for k in range(0, num_sim):
    horses = list(range(1,26))
    random.shuffle(horses)
#
#    Split the random selection into the five groups (first races)
#
    g_0 = [horses[0],horses[1],horses[2],horses[3],horses[4]]
    g_1 = [horses[5],horses[6],horses[7],horses[8],horses[9]]
    g_2 = [horses[10],horses[11],horses[12],horses[13],horses[14]]
    g_3 = [horses[15],horses[16],horses[17],horses[18],horses[19]]
    g_4 = [horses[20],horses[21],horses[22],horses[23],horses[24]]
#
# sorting each group equates to running the first set of races
#
    g_0.sort()
    g_1.sort()
    g_2.sort()
    g_3.sort()
    g_4.sort()
#
# sorting the list of lists equates to running the WOW race (by default Python sorts by the first nunber in each list)
    g_temp = [g_0,g_1, g_2, g_3, g_4]
#
    g_temp.sort()
#
#    Check whether A5 is faster than E5)
#
    if g_temp[0][4] < g_temp[4][4]:
        a5_win += 1
#
# Add the rankings for this race to the overall tally:
    for i in range(0, 5):
        for j in range(0, 5):
            g_list[i][j] += g_temp[i][j]
#
# Calculate the average rank for each horse:
for i in range(0, 5):
        for j in range(0, 5):
            g_list[i][j] = round(g_list[i][j]/num_sim,2)
#
print("A5 is faster than E5 {:.2f}".format(a5_win/num_sim))
print("Average rank for group A", g_list[0])
print("Average rank for group B", g_list[1])
print("Average rank for group C", g_list[2])
print("Average rank for group D", g_list[3])
print("Average rank for group E", g_list[4])

An Algorithm Challenge (finding the five fastest horses out of a large group)

Similar threads

How to increase phone signal strength by lying about it

A Crisis for Newly Minted CompSci Majors -- entry level jobs gone

How to calculate Tension for a series of connected points?

Learning Assembly and computer architecture for x86

Sequential Analog Computers?

Insights Thinking Outside The Box Versus Knowing What’s In The Box

Insights Why Entangled Photon-Polarization Qubits Violate Bell’s Inequality

Insights Quantum Entanglement is a Kinematic Fact, not a Dynamical Effect

Insights What Exactly is Dirac’s Delta Function? - Insight

Insights Relativator (Circular Slide-Rule): Simulated with Desmos - Insight

Insights Fixing Things Which Can Go Wrong With Complex Numbers