An Algorithm Challenge (finding the five fastest horses out of a large group)

In summary, the strategy adopted by the OP is "wait for a flash of insight". A fine strategy, one I have used myself on many occasions.Yes, I understand that the strategy is a good one, but I think it could be improved.
  • #36
PeroK said:
That's the idea. You can reduce the number by (approx) a factor of 3 each time.

It's then a question of what criteria you use to assess your strategy. The maximum number of measurements can be calculated using powers of three (or the inverse process: log to the base ##3##).

If, instead, we look at the expected (average) number of measurements, then it gets suprisingly tricky. The example I gave above is that for ##18## items, you can do better than measuring ##6## against ##6##. That's because you can find one light item from ##7## statistically faster than one light item from ##6##. To be honest, I didn't spot that until I wrote a Python script to check all the possibilities!
A quick up date, I am looking at the horse one but I had an idea with this.

from 96 to 3 x32 then 2x10 and 12 followed by 3x4 and I am left with 4.

I then just weight two, I have a one in 4 chance of getting it 4 weighs and definitely can do it in 5.
 
  • Like
Likes PeroK
Technology news on Phys.org
  • #37
pinball1970 said:
Problem I see is if I brute force it that way, I will end up with a lot of races where horses have already raced each other.
Yup. Is there a horse in the WOW race that could be second fastest? If so, which horses do you already know are slower than it?
 
  • #38
pinball1970 said:
I would not know that till I put that horse in with WOW runners up.
@Ibix solved it in post #2. The observation is that the WOW race ranking eliminates additional horses from the the initial races of the horses that did not win the WOW race. You are left with 5 horses for the final 2 slots in the 7th race.
 
  • #39
Vanadium 50 said:
Let's label it a little differently. In the Race of Winners, label the fastest horse A1, the next fastest B1, and so on. Now look at the prelininaries: label the second fastest horse in the race with A in it A2, the third fastest A3, and so on. Label the second fastest horse in the race with B in it B2, the third fastest B3, and so on.

There is nothing magical about this labeling; it's just convenient.

Can A1 be the 2nd fastest? No...it's the fastest. Can E5?

Ibix said:
Yup. Is there a horse in the WOW race that could be second fastest? If so, which horses do you already know are slower than it?

and @Vanadium 50

Only A2 and B1 can be number two. A1 cannot as it is fastest

Assume A2 wins that.

Now only B1, C1 or A3 can be 3rd.
 
  • #40
OK, you;re almost there. You have five lanes, and how many horses after identifying A1 as the fastest?
 
  • Like
Likes PeroK
  • #41
Vanadium 50 said:
OK, you;re almost there. You have five lanes, and how many horses after identifying A1 as the fastest?
Damn it!

A2/A3 can 2 or 3, B1 also, C1, B2 or A3 can be 3rd.

One race of A2, A3, B1, B2 and C1 should give final positions.

I think!
 
  • Like
Likes Ibix, Vanadium 50 and PeroK
  • #42
For extra credit, repeat this many times. Is E5 on average slower than A5? If so, how is it possible that the speed of two other horses in the race can possibly influence the speed of these two horses?

For extra extra credit, is B4 on average faster than A5?
 
Last edited:
  • #43
Vanadium 50 said:
For extra credit, repeat this many times. Is E5 on average slower than A5? If so, how is it possible that the speed of two other horses in the race can possibly influence the speed of these two horses?

For extra extra credit, is B4 on average faster than A5?
Do you mean once I have ran the heats and winners of those to get the fastest?
 
  • #44
I mean you run this with 25 million horses. A1 is always fastest, by construction. If you compare the average speed of the million horses that end up in E5 with the million in A5 which is faster?

If one is faster, how is this possible?
 
  • #45
Vanadium 50 said:
I mean you run this with 25 million horses. A1 is always fastest, by construction. If you compare the average speed of the million horses that end up in E5 with the million in A5 which is faster?

If one is faster, how is this possible?
If I understand the question, you can model this by considering the numbers 1-25 put at random into five groups of five. The numbers in each group are then sorted in ascending sequence. We then choose group A as the group with the number 1 in it. A1 is the number 1 and A5 is the highest number in that group. Group B is the group with the next lowest first number. E.g., if the number 2 is not in group A, then group B will be the group with the number 2 in it. Group E is the group with the highest first number.

It's clear than the distribution for E5 has a higher mean than the distribution for A5.

This could be simulated using a Python (or C++) script, or possibly calculated - if it wasn't so late in the day.

PS the lower the number, the faster the horse it represents.
 
  • #46
PeroK said:
It's clear than the distribution for E5 has a higher mean than the distribution for A5.
i.e. a slower horse.

I agree it is true, but why is it clear? How can the relative speeds of two other horses make a difference here?

The direction I am going here is that, while 7 races is the answer, it looks like we have no information on the 4th fastest horse,m or the 11th fastest horse, or the slowest horse, when it fact we do.
 
  • #47
Vanadium 50 said:
i.e. a slower horse.

I agree it is true, but why is it clear?
Note that the 16th fastest horse, for example, could equally be in any group; but, it's more likely to be the slowest horse in group A than the slowest horse in group E. It might even be the fastest horse in group E.

By definition A1 is lower than B1 is lower than C1 etc. Statistically, this ordering must persist for the set A2-E2 etc. It's not possible for E2 to have the same distribution as A2. E2 must be at least the number 6; whereas, A2 could be in the range 2-5. And, so on to the set A5-E5. E5 must be higher on average than A5.

PS as a very rough estimate, I'd say that E1 has an average value of 10-12. I'll do a simulation tomorrow if I get the chance.
 
  • #48
I’ll be the idiot.
After the first five races, there is equal probability (1/5) that that slowest horse is A5,B5,C5, D5 or E5.
The sixth and seventh race only involve horses with zero probability of being the slowest. Given that we have learned nothing new about the slowest, the probabilities do not change.
What am I doing wrong?
 
  • #49
Frabjous said:
After the first five races, there is equal probability (1/5) that that slowest horse is A5,B5,C5, D5 or E5.
You can't label them A-E at this point, because that labelling is only available after the sixth race. At that point, though A5 can be faster than all of B1-E1, but E5 must be slower than all of them. So the expectation speed of A5 seems likely to be higher than that of E5.
 
  • Like
Likes Frabjous
  • #50
Ibix said:
You can't label them A-E at this point, because that labelling is only available after the sixth race. At that point, though A5 can be faster than all of B1-E1, but E5 must be slower than all of them. So the expectation speed of A5 seems likely to be higher than that of E5.
Got it. Thanks.
 
  • #51
@PeroK I don't follow your reasoning, but we might be saying the same thing.

I know there are four horses faster than A5: A1, A2, A3 and A4.
I know that there are eight horses faster than E5: A1, B1, C1, D1, E1, E2, E3 and E4.
So on average, A5 is faster than E5.

Ready for the extra extra credit?

Running a similation on a computer may be harder than it appears. There are a lot of ways to place 25 horses in 25 slots (liike 25!) so you need to worry about uniformity of sampling.

Instead I have been thinking about average ranking. If you consider a two lane track and 4 horses, B1 is slightly faster (n average) than A2. The reason is that being the slowest horse is a possibility for A2 but not for B1. Another way to look at it is that, going back to the original problem, B1 is known to be faster than 4 other horses, but A2 is only known to be faster than 3.
 
  • #52
Edited: Wrong answer
 
Last edited:
  • #53
I think you have to be verty careful about "equal probability". Certain cases have been eliminatedl

For example, E5 cannot be 9 if E4 is 10.

Similarly, B1 is likely to be faster than A2. To be B1 you need to beat 19 horses. To be A2 you only need to beat 3.
 
Last edited:
  • #54
Vanadium 50 said:
I think you have to be verty careful about "equal probability". Certain cases have been sliminated.

For example, E5 cannot be 9 if E4 is 10.

Similarly, B1 is likely to be faster than A2. To be B1 you need to beat 19 horses. To be A2 you only need to beat 3.
Agreed. My answer is wrong.
 
  • #55
Frabjous said:
I’ll be the idiot.
After the first five races, there is equal probability (1/5) that that slowest horse is A5,B5,C5, D5 or E5.
The sixth and seventh race only involve horses with zero probability of being the slowest. Given that we have learned nothing new about the slowest, the probabilities do not change.
What am I doing wrong?
One way to look at this is to place the fastest horse in a group at random, then place the slowest horse in a group at random. There is, therefore, a probability of ##\frac 4 {24} = \frac 1 6## that the fastest and slowest horses are in the same group. This is less than the naive ##\frac 1 5##. That means that the probability that A5 is the slowest horse is ##\frac 1 6##.

Note that if you try the same trick by placing the second fastest horse first, then there is also a probability of ##\frac 1 6## that the second fastest and slowest horses are in the same group. But, there is also a probability that the fastest horse in is that group, so that the slowest horse is not necessarily B5 in this case. Also, that's not the only way that B5 will be the slowest horse. If the two fastest horses are in the same group, then B1 might be the third fastest horse etc.
 
Last edited:
  • #56
Vanadium 50 said:
@PeroK I don't follow your reasoning, but we might be saying the same thing.

I know there are four horses faster than A5: A1, A2, A3 and A4.
I know that there are eight horses faster than E5: A1, B1, C1, D1, E1, E2, E3 and E4.
So on average, A5 is faster than E5.

Ready for the extra extra credit?

Running a similation on a computer may be harder than it appears. There are a lot of ways to place 25 horses in 25 slots (liike 25!) so you need to worry about uniformity of sampling.
The idea is that the original draw into five groups is uniformly random. We can then calculate various probabilities, such as the probability that the fastest and slowest horses are in the same race etc.

The initial set of five races effectively sorts the horses in each group. That gives us no new information about the distribution. Likewise the WOW race simply sorts the groups. We can't yet tell anything about the initial random distribution of horses from this. Each initial random distribution is still equally likely. All this has done is label the horses A1-E5. I was planning to calculate the average rank for each horse from this. That's a straightforward exercise. I could probably do a sample size of 1-10 million, but not all ##25!##.

The result of the playoff race between A2, A3, B1, B2, C1 will eliminate some of the initial possible distributions. But, there are many possibilities for the result of this race. If we pick a single possible outcome, such as B1-A2-C1-A3-B2, then we can recalculate the various probabilities, based on this knowledge of the distribution. I wasn't intending to do this.
 
Last edited:
  • #57
PS to show the idea, let's assume only four horses and two races with two horses in each. There are only three equally likely possibilities for the initial distribution. Let's label the horses from fastest to slowest F1-F4:

Group A: F1, F2
Group B: F3, F4

Group A: F1, F3
Group B: F2, F4

Group A: F1, F4
Group B: F2, F3

We can see immediately why the slowest horse (F4) is more likely to be in group B than group A. And, we can calculate the average rank of horses A1-B2:

A1 = 1, A2 = 3, B1 = 7/3, B2 = 11/3

This shows that, indeed, B2 is on average slower than A2. And, also, on average B1 is faster than A2.

It also raises the question of whether I should simulate a head-to-head race between A5 and E5; or, as I was intending, calculate the average rank of each horse in the range 1-25. I might even be able to do the former analytically.
 
  • #58
PPS I did a quick calculation for nine horses in three races. Horse A3 is faster than C3 with probability ##\frac {181}{280}##. In the case of four horses above, the probability that A2 is faster than B2 is ##\frac 2 3##.

It doesn't look like there is a nice formula to extend to twenty-five horses, so I'll have to resort to Python.
 
  • #59
Here's the result of a simulation of 10 million random draws of 25 horses.

A5 is faster than E5 0.61
Average rank for group A [1.0, 6.0, 11.0, 16.0, 21.0]
Average rank for group B [2.19, 6.95, 11.72, 16.48, 21.24]
Average rank for group C [3.68, 8.14, 12.61, 17.07, 21.53]
Average rank for group D [5.71, 9.76, 13.82, 17.88, 21.94]
Average rank for group E [9.09, 12.47, 15.85, 19.24, 22.62]

As predicted, A5 is faster than E5 most of the time (about 61%). That said, there's not a lot to choose between the horses that finish last in each race!

Before running the play-off race, the most likely top three horses are the first three in the WOW race: A1, B1 and C1. It's quite easy to work out that B1 is usually the second fastest horse (with probability ##\frac 5 6##).

Very much as predicted, although it's good to see the numbers come out!
 
  • Wow
Likes pinball1970
  • #60
Here's the Python code:
Python:
import random
#
# Choose the number of times to run the simulation
num_sim = 10_000_000
#
# Count the number of times A5 beats E5
a5_win = 0
#
# g_list keeps a tally of relative rank of horses A1-A5, B1-B5 etc.
g_list=[[0,0,0,0,0], [0,0,0,0,0], [0,0,0,0,0], [0,0,0,0,0],[0,0,0,0,0]]
#
for k in range(0, num_sim):
    horses = list(range(1,26))
    random.shuffle(horses)
#
#    Split the random selection into the five groups (first races)
#
    g_0 = [horses[0],horses[1],horses[2],horses[3],horses[4]]
    g_1 = [horses[5],horses[6],horses[7],horses[8],horses[9]]
    g_2 = [horses[10],horses[11],horses[12],horses[13],horses[14]]
    g_3 = [horses[15],horses[16],horses[17],horses[18],horses[19]]
    g_4 = [horses[20],horses[21],horses[22],horses[23],horses[24]]
#
# sorting each group equates to running the first set of races
#
    g_0.sort()
    g_1.sort()
    g_2.sort()
    g_3.sort()
    g_4.sort()
#
# sorting the list of lists equates to running the WOW race (by default Python sorts by the first nunber in each list)
    g_temp = [g_0,g_1, g_2, g_3, g_4]
#
    g_temp.sort()
#
#    Check whether A5 is faster than E5)
#
    if g_temp[0][4] < g_temp[4][4]:
        a5_win += 1
#
# Add the rankings for this race to the overall tally:
    for i in range(0, 5):
        for j in range(0, 5):
            g_list[i][j] += g_temp[i][j]
#
# Calculate the average rank for each horse:
for i in range(0, 5):
        for j in range(0, 5):
            g_list[i][j] = round(g_list[i][j]/num_sim,2)
#
print("A5 is faster than E5 {:.2f}".format(a5_win/num_sim))
print("Average rank for group A", g_list[0])
print("Average rank for group B", g_list[1])
print("Average rank for group C", g_list[2])
print("Average rank for group D", g_list[3])
print("Average rank for group E", g_list[4])
 
  • Like
  • Wow
Likes Vanadium 50 and pinball1970
  • #61
Here's the result of another simulation to calculate the probability of which is the slowest horse (after the WOW race):

A5 is the slowest horse: 0.1675
B5 is the slowest horse: 0.1753
C5 is the slowest horse: 0.1879
D5 is the slowest horse: 0.2086
E5 is the slowest horse: 0.2608

Note that he probability that A5 is the slowest horse is ##\frac 1 6## as previously calculated precisely.
 
  • #62
On the extra extra credit, it shows B4 is faster than A5, which I expected. Both lose to 4 other horses, but we know B4 is faster than one other horse - we don't know that for A5.

I was surprised that C3 was not 13.00.
 
  • #63
Been playing with a much more subtle problem: identify the median-speed horse.

One reason this is subtle is the condition "fewest races" - is this on average? Or the smallest maximum guranteed to find the answer? Or something else?
 
  • #64
Vanadium 50 said:
Been playing with a much more subtle problem: identify the median-speed horse.

One reason this is subtle is the condition "fewest races" - is this on average? Or the smallest maximum guranteed to find the answer? Or something else?
Dunno. I'll have to check with the sheik.
 
  • #65
PeroK said:
If I understand the question, you can model this by considering the numbers 1-25 put at random into five groups of five. The numbers in each group are then sorted in ascending sequence. We then choose group A as the group with the number 1 in it. A1 is the number 1 and A5 is the highest number in that group. Group B is the group with the next lowest first number. E.g., if the number 2 is not in group A, then group B will be the group with the number 2 in it. Group E is the group with the highest first number.

It's clear than the distribution for E5 has a higher mean than the distribution for A5.

This could be simulated using a Python (or C++) script, or possibly calculated - if it wasn't so late in the day.

PS the lower the number, the faster the horse it represents.
Though I no longer play the game of bridge, I frequent a group of bridge players. They are confronted with such situations all the time. The also have the constraint of solving the problem without any aids and doing so in a timely manner. The problem is quite similar as the number of cards in a hand is fixed and a higher rank in one hand means lower ranks in the other hands.

The way they would approach this is called "empty spaces." In the A group we know that one of the horses isn't the slowest. This leaves 4 empty spaces in group A. The chance that A contains the slowest is 4/24 = 1/6. In the E group we know that none of the horses is in the top four. The chance that E contains the slowest is about 5/(25-4). That's 24%, as opposed to the simulation value of 26%. Close enough for bridge.
 
  • #66
Hornbein said:
Dunno. I'll have to check with the sheik.
Sheik al-Abdullah said, "What may I ask is this "average" of which you are speaking?" When I tried to explain he replied, "I hear your words only as they rattle about in the deep well of time. The horses win or not win by the will of Allah, peace be upon him. So it is written." I elected not to pursue the matter further.
 
Last edited:

Similar threads

Replies
9
Views
1K
  • Math Proof Training and Practice
3
Replies
83
Views
17K
  • Math Proof Training and Practice
2
Replies
67
Views
10K
  • Programming and Computer Science
Replies
29
Views
3K
Replies
5
Views
4K
  • Sci-Fi Writing and World Building
Replies
9
Views
2K
Replies
3
Views
2K
  • Sci-Fi Writing and World Building
Replies
3
Views
2K
  • Mechanical Engineering
Replies
1
Views
3K
  • STEM Academic Advising
Replies
6
Views
3K
Back
Top