An Algorithm Challenge (finding the five fastest horses out of a large group)

AI Thread Summary
To determine the three fastest horses out of twenty-five with the least number of races, a strategy involving seven races can achieve this goal. The discussion highlights that fewer than six races would not allow for proper comparisons, while running every horse against each other would require up to 150 races. A proposed method involves racing the top horses and gradually introducing new ones to narrow down the fastest. Additionally, a related problem about identifying a lighter item among identical weights is discussed, emphasizing the need for an optimal weighing strategy that can reduce the number of measurements needed. The conversation reveals complexities in both problems, showcasing the importance of strategic thinking in problem-solving.
Hornbein
Gold Member
Messages
3,394
Reaction score
2,759
As a Bedouin schiek you own twenty-five horses. You want to find the three fastest. You have no clock or other device that measures time. Your racing field is wide enough that five horses can race unimpeded, so you can race five at a time and see how they place. You don't want to abuse your mounts or jockeys so...how to schedule the minimum number of races that will give you what you want to know?

I couldn't do it but a day later the answer popped into my head.
 
Technology news on Phys.org
I can do it in seven races.
Round 1: Split your horses into five fives and run five races.

Round 2: Race the winners of each race from round 1. Label the horses A-E in order of best to worst times.

A, B and C are possibles but D and E are out and, since D and E were the fastest in their round one races, so are all the other horses in their races.

C is currently our slowest candidate, and is known to be faster than all other horses in its round 1 race, so they're out too.

B is our second-slowest candidate, so #2 in its round 1 race could be faster than C. All others are out because, although they could be faster than C (not ##c##), they are not faster than A, B, or the #2 in B's race and hence cannot be in the top three.

A is the fastest of the fastest, so is definitely in. #2 and #3 in its race could be faster than B or C. All others are out, because even if they're faster than B or C we know they're slower than A, A's #2 and A's #3.

So we have five horses remaining in consideration: B, C, A's #2 and #3, and B's #2. These are all slower than A and faster than the eliminated horses, but we don't know the complete order among them.

Round 3: Race those five. Keep the top two, along with A.

Five races in round 1 plus one each in rounds 2 and 3 is a total of seven.
...assuming, of course, that we have spherical horses travelling in a vacuum so they always run at the same speed - otherwise there isn't a well defined ordering.
 
Last edited:
  • Like
  • Haha
Likes .Scott, DrClaude, PeroK and 1 other person
Here's another question. You have a number of items that are all same weight, except one which is lighter than the others. You have a set of scales that can hold any number of items in each side. See image.

What is the optimum strategy for identifying the odd item out?

E.g. suppose there are 96 items. How many times do you need to use the scales?

1687001250546.png
 
  • Like
Likes pinball1970
Log3(n)
 
Ibix said:
...assuming, of course, that we have spherical horses travelling in a vacuum so they always run at the same speed - otherwise there isn't a well defined ordering.
Are those the fabled Aroundian breed?
 
Ibix said:
I can do it in seven races.
As can I.

The strategy adopted by the OP is "wait for a flash of insight". A fine strategy, one I have used myself on many occasions. But can we give a little more shape to the problem?

The answer must be at least 6. Fewer than 5 and you havem't run all the horses. Five races gives you no comparison between horses in different races, so it must be at least 6.

On the other extreme, I could run every horse against every other jorse (two pairs at a time to use 4 of the 5 lanes) and you need 25 (horses) x 24 (horses - 1) / 2 (order doesn't matter) / 2 (2 sets of 4 lanes) oe 150. So now the answer is bounded between 6 and 150.

Now see where the "obvious" algorithm falls. Run a race, keeping the top 3. Add two horses from the 22 that have not bveen run, and run another race of 5. Keep the top 3 and kep going. You have your answer after 12 races. So the answer needs to be between 6 and 12.

By this point, you know the answer is one of 7 numbers, and hopefully have gained enough insight to see the problem: on average the fastest horse is raced 6 times. You are also gaining information on the relative rankings of the slow horses.

If only there were a way to quickly identify the very fastest and very slow horses and remove them from further consideration....
 
Last edited:
PeroK said:
You have a number of items that are all same weight, except one which is lighter than the others.
How can the items all be the same weight, but one of them is lighter. Sounds like they are exactly the same, except different.
 
  • Haha
  • Sad
Likes PeroK, Vanadium 50 and phinds
Mark44 said:
How can the items all be the same weight, but one of them is lighter. Sounds like they are exactly the same, except different.
Hey now "all except one" is a valid, well-defined value. 🤔 :wink:
n-1
 
  • Like
Likes hutchphd and PeroK
Baluncore said:
Log3(n)
If ##n =4##, then we can definitely find the item in ##2## measurements and possibly in ##1## measurement. If we have a strategy of putting one item on each side of the balance, then we get ##1## or ##2## with equal probability. We have possible answers for ##n = 4## of ##2## (maximum) or ##1.5## (average).

Whereas, ##\log_3(4) \approx 1.26##. That's not the answer. Although ##\log_3(n)## rounded up to the nearest integer should give the maximum number of measurements.
 
  • #10
PeroK said:
That's not the answer.
Correct. It was a hint, not a full answer.

Log3(96) ≈ 4.154 ; So a maximum of 5 comparisons.
That is because the partial 0.154 comparison, will still require a comparison of something.
 
  • #11
Baluncore said:
Log3(n)
Huh. That hint made it too easy.
 
  • #12
PeroK said:
You have a number of items that are all same weight, except one which is lighter than the others.

Mark44 said:
How can the items all be the same weight, but one of them is lighter.

DaveC426913 said:
Hey now "all except one" is a valid, well-defined value.
Right, but the statement in the first quote says that there "a number of items that are all the same weight..." followed by "except one"

A clearer and precise way of writing this is "in a collection of items, all but one are the same weight, and the remaining one is lighter." See, that wasn't so hard.
 
  • #13
@PeroK, I see that you have responded twice to my posts with a Sad emoji. Do you not understand that what you wrote is at best unclear? Or that there is a difference between what you wrote and saying "all but one are the same weight, and the remaining one is lighter"?
 
  • #14
Mark44 said:
@PeroK, I see that you have responded twice to my posts with a Sad emoji. Do you not understand that what you wrote is at best unclear? Or that there is a difference between what you wrote and saying "all but one are the same weight, and the remaining one is lighter"?
You're wrong. My usage is correct, standard English.

https://www.macmillandictionary.com/dictionary/british/except_1

One example given by MacMillan is:

She was dressed all in black except for a white lace collar.
 
  • Like
  • Love
Likes hutchphd and Hornbein
  • #15
PeroK said:
My usage is standard English.
Maybe so, but it sounds clumsy to my ear.

You have a number of items that are all the same weight, except one whichthat is lighter than the others.

I like this better: All but one of the items are the same weight, and the remaining one is lighter.
 
  • #16
Back to the problem. I found one anomaly to the rule that the sequence increases with increasing ##n##.

For n = 6, the best average is 2. If you measure 2 on each side, then you always take two measurements. And, if you measure 1 on each side first, you still get an average of two measurements. Also if you measure 3 on each side.

For n = 7, however, if you measure 3 on each side you get the answer in two measurements six times out of seven and in one measurement one time in seven, giving an average of 13/7 measurements.

This means that the best approach for n = 18 is not to measure 6 against 6 first. Seven against seven is better. And, in fact five against five is equally good.

And, in fact, 13 and 21 are anomalies as well. You can do 13 faster than 12 and 22 faster than 21 (on average).

The general optimum strategy is not simple, apparently.
 
  • #17
PeroK said:
Here's another question. You have a number of items that are all same weight, except one which is lighter than the others. You have a set of scales that can hold any number of items in each side. See image.

What is the optimum strategy for identifying the odd item out?

E.g. suppose there are 96 items. How many times do you need to use the scales?

View attachment 327979
I would weigh half against half then split the lighter one each time. When I get to the last 3 I pinch one from the previous non light. Then it's 2 v 2 then 1v1. 6?
 
  • #18
pinball1970 said:
I would weigh half against half then split the lighter one each time. When I get to the last 3 I pinch one from the previous non light. Then it's 2 v 2 then 1v1. 6?
You can do better that that.
 
  • #19
pinball1970 said:
Last 3 just one v one. They are even it's the other, or one is lighter.
48, 24,12,6,3 then last makes 5.
A balance can read three things: left heavier, right heavier, or both sides the same. Your methodology only ever produces two of the possible outputs, so you are wasting opportunities to gain more information at every step.
 
  • #20
Ok. Have to dash but I will come back to this
 
  • #21
pinball1970 said:
Ok. Have to dash but I will come back to this
The binary solution is the obvious choice, but is not optimum.
It is well worth the return.
 
  • Like
Likes pinball1970
  • #22
Baluncore said:
The binary solution is the obvious choice, but is not optimum.
It is well worth the return.
Ok thanks for waiting and no spoilers. Just finished a four rehearsal and I was thinking of this all the way through!
 
  • #23
Mark44 said:
Maybe so, but it sounds clumsy to my ear.
Ah, but "clumsy" is a far cry from wrong (not that you actually said it).
Wrong would be worth a point-of-order; clumsy, not so much.
OK, moving on... :smile:
 
  • #24
DaveC426913 said:
Ah, but "clumsy" is a far cry from wrong (not that you actually said it).
Wrong would be worth a point-of-order; clumsy, not so much.
It's not even clumsy. For example:

All prime numbers are odd except ##2##.

The modulus function is differentiable everywhere except ##x = 0##.

That's just standard English usage, IMO.
 
  • #25
PeroK said:
All prime numbers are odd except 2.
Which must make 2 the oddest prime of all.

That's just standard English usage.
 
  • #26
DaveC426913 said:
Ah, but "clumsy" is a far cry from wrong (not that you actually said it).
The relevant part here is that I didn't actually say it.
PeroK said:
All prime numbers are odd except 2.

The modulus function is differentiable everywhere except x=0.
These are reasonable examples of usage in that the "except" part isn't a separate clause that negates the universal clause. However, the first could be improved like so: "All prime numbers except 2 are odd.

Consider these, though:

All mammals are viviparous, except some aren't. (E.g., the platypus lays eggs.)
All integers are even, except some aren't.
A Ford Model T buyer can have the car in any color, except it must be black. (Paraphrase of a comment by Henry Ford.)

The conjunction "except" negates the preceding clause, so non-clumsy usage would move the "except" part inside the initial clause.

This is similar to statements that include "only" in the wrong place, such as the following:
"I only went fishing with Bill." Is the intent that the sole activity was fishing or was it that the sole companion was Bill? IOW, is it "only fishing" or "only Bill"?
 
  • #27
Baluncore said:
Which must make 2 the oddest prime of all.
Six is odd too. A horse has two legs in back and forelegs in the front, and 2+4=6 legs in total, which is certainly an odd number of legs for a horse.

And four - the Oracle at Delphi warned Croesus about attacking Persia, and forewarned is forearmed. Which is certainly an odd number of arms for a man.

Now, can we split this digression off and move it where it belongs?
 
  • #28
I cannot do the horses.
For the scales question.
Ok, Splitting into 3 lots of 32 I can get to a lower number quicker?
Instead of 48, 24, 12, 6,3. I get 3x 32 and I can dismiss 64 items straight away.
Then a 2x10 and a 12 let's assume it's in the 12.
Then I have 3x 4 then I have just 4.
Three moves to 4 quicker but..
 
  • #29
That's it for the scales, yes.

With the horses you should be able to find the fastest easily enough. At the point you know which one is the fastest, think about which horses could possibly be second fastest.
 
  • Like
Likes pinball1970
  • #30
pinball1970 said:
I cannot do the horses.
For the scales question.
Ok, Splitting into 3 lots of 32 I can get to a lower number quicker?
Instead of 48, 24, 12, 6,3. I get 3x 32 and I can dismiss 64 items straight away.
Then a 2x10 and a 12 let's assume it's in the 12.
Then I have 3x 4 then I have just 4.
Three moves to 4 quicker but..
That's the idea. You can reduce the number by (approx) a factor of 3 each time.

It's then a question of what criteria you use to assess your strategy. The maximum number of measurements can be calculated using powers of three (or the inverse process: log to the base ##3##).

If, instead, we look at the expected (average) number of measurements, then it gets suprisingly tricky. The example I gave above is that for ##18## items, you can do better than measuring ##6## against ##6##. That's because you can find one light item from ##7## statistically faster than one light item from ##6##. To be honest, I didn't spot that until I wrote a Python script to check all the possibilities!
 
  • Informative
Likes pinball1970
  • #31
PeroK said:
That's because you can find one light item from 7 statistically faster than one light item from 6.
Not sure why 6 and 7 needed to be in LaTex, but...

At this point, you have a pile of coins you know not to be the one. So if 7 is faster than 6 (I haeven't checked) you can speed up a 6 by tossing in one of the normal coints into the pile. Now you have 7.
 
  • #32
Vanadium 50 said:
Not sure why 6 and 7 needed to be in LaTex, but...

At this point, you have a pile of coins you know not to be the one. So if 7 is faster than 6 (I haeven't checked) you can speed up a 6 by tossing in one of the normal coints into the pile. Now you have 7.
If there is an additional coin that we know is normal, then the algorithm can be improved for six coins. We put the known normal coin along with two others into one side of the balance. This gives a slighty greater probability (##\frac 5 6##) of the light coin being left out and identified in one measurement.

The expected number of measurements are: ##2##, ##\frac {13} 7## and ##\frac {11} 6## for six, seven and six plus one known normal coin respectively.

This technique, however, does not quite improve the algorithm enough for the case of 18 coins:

Using the basic 6-6 + 6 idea gives an expected ##3## measurements

Using the extra coin when we have 6 gives an expected ##\frac {17} 6## measurements.

But, the 7-7 + 4 technique is still better with an expected ##\frac {50}{18}## measurements.
 
Last edited:
  • #33
I'm looking at the horse one.

For the horse puzzle.

5 x groups of 5, A-E. Group A, horses 1-5, Group B, 6-10 and so on.

All 5 groups race, produce a winner who then race in a winner of winners (WOW) and produce the fastest horse (Group A, horse number 1 say), who is now eliminated.

I then want to put the number twos from the heats against the runners up of the WOW.

Problem I see is if I brute force it that way, I will end up with a lot of races where horses have already raced each other.

I have to go three deep in case heat A had the three fastest.
I would not know that till I put that horse in with WOW runners up.
 
  • #34
pinball1970 said:
5 x groups of 5, A-E. Group A, horses 1-5, Group B, 6-10 and so on.
Let's label it a little differently. In the Race of Winners, label the fastest horse A1, the next fastest B1, and so on. Now look at the prelininaries: label the second fastest horse in the race with A in it A2, the third fastest A3, and so on. Label the second fastest horse in the race with B in it B2, the third fastest B3, and so on.

There is nothing magical about this labeling; it's just convenient.

Can A1 be the 2nd fastest? No...it's the fastest. Can E5?
 
  • #35
I didnt read it or did not take it in. I am trying from scratch, its on the net too (I checcked)
 
  • #36
PeroK said:
That's the idea. You can reduce the number by (approx) a factor of 3 each time.

It's then a question of what criteria you use to assess your strategy. The maximum number of measurements can be calculated using powers of three (or the inverse process: log to the base ##3##).

If, instead, we look at the expected (average) number of measurements, then it gets suprisingly tricky. The example I gave above is that for ##18## items, you can do better than measuring ##6## against ##6##. That's because you can find one light item from ##7## statistically faster than one light item from ##6##. To be honest, I didn't spot that until I wrote a Python script to check all the possibilities!
A quick up date, I am looking at the horse one but I had an idea with this.

from 96 to 3 x32 then 2x10 and 12 followed by 3x4 and I am left with 4.

I then just weight two, I have a one in 4 chance of getting it 4 weighs and definitely can do it in 5.
 
  • #37
pinball1970 said:
Problem I see is if I brute force it that way, I will end up with a lot of races where horses have already raced each other.
Yup. Is there a horse in the WOW race that could be second fastest? If so, which horses do you already know are slower than it?
 
  • #38
pinball1970 said:
I would not know that till I put that horse in with WOW runners up.
@Ibix solved it in post #2. The observation is that the WOW race ranking eliminates additional horses from the the initial races of the horses that did not win the WOW race. You are left with 5 horses for the final 2 slots in the 7th race.
 
  • #39
Vanadium 50 said:
Let's label it a little differently. In the Race of Winners, label the fastest horse A1, the next fastest B1, and so on. Now look at the prelininaries: label the second fastest horse in the race with A in it A2, the third fastest A3, and so on. Label the second fastest horse in the race with B in it B2, the third fastest B3, and so on.

There is nothing magical about this labeling; it's just convenient.

Can A1 be the 2nd fastest? No...it's the fastest. Can E5?

Ibix said:
Yup. Is there a horse in the WOW race that could be second fastest? If so, which horses do you already know are slower than it?

and @Vanadium 50

Only A2 and B1 can be number two. A1 cannot as it is fastest

Assume A2 wins that.

Now only B1, C1 or A3 can be 3rd.
 
  • #40
OK, you;re almost there. You have five lanes, and how many horses after identifying A1 as the fastest?
 
  • #41
Vanadium 50 said:
OK, you;re almost there. You have five lanes, and how many horses after identifying A1 as the fastest?
Damn it!

A2/A3 can 2 or 3, B1 also, C1, B2 or A3 can be 3rd.

One race of A2, A3, B1, B2 and C1 should give final positions.

I think!
 
  • Like
Likes Ibix, Vanadium 50 and PeroK
  • #42
For extra credit, repeat this many times. Is E5 on average slower than A5? If so, how is it possible that the speed of two other horses in the race can possibly influence the speed of these two horses?

For extra extra credit, is B4 on average faster than A5?
 
Last edited:
  • #43
Vanadium 50 said:
For extra credit, repeat this many times. Is E5 on average slower than A5? If so, how is it possible that the speed of two other horses in the race can possibly influence the speed of these two horses?

For extra extra credit, is B4 on average faster than A5?
Do you mean once I have ran the heats and winners of those to get the fastest?
 
  • #44
I mean you run this with 25 million horses. A1 is always fastest, by construction. If you compare the average speed of the million horses that end up in E5 with the million in A5 which is faster?

If one is faster, how is this possible?
 
  • #45
Vanadium 50 said:
I mean you run this with 25 million horses. A1 is always fastest, by construction. If you compare the average speed of the million horses that end up in E5 with the million in A5 which is faster?

If one is faster, how is this possible?
If I understand the question, you can model this by considering the numbers 1-25 put at random into five groups of five. The numbers in each group are then sorted in ascending sequence. We then choose group A as the group with the number 1 in it. A1 is the number 1 and A5 is the highest number in that group. Group B is the group with the next lowest first number. E.g., if the number 2 is not in group A, then group B will be the group with the number 2 in it. Group E is the group with the highest first number.

It's clear than the distribution for E5 has a higher mean than the distribution for A5.

This could be simulated using a Python (or C++) script, or possibly calculated - if it wasn't so late in the day.

PS the lower the number, the faster the horse it represents.
 
  • #46
PeroK said:
It's clear than the distribution for E5 has a higher mean than the distribution for A5.
i.e. a slower horse.

I agree it is true, but why is it clear? How can the relative speeds of two other horses make a difference here?

The direction I am going here is that, while 7 races is the answer, it looks like we have no information on the 4th fastest horse,m or the 11th fastest horse, or the slowest horse, when it fact we do.
 
  • #47
Vanadium 50 said:
i.e. a slower horse.

I agree it is true, but why is it clear?
Note that the 16th fastest horse, for example, could equally be in any group; but, it's more likely to be the slowest horse in group A than the slowest horse in group E. It might even be the fastest horse in group E.

By definition A1 is lower than B1 is lower than C1 etc. Statistically, this ordering must persist for the set A2-E2 etc. It's not possible for E2 to have the same distribution as A2. E2 must be at least the number 6; whereas, A2 could be in the range 2-5. And, so on to the set A5-E5. E5 must be higher on average than A5.

PS as a very rough estimate, I'd say that E1 has an average value of 10-12. I'll do a simulation tomorrow if I get the chance.
 
  • #48
I’ll be the idiot.
After the first five races, there is equal probability (1/5) that that slowest horse is A5,B5,C5, D5 or E5.
The sixth and seventh race only involve horses with zero probability of being the slowest. Given that we have learned nothing new about the slowest, the probabilities do not change.
What am I doing wrong?
 
  • #49
Frabjous said:
After the first five races, there is equal probability (1/5) that that slowest horse is A5,B5,C5, D5 or E5.
You can't label them A-E at this point, because that labelling is only available after the sixth race. At that point, though A5 can be faster than all of B1-E1, but E5 must be slower than all of them. So the expectation speed of A5 seems likely to be higher than that of E5.
 
  • #50
Ibix said:
You can't label them A-E at this point, because that labelling is only available after the sixth race. At that point, though A5 can be faster than all of B1-E1, but E5 must be slower than all of them. So the expectation speed of A5 seems likely to be higher than that of E5.
Got it. Thanks.
 

Similar threads

2
Replies
83
Views
21K
2
Replies
67
Views
14K
Replies
3
Views
3K
Replies
1
Views
4K
Replies
6
Views
5K
Back
Top