I am happy to report that I finally completed the task I undertook in February and announced in post #6,496.
Here is a summary of what I did, what I observed and what I concluded.
What I did
- Devise an initial filter consisting of the same 4 five-letter words every time and use it to separate the list of 12,792 five-letter words into subgroups.
- Assume that all words have equal probability of being the target, i.e. ignore "previously used" words.
- After this initial separation, use additional filters to reduce each subgroup into sub-subgroups until every subgroup is reduced to 1 word.
- Assign a score equal to the total number of filter words used to determine the target with 100% certainty. The one exception was the 50-50 subgroup: if only 2 words remained in a subgroup after N filters, the score is equally likely to be N+1 or N+2, so I assigned the in-between score of N+1.5.
- I used CARED BEWIG LUMPY KNOTS, in that order, for the initial filter. It was applied sequentially to the entire list. The only exceptions were the filter words themselves. To BEWIG, LUMPY and KNOTS I assigned scores of 2, 3 and 4, respectively, without further ado. In the case of CARED, I considered the 12 members of subgroup _ARED and found paths to all the members with CARED as the initial guess. Their scores varied from 3 to 5.5, except for CARED itself, to which I assigned a score of 1, the only such score in the entire list.
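The procedure above can be sketched in a few lines of Python. This is only an illustration, not the bookkeeping actually used for the survey: it assumes the standard green/yellow/gray feedback rules, including the usual handling of repeated letters, and the names `feedback`, `split` and `score` are hypothetical. The special scores given to the filter words themselves are not modeled.

```python
from collections import Counter, defaultdict

FILTER = ("CARED", "BEWIG", "LUMPY", "KNOTS")  # filter words from the post

def feedback(guess, target):
    """Feedback string: G = right spot, Y = wrong spot, '.' = absent."""
    marks = ["."] * len(guess)
    # Target letters that were not matched exactly; only these can earn a Y.
    spare = Counter(t for g, t in zip(guess, target) if g != t)
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            marks[i] = "G"
    for i, g in enumerate(guess):
        if marks[i] == "." and spare[g] > 0:
            marks[i] = "Y"
            spare[g] -= 1
    return "".join(marks)

def split(words, filters=FILTER):
    """Group words by the combined feedback they give to all filter words."""
    groups = defaultdict(list)
    for w in words:
        groups[tuple(feedback(f, w) for f in filters)].append(w)
    return groups

def score(filters_used, group_size):
    """Score of a subgroup of 1 or 2 words reached after `filters_used` filters."""
    if group_size == 1:
        return filters_used + 1    # one identity guess finishes it
    if group_size == 2:
        return filters_used + 1.5  # 50-50 between N+1 and N+2
    raise ValueError("subgroup still needs another filter")
```

Calling `split` on the full word list gives the subgroups; any subgroup with more than 2 words is then passed through another filter word chosen by hand.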
What I found out
The bottom line is that, using the 4-word filter presented above, paths to all but two of the 12,792 words are guaranteed with no more than 7 filter words. For the remaining two, finishing within 7 is a 50-50 proposition. That's because there is one subgroup which is highly degenerate and well-populated. It is the _ E _ TS subgroup,
FESTS FETTS HEFTS HESTS JESTS SETTS SEXTS TESTS TEXTS VESTS ZESTS
Using FISHY as filter 5 separates this subgroup as follows
one-word {FESTS} {FETTS} {HEFTS} {HESTS}
three-word {SETTS SEXTS TEXTS}
four-word {JESTS TESTS VESTS ZESTS}
The one-word subgroups are solved with one additional identity filter 6 for a score of 6.
The three-word subgroup is solved with filter 6 TEXTS for a score of 6 or 7.
There is no 5-letter word that, used as filter 6, separates all four words of the four-word subgroup so that each gets a score of 7. The best that one can do is to use a word that contains V and Z, e.g. VIZIR, as filter 6, which identifies VESTS or ZESTS for a score of 7. This leaves the two-word subgroup {JESTS TESTS} with one guess left. A solution with that one additional guess is possible if one is lucky, but it is not guaranteed. Thus, the two words are assigned a score of 7.5 each. I should note that these are the only words that do not have a guaranteed solution with no more than 7 guesses.
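The split of the _ E _ TS subgroup can be checked with a short Python sketch. It assumes the standard feedback rules with the usual treatment of repeated letters; `feedback` and `partition` are hypothetical helper names, not anything used in the original survey.

```python
from collections import Counter, defaultdict

def feedback(guess, target):
    """Feedback string: G = right spot, Y = wrong spot, '.' = absent."""
    marks = ["."] * len(guess)
    # Target letters that were not matched exactly; only these can earn a Y.
    spare = Counter(t for g, t in zip(guess, target) if g != t)
    for i, (g, t) in enumerate(zip(guess, target)):
        if g == t:
            marks[i] = "G"
    for i, g in enumerate(guess):
        if marks[i] == "." and spare[g] > 0:
            marks[i] = "Y"
            spare[g] -= 1
    return "".join(marks)

def partition(guess, words):
    """Split `words` by their feedback to `guess`, smallest groups first."""
    groups = defaultdict(list)
    for w in words:
        groups[feedback(guess, w)].append(w)
    return sorted(groups.values(), key=lambda g: (len(g), g))

E_TS = "FESTS FETTS HEFTS HESTS JESTS SETTS SEXTS TESTS TEXTS VESTS ZESTS".split()

after_fishy = partition("FISHY", E_TS)           # filter 5
after_texts = partition("TEXTS", after_fishy[4]) # filter 6 on the three-word group
after_vizir = partition("VIZIR", after_fishy[5]) # filter 6 on the four-word group

for group in after_fishy:
    print(group)
print(after_vizir)  # [['VESTS'], ['ZESTS'], ['JESTS', 'TESTS']]
```

TEXTS separates the three-word subgroup completely, while VIZIR leaves JESTS and TESTS together because both return all-gray feedback.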
Summary
The table on the right summarizes the results from this survey. It shows that the brute-force use of this 4-word filter is sufficient to solve more than 70% of the 12,792 words with one additional guess, the identity filter. The scores below 5 belong exclusively to the filter words themselves and the _ARED group, as explained above.
An example of a word solved with one additional guess is the last word on the list, ZYMIC, which has a score of 5. Who would have thought it? However, a closer look shows that for this word
- ABDEGKLNOPRSTUW are excluded
- M and I appear in the correct position
- C is in but not in position 1 and Y is in but not in position 5
which is sufficient to produce a one-word subgroup. In fact, all words that start with Z are 5-pointers.
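The letter bookkeeping for ZYMIC is easy to reproduce. The sketch below is a simplification that works here because ZYMIC has no repeated letters and the only letter repeated in the filter (E) is not in it; the variable names are illustrative only.

```python
FILTER = ("CARED", "BEWIG", "LUMPY", "KNOTS")
TARGET = "ZYMIC"

excluded = set()   # filter letters that are not in the target
placed = {}        # position (1-based) -> letter in the correct position
misplaced = {}     # letter -> positions where it was tried and is not located

for word in FILTER:
    for pos, letter in enumerate(word, start=1):
        if letter not in TARGET:
            excluded.add(letter)
        elif TARGET[pos - 1] == letter:
            placed[pos] = letter
        else:
            misplaced.setdefault(letter, set()).add(pos)

print("".join(sorted(excluded)))   # ABDEGKLNOPRSTUW
print(sorted(placed.items()))      # [(3, 'M'), (4, 'I')]
print(sorted(misplaced.items()))   # [('C', {1}), ('Y', {5})]
```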
Afterthoughts
I spent considerable time on the initial filter design, trying to maximize hits while using each letter only once. Eventually, though, I thought that it would be a good idea to use E twice, in positions 2 and 4, and sacrifice H. Is there a better filter than this out there? I don't know, but at least I have provided a table that can be used as a benchmark in case someone else gets the same idea and has the time.