RPinPA said:
OK, here you go. I did this manually but I'm pretty sure I got it right.
Hi RP:
I did a spot check of your numbers for the values for
f(k) = the probability a randomly selected word will have k letters,
and I found no errors. However, the chart has a value of "0" for f(1). Technically this is correct, but it gives the wrong answer to the following question.
What is the probability p(k) of randomly picking a single letter (k=1) which will be a word of length one?
There are two words of length 1, ("a" and "i") making
p(1) = 7.7%
rather than p(1) = 0.
Except for k=1, I calculated
p(k) = f(k) × N / 26^k,
where N = 228132, the assumed number of English words. I obtained the following values:
p(2) = 33.7%
p(3) = 7.8%
p(4) = 1.3%
p(k) [k = 5...22] = 0.0%.
The average value A = Avg( p(k) [k=1...22] ) = 2.3%.
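As a rough check of the arithmetic above, here is a minimal Python sketch. It takes the number of possible k-letter strings to be 26^k, and it averages the rounded percentages quoted above over k = 1...22. The function and variable names are my own, chosen for illustration.

```python
def word_hit_probability(num_words_of_length_k, k):
    """Fraction of all 26**k letter strings of length k that are words."""
    return num_words_of_length_k / 26 ** k

# k = 1: the two one-letter words "a" and "i"
p1 = word_hit_probability(2, 1)

# Rounded percentages quoted in the post; p(k) for k = 5...22 is taken as 0.0
quoted_percent = {1: 7.7, 2: 33.7, 3: 7.8, 4: 1.3}
average_percent = sum(quoted_percent.get(k, 0.0) for k in range(1, 23)) / 22

print(f"p(1) = {p1:.1%}")
print(f"A = {average_percent:.1f}%")
```

Because the inputs are already rounded to one decimal place, the average is only good to about that precision.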
This method of calculating A assumes one of the many possible ways to randomly choose a string of letters.
First roll a 20-sided (icosahedral) die numbered 1...20 to select a value of k in [1...20]. Then randomly choose a sequence of k letters.
I am aware that this is a completely arbitrary way of randomly choosing a string of letters. However, the method described in post #2 is also arbitrary. The point I am trying to make is that any method of choosing a random string is arbitrary. There is no single "correct" answer to the question,
What is the mathematically correct way to choose a random string of letters?
This reminds me of a practical problem my wife needed to solve: choosing an accounting method for calculating the asset value of our house. The brilliant answer she came up with is:
When choosing a method of accounting, choose the one that makes you happy.
The only mathematical principle that applies to the random choice of a letter sequence is that two separate steps are needed.
1. Randomly choose a value k for the number of letters.
2. Randomly choose a string of k letters.
Step 2 is mathematically easy. Step 1 is arbitrary. As an example, I throw into the mix one more (arbitrary) way to do step 1.
Choose a random word from a dictionary. (Choosing a dictionary is also arbitrary.) Choose as the value of k the length of the chosen word.
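The two-step procedure, with both of the step 1 variants mentioned in this post, could be sketched in Python roughly as follows. This is only an illustration: the function names are mine, the letters in step 2 are assumed to be uniform and independent over a-z, and toy_words is a made-up stand-in for whatever dictionary one (arbitrarily) picks.

```python
import random
import string

def random_string(k, rng=random):
    """Step 2: choose k letters uniformly and independently from a-z."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(k))

def length_from_die(sides=20, rng=random):
    """Step 1, die version: k is uniform on 1..sides."""
    return rng.randint(1, sides)

def length_from_dictionary(words, rng=random):
    """Step 1, dictionary version: k is the length of a uniformly chosen word."""
    return len(rng.choice(words))

# Hypothetical stand-in for a real dictionary
toy_words = ["a", "i", "at", "cat", "word", "letter"]

s1 = random_string(length_from_die())
s2 = random_string(length_from_dictionary(toy_words))
print(s1, s2)
```

Swapping one step 1 function for the other changes the distribution of k, and with it the chance that the result is a word, while step 2 stays the same.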
Regards,
Buzz