Modern Infinite Monkey's standard deviation

Discussion Overview

The discussion revolves around a hypothetical scenario involving a mechanical chimpanzee typing randomly to reproduce the text of the King James Bible. Participants explore the mathematical implications of this scenario, particularly focusing on the probability of duplication and the standard deviation of the number of key presses required to achieve this.

Discussion Character

  • Exploratory
  • Mathematical reasoning
  • Debate/contested

Main Points Raised

  • One participant gives the per-key probability as 1/256 ≈ 0.004 and states the average number of key presses needed to reproduce the King James Bible as (0.004)^4,245,026 = 5.6 X 10^36, a figure that later replies question.
  • Another participant questions the notation used in the probability expression, suggesting a potential typo or misunderstanding regarding exponentiation.
  • A different participant proposes that the probability distribution for the first occurrence of the King James Bible closely resembles a geometric distribution, with both mean and standard deviation equal to 1/p, where p is a very small probability.
  • Concerns are raised about the assumptions made regarding the probability of achieving the text, particularly regarding the impossibility of having fewer than 4,245,026 characters to form the Bible.
  • One participant inquires whether the relative error function derived from the probability applies to shorter strings, such as "banana".
  • A later reply emphasizes the need for additional assumptions to fully establish the premise of the discussion.

Areas of Agreement / Disagreement

Participants express differing views on the interpretation of the probability calculations and the implications of the assumptions made. There is no consensus on the standard deviation or the applicability of the derived error function to different string lengths.

Contextual Notes

Limitations include the dependence on specific assumptions about the typing process and the nature of the probability distribution, which remain unresolved.

Mr Peanut
Given an imaginary, mechanical chimpanzee that never wears out, sleeps, or relents, and who is compelled to type constantly through eternity into an endless text file using a US standard keyboard... as fast as he can. This monkey knows simply that he must hit only character keys and only one at a time. He has the additional random option of holding down dead keys (Shift key & Alt key). He has no other faculties. He is purely random in nature. His access to each of the character keys is exactly equal, except that striking the Windows key, menu key, Esc key, and Control key is infinitely unlikely.
Given that he is limited by the design specifications of the best keyboard... made to support a typist that is twice as fast as the world record holder for typing (37,500 keys/50 min = 12.5 keys/sec). This design limit is 25 characters per second.
Given that he confines his speed to exactly the design limit and types 25 characters per second.
Given that Project Gutenberg's flat text file for the King James Bible (GPKJB) has 4,245,026 characters in a specific sequence in the file.

then

1) The keyboard allows any one of 256 characters to be typed per key press (ASCII 0 – 255). The probability of hitting a given key is 1/256 ≈ 0.004

2) The average number of keys that must be struck before he duplicates the file is:
(0.004)^4,245,026 = 5.6 X 10^36

What's the standard deviation of this average?
 
Mr Peanut said:
(0.004)^4,245,026 = 5.6 X 10^36

Typo? Or does the a^b not mean a to the power of b in this case?
 
To the power of.


Some stuff in the premises will seem superfluous, for example the typing rate. The reason it's there is that the next part of the idea is to determine expectations about how long it should take, and then how long it will take multiple monkeys.

Thanks
 
(0.004)^4245026 is about $10^{-10^7}$

I think the probability distribution for the first occurrence of the King James Bible is very close
to a geometric distribution P(X=k) = p (1-p)^(k-1) for k >= 1 with

p = $10^{-10^7}$

The mean is 1/p and the standard deviation is sqrt(1-p)/p, so for a p this small both are essentially equal to 1/p.
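For scale, these magnitudes can be checked in log space, since p itself underflows any ordinary floating-point type. The Python sketch below assumes exactly 256 equally likely characters per key press, the 4,245,026-character length and the 25 keys/sec rate from the original post; the function names and the 365.25-day year are illustrative choices, not something stated in the thread.

```python
import math

ALPHABET_SIZE = 256            # premise of the original post
TEXT_LENGTH = 4_245_026        # character count quoted for the GPKJB file
KEYS_PER_SECOND = 25           # typing rate from the original post
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: Julian year

def log10_match_probability(alphabet_size, length):
    """log10 of p = (1/alphabet_size)**length, computed without underflow."""
    return -length * math.log10(alphabet_size)

def geometric_mean_sd(p):
    """Mean and standard deviation of a geometric distribution on k = 1, 2, ..."""
    return 1.0 / p, math.sqrt(1.0 - p) / p

log10_p = log10_match_probability(ALPHABET_SIZE, TEXT_LENGTH)
log10_mean_keys = -log10_p     # mean of the geometric model is 1/p
log10_mean_years = log10_mean_keys - math.log10(KEYS_PER_SECOND * SECONDS_PER_YEAR)

print(f"log10(p)              = {log10_p:.1f}")
print(f"log10(mean key count) = {log10_mean_keys:.1f}")
print(f"log10(mean years)     = {log10_mean_years:.1f}")

# The sd formula on a p large enough to represent: a single-character target.
toy_mean, toy_sd = geometric_mean_sd(1 / ALPHABET_SIZE)
print(f"one-character target: mean = {toy_mean}, sd = {toy_sd:.4f}")
```

Because sd/mean = sqrt(1-p), the standard deviation falls short of the mean by a relative amount of roughly p/2, which is far too small to matter at this scale.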

This is only approximate because:

P(X<4,245,026) is 0, because you can't have a Bible if you do not have at least 4,245,026 characters.
We can ignore this because 4,245,026 * p is so small compared to 1/p.

If the last 4,245,026 characters were the King James Bible, the next character couldn't be the end of the King James Bible (unless the Bible is aaaaa... ...aaaaaaa). So if the last 4,245,026 characters weren't the King James Bible, the probability that it's finished with the next character is slightly higher. This is also unimportant, because it increases the probability that the last character finishes the KJB from p to 1/((1/p)-256), which is about p + 256 p^2
(256 outcomes for the last 4,245,026 characters are no longer possible because the KJB didn't finish 1 character ago).
This can also be ignored because 256p^2 is so small compared to p.
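A rough numerical check of how small these two corrections are, again in log space. This is only a sketch under the same assumptions as the original post (256 equally likely characters, 4,245,026 characters of text); `corrected_probability` and the toy value of p are hypothetical names and numbers chosen so the algebra can be verified with exact fractions.

```python
import math
from fractions import Fraction

ALPHABET_SIZE = 256
TEXT_LENGTH = 4_245_026
log10_p = -TEXT_LENGTH * math.log10(ALPHABET_SIZE)

# First correction: the geometric model puts roughly TEXT_LENGTH * p of its
# probability on key counts that are too short to contain the text at all.
log10_startup_mass = math.log10(TEXT_LENGTH) + log10_p

# Second correction: relative change (256 * p**2) / p = 256 * p.
log10_relative_overlap = math.log10(ALPHABET_SIZE) + log10_p

print(f"log10(startup mass)     = {log10_startup_mass:.1f}")
print(f"log10(relative overlap) = {log10_relative_overlap:.1f}")

def corrected_probability(p, alphabet_size):
    """The adjusted per-step probability 1/((1/p) - alphabet_size)."""
    return 1 / (1 / p - alphabet_size)

# Exact check of 1/((1/p) - 256) ~ p + 256 p^2 on a toy p (a 3-character target).
toy_p = Fraction(1, ALPHABET_SIZE ** 3)
exact = corrected_probability(toy_p, ALPHABET_SIZE)
approx = toy_p + ALPHABET_SIZE * toy_p ** 2
relative_gap = (exact - approx) / toy_p
print(f"relative gap on toy p   = {float(relative_gap):.3e}")
```

The leftover gap is of order (256p)^2 relative to p, so the first-order form p + 256 p^2 is already an excellent approximation even for a three-character target.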
 
Thanks,

So, it looks like relative error = (256p^2)/p = 256p

Does this error function have this form for any size character string I try to match? That is, does it apply to a short string like " banana "?
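One way to explore the short-string case is the standard overlap (autocorrelation) formula for pattern waiting times, which gives the exact mean and variance of the number of key presses when every character is independent and uniform over N symbols. The sketch below is not taken from the thread: the function names are illustrative, the target is taken as "banana" without surrounding spaces, and N = 256 is carried over from the original post.

```python
def overlap_lengths(pattern):
    """Lengths k (1..len) for which the first k characters equal the last k."""
    m = len(pattern)
    return [k for k in range(1, m + 1) if pattern[:k] == pattern[m - k:]]

def waiting_time_mean_var(pattern, alphabet_size):
    """Exact mean and variance of the key presses until `pattern` first appears.

    Assumes independent, uniformly random characters.
    mean = sum of N**k over overlap lengths k
    var  = mean**2 - sum of (2k - 1) * N**k over the same k
    """
    ks = overlap_lengths(pattern)
    mean = sum(alphabet_size ** k for k in ks)
    var = mean ** 2 - sum((2 * k - 1) * alphabet_size ** k for k in ks)
    return mean, var

mean, var = waiting_time_mean_var("banana", 256)
geometric_var = mean ** 2 - mean      # (1 - p) / p**2 with p = 1 / mean

print("overlap lengths:", overlap_lengths("banana"))
print("mean key presses:", mean)
print("relative shortfall of variance vs. mean**2:", (mean ** 2 - var) / mean ** 2)
print("same shortfall under the geometric model: ", (mean ** 2 - geometric_var) / mean ** 2)
```

For a string with no self-overlap, such as "banana", the mean is exactly 1/p and the standard deviation differs from 1/p only by a relative amount of order (string length) * p, so under these assumptions the geometric picture stays a good approximation whenever p is small. Strings that do overlap themselves (for example "aa") pick up extra terms in both the mean and the variance.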
 
You'd need a lot more "givens" than just that if you want to provide the entire premise.
 
