A book titled The Bible Code introduced the topic of equidistant letter sequences (ELS), described below, for finding words “hidden” in text. That book dealt with the Hebrew Bible, but it prompts a question about finding any given word in any text, say an English-language one.
For simplicity, and to better match the Hebrew, spaces and punctuation are removed. A particular text that I have in mind, thus crunched, has 284,939 characters remaining (letters and digits). How many times would you expect to find the word FLOOBLE as an equidistant letter sequence in the text? Ignore case. The word can start at any of the 284,939 characters and proceed by skipping any constant number of letters forward or backward. So, for example, if the 11,000th character were an F, the 10,000th an L, the 9,000th an O, and so on, that would be one occurrence. Of course we don’t expect always to find such decimally round spacings. The question again: how many do we expect to find?
The absolute and relative frequencies of the relevant letters in the text are:
B 4771 0.016744
E 36232 0.127157
F 7167 0.025153
L 9563 0.033562
O 22486 0.078915
That is, each row shows a letter, its number of occurrences in the text, and that number divided by the total number of characters in the text.
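To make the counting rule concrete, here is a minimal Python sketch of a brute-force ELS counter. The function name count_els is made up for illustration, and the loop is quadratic in the text length, so it is only meant for trying the rule on short samples, not on the full 284,939-character text.

```python
def count_els(text, word):
    """Count occurrences of `word` as an equidistant letter sequence in `text`.

    Hypothetical helper for illustration; assumes len(word) >= 2.
    """
    # Keep only letters and digits and ignore case, as the puzzle specifies.
    s = "".join(ch for ch in text.upper() if ch.isalnum())
    w = word.upper()
    n, k = len(s), len(w)
    # Largest step for which all k letters can still fit inside the text.
    max_step = (n - 1) // (k - 1)
    count = 0
    for start in range(n):
        if s[start] != w[0]:
            continue
        for step in range(1, max_step + 1):
            for d in (step, -step):  # forward and backward
                last = start + (k - 1) * d
                if 0 <= last < n and all(
                    s[start + j * d] == w[j] for j in range(k)
                ):
                    count += 1
    return count


print(count_els("FxLxOxOxBxLxE", "flooble"))
```

Each distinct (start, skip) pair that spells the word counts as one occurrence, so the same word read forward from one place and backward from another is counted twice.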
As Charlie has stated, the large number of "trials" involved in this situation allows us to assume that the trials are independent. Consider that to calculate this probability without the assumption of independence would require compiling all possible 284,939-character text strings that contain the correct frequencies of the relevant letters, counting the occurrences within each, summing these values and then dividing by the number of such strings. The time required for a solution to this problem is probably prohibitive in this forum.
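Under that independence assumption, the calculation can be sketched in a few lines of Python. This is only one reading of the problem: it treats every character as an independent draw with the frequencies listed in the table, counts every valid (start, skip) placement once in each direction, and multiplies the two. The names expected_els_count, COUNTS and N are made up for the sketch.

```python
from math import prod

N = 284_939  # characters in the crunched text
COUNTS = {"B": 4771, "E": 36232, "F": 7167, "L": 9563, "O": 22486}


def expected_els_count(word, counts, n):
    """Return (expected occurrences, number of placements) for `word`.

    Assumes independent characters with relative frequency counts[ch] / n
    and a word of at least two letters.
    """
    k = len(word)
    # Chance that k independently drawn characters spell the word.
    p = prod(counts[ch] / n for ch in word.upper())
    # For a forward step d there are n - (k - 1) * d valid starting points;
    # backward steps give the same number again, hence the factor 2.
    max_step = (n - 1) // (k - 1)
    placements = 2 * sum(n - (k - 1) * d for d in range(1, max_step + 1))
    return placements * p, placements


expected, placements = expected_els_count("FLOOBLE", COUNTS, N)
print(placements, expected)
```

With these numbers there are 13,531,420,682 placements, each with a chance of roughly 3.76e-10 of spelling FLOOBLE, which gives an expected count of a little over 5.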
Now, as a side note, I'd like to thank Charlie for having the same problem grasping the meaning of the mathematical term "expected" as I do. I maintain that you can't expect a discrete event to happen a non-discrete number of times (i.e. you'd not expect 5.08 occurrences of "flooble").
P.S. am I spelling probibility correctly?