The hard way:
The idea here is to calculate the probability that all the digits will have been achieved at a after a given number of digit choices. If the probability is, say, p1 after 20 digits have been chosen and p2 after 21 digits have been chosen, then the probability that it takes exactly 21 chosen digits to complete the set is p2-p1.
This then breaks down into finding the probability, for various n, that by that time all 10 digits will have been chosen at least once at some time. This can be done by the inclusion/exclusion formula. In one of its simpler forms, for just three events, p(a and b and c) = p(a)+p(b)+p(c) - p(a or b) - p(b or c) - p(a or c) + p(a or b or c). You get the idea: the sum of all the individual probabilities minus the probabilities of all the pairs of events plus the probabilities of all the triplets, and continuing in this manner if there are more than three events. In this case we have ten events: getting each digit at least once.
When these probabilities are found, and then successive ones subtracted to get the probabilities of completion exactly at individual counts, each of those individual probabilities are multiplied by the respective count, and when the products are added together, you get the expected value.
A trial shows that the overall probability comes close enough to 1 when n = 330 to round to 1.000000000000000 to the implied accuracy of those zeros. The following program implements the calculation. It uses the fact that all the digits have the same probability of being chosen to enable multiplication by a combination of 10 things taken n at a time to take the place of using individual terms in the summations.
DECLARE FUNCTION combi# (n#, r#)
DECLARE FUNCTION fact# (x#)
DECLARE FUNCTION probAnyOf# (trials#, particip#)
DEFDBL A-Z
CLS
exVal = 0
FOR tr = 1 TO 333
tot = 0
FOR n = 1 TO 10
tot = tot - (-1) ^ n * probAnyOf(tr, n) * combi(10, n)
NEXT
IF tr > 9 AND tr < 55 THEN PRINT USING "## #.######"; tr; tot - prev
exVal = exVal + tr * (tot - prev)
prev = tot
NEXT tr
PRINT USING "###.######"; exVal
FUNCTION combi (n, r)
combi = fact(n) / (fact(r) * fact(n - r))
END FUNCTION
FUNCTION fact (x)
f = 1
FOR i = 2 TO x
f = f * i
NEXT
fact = f
END FUNCTION
FUNCTION probAnyOf (trials, particip)
'prob of any of the #of participants out of the 10 given #of trials
' equals 1 minus prob of none
probNone = ((10 - particip) / 10) ^ trials
probAnyOf = 1 - probNone
END FUNCTION
The probabilities of requiring exactly n choices to achieve all the ten digits are shown for n=10 through 54:
n p
10 0.000363
11 0.001633
12 0.004191
13 0.008083
14 0.013046
15 0.018634
16 0.024360
17 0.029785
18 0.034578
19 0.038529
20 0.041536
21 0.043587
22 0.044733
23 0.045068
24 0.044707
25 0.043772
26 0.042382
27 0.040648
28 0.038670
29 0.036531
30 0.034303
31 0.032043
32 0.029796
33 0.027597
34 0.025473
35 0.023442
36 0.021516
37 0.019702
38 0.018005
39 0.016425
40 0.014960
41 0.013607
42 0.012362
43 0.011218
44 0.010171
45 0.009213
46 0.008340
47 0.007544
48 0.006821
49 0.006163
50 0.005567
51 0.005026
52 0.004536
53 0.004093
54 0.003692
The resulting expected value, using all n up to 333, is 29.289683, which would be the answer.
The easy way:
We expect to get one of the digits on just one trial.
After getting that one digit, we need the remaining nine digits. Each trial has 9/10 probability of getting a new digit (other than the first chosen). The expected length of time to get the second digit is 10/9.
Likewise the expected number of selections to achieve the third digit is 10/8.
In all, getting all ten digits would then be expected to take
10/10 + 10/9 + 10/8 + 10/7 + 10/6 + 10/5 + 10/4 + 10/3 + 10/2 + 10/1
which equals 7381/252 = 29.28968253968253968253....
|
Posted by Charlie
on 2012-07-06 13:22:55 |