Suppose there are 20 problems on the site the Sunday evening a user looks at the site the first time. She doesn't read the newest problem of the day, but instead ONLY reads problems that come up on the "Random Problem" page. She reads 5 random problems each day, always in the evening. The only problem is she can see the same problem more than once.
The other problem is problems continue to come in from the infinite queue. Two per week day and one per weekend.
What is the probability she will have read all the problems after her next Sunday evening check? What is the probability she will have read all the problems after the Sunday evening after that?
Note: The problems displayed when you click on "Random Problem" are independent of each other. There isn't anything to make sure that you are getting five different problems if you click on random problem five times.
This problem presents a Markov chain of probabilities that the user will have x unseen problems at any given stage of the twoweek period. At each time there is a certain probability there are no problems left unseen, a certain probability that 1 is still unseen, etc. Call these probabilities u(0), u(1), etc. They change as time progresses.
After that first Sunday's problem posting, there are 20 problems on the site, all of them unseen by this user. So the probability that 20 are unseeni.e., u(20), at this stage, is 1 and all the other probabilities are zero.
Each time the user looks at a random problem, the probability that the number of unseen problems goes down by 1 is just the number of unseen problems divided by the total number of problems on the site. So, if n is the number of problems on the site at a given stage, the new value for each u(i) is equal to the old value of u(i) times (1u(i))/n plus the old value of u(i+1) times u(i+1)/n, as each old probability has conditional probabilities of u(i)/n of going one lower, or (1u(i))/n of remaining at the same value.
Of course every time a puzzle is posted, it is unseen by the user, so each probability is advanced one position in the array, that is each new u(i+1) is equal to the old u(i).
When two problems are posted, it's perfectly OK to transition the new u(i+2) = the old u(i) for all i. However, you can't do that when transitioning for the user's seeing new problems, as the transition probabilities change, and each transition has to be done individually, five times per evening.
The program shown below computes this Markov chain for the 15 days asked for, and shows the reciprocal of the probability u(0) (i.e., the probability that zero remain unseen), at the end of days 8 and 15.
The results are:
15 16 17 18 20
1 .581 .363 .053 .002
12 13 14 15 16 17 22
2 .084 .298 .365 .197 .050 .006
9 10 11 12 13 14 15 16 17 24
3 .003 .029 .122 .257 .298 .197 .076 .017 .002
8 9 10 11 12 13 14 15 16 17 26
4 .004 .027 .094 .199 .266 .228 .126 .045 .010 .001
7 8 9 10 11 12 13 14 15 16 17 28
5 .003 .015 .054 .133 .218 .244 .188 .099 .036 .009 .001
6 7 8 9 10 11 12 13 14 15 16 17 30
6 .001 .005 .023 .070 .146 .217 .228 .171 .092 .035 .009 .002
5 6 7 8 9 10 11 12 13 14 15 16 31
7 .001 .006 .024 .068 .140 .207 .222 .175 .100 .042 .013 .003
4 5 6 7 8 9 10 11 12 13 14 15 16 32
8 .001 .004 .018 .055 .119 .189 .220 .189 .121 .057 .020 .005 .001
563860447.1492637
5 6 7 8 9 10 11 12 13 14 15 16 34
9 .003 .013 .041 .097 .167 .212 .202 .144 .077 .031 .009 .002
5 6 7 8 9 10 11 12 13 14 15 16 17 36
10 .002 .008 .029 .075 .141 .198 .209 .167 .102 .047 .017 .004 .001
5 6 7 8 9 10 11 12 13 14 15 16 17 38
11 .001 .005 .020 .055 .113 .176 .207 .187 .129 .068 .028 .009 .002
5 6 7 8 9 10 11 12 13 14 15 16 17 18 40
12 .001 .003 .012 .038 .087 .149 .196 .198 .155 .094 .044 .016 .005 .001
6 7 8 9 10 11 12 13 14 15 16 17 18 42
13 .002 .008 .025 .063 .120 .176 .200 .177 .122 .066 .028 .009 .002
5 6 7 8 9 10 11 12 13 14 15 16 17 18 43
14 .001 .004 .014 .041 .088 .147 .190 .193 .153 .096 .047 .019 .006 .001
5 6 7 8 9 10 11 12 13 14 15 16 17 18 44
15 .001 .006 .022 .056 .110 .166 .195 .181 .132 .077 .035 .013 .004 .001
1291015550.712306

The numbers in the left column are the day numbers. The numbers on the right are the total number of problems on the site at the end of that day. The top row of numbers in the middle for each day show the values of x for which u(x)>.0005 for that day, and just below that is the corresponding u(x).
We see that the probability the user would have seen all the problems (non left unseen) on day 8 would be 1/563,860,447. At the end of day 15 it would be 1/1,291,015,551.
Note that on the first day, the most likely unseen amount is 15, but the drop from day to day slows down quite soon.
The program is:
CLS
DEFDBL AZ
maxSub = 100
DIM unseen(maxSub), newRow(maxSub)
'unseen contains the prob that that many (its subscript) are still unseen
totProbs = 19
unseen(19) = 1
FOR day = 1 TO 15
IF day MOD 7 = 1 OR day MOD 7 = 0 THEN
totProbs = totProbs + 1
shift = 1
ELSE
totProbs = totProbs + 2
shift = 2
END IF
FOR i = maxSub TO 0 STEP 1
IF i >= shift THEN
unseen(i) = unseen(i  shift)
ELSE
unseen(i) = 0
END IF
NEXT
FOR rProb = 1 TO 5
REDIM newRow(maxSub)
FOR i = 0 TO totProbs
pNew = i / totProbs
IF i > 0 THEN newRow(i  1) = newRow(i  1) + pNew * unseen(i)
newRow(i) = newRow(i) + (1  pNew) * unseen(i)
NEXT
FOR i = 0 TO totProbs
unseen(i) = newRow(i)
NEXT
NEXT
FOR i = 0 TO totProbs
IF unseen(i) >= .0005 THEN
PRINT USING " ##"; i;
END IF
NEXT
PRINT TAB(75); totProbs
PRINT USING "##"; day;
FOR i = 0 TO totProbs
IF unseen(i) >= .0005 THEN
PRINT USING " .###"; unseen(i);
END IF
NEXT
PRINT
IF day MOD 7 = 1 AND day > 1 THEN
PRINT 1 / unseen(0)
END IF
PRINT
NEXT

Posted by Charlie
on 20040209 12:28:05 