In the computer solitaire game "Freecell", statistics are kept for the largest winning streak, largest losing streak, and the percentage of games won.
Being mildly addicted to the game, I was not surprised to see that I had won 60 consecutive games while only losing 3 consecutive games.
Two questions based on this information:
1) What is my expected percentage of games won?
2) How many more games will I have to play before I have an even chance of a winning streak of 100 games?
There are a couple of problems trying to figure this out. We need to know the expected length of a run of successes or failures based on the probability of a win in each trial and the number of games played; and then, in order to get the expected value of the probability, given the lengths of the runs, we'd need to do a Bayesian analysis, going from the observed run lengths back to the probability of a win in a given game.
But if we seek only an approximation, we can look for the modal value of the individual probability of a win and number of games played that will lead to expected run lengths that match those observed. This is a modal value in that it is the one that gives the most likelihood of the observation, but it is not the average of the weighted probabilities that produce the observed outcome.
The following program simulates multiple trials, where ssize represents the number of games played and p is the probability of a win in an individual game. The values for these were varied until results matching the observations were found.
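p = .9056        ' probability of winning a single game
ssize = 2355     ' number of games played in each trial
RANDOMIZE TIMER
ctmTot = 0: fctmTot = 0: tct = 0
FOR trial = 1 TO 500
  ' hct/fct: current win/loss streak; ctmax/fctmax: longest so far in this trial
  hct = 0: ctmax = 0: fct = 0: fctmax = 0
  FOR i = 1 TO ssize
    r = RND(1)
    IF r < p THEN
      hct = hct + 1: IF hct > ctmax THEN ctmax = hct
      fct = 0
    ELSE
      hct = 0
      fct = fct + 1: IF fct > fctmax THEN fctmax = fct
    END IF
  NEXT
  ctmTot = ctmTot + ctmax
  fctmTot = fctmTot + fctmax
  tct = tct + 1
NEXT
' averages over all trials, then the "predicted" longest win streak
PRINT p; ssize; ctmTot / tct, fctmTot / tct, LOG(ssize) / LOG(1 / p)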
Each row of the output represents 500 trials:
p      ssize  avg max win  avg max lose  "predicted" max
              streak       streak        win streak
.9056  2355   59.576       2.978         78.30261
.9056  2355   58.784       3.028         78.30261
.9056  2355   59.258       2.944         78.30261
.9056  2355   58.632       2.996         78.30261
.9056  2355   59.426       3.004         78.30261
.9056  2355   60.082       2.982         78.30261
.9056  2355   59.214       2.96          78.30261
.9056  2355   59.954       2.992         78.30261
The "predicted" max win streak column is LOG(ssize) / LOG(1 / p), based on a formula found on the web by searching for "expected longest run". As computed here, it clearly overestimates the average maximum winning streak.
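For comparison, here is a small Python sketch of the web formula as the program computes it, next to a commonly cited large-n approximation for the expected longest run. The second form is an assumption on my part, not something taken from the page the formula came from: it takes the logarithm of ssize * (1 - p) rather than of ssize alone, and adds a correction involving Euler's constant. The function names are made up for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def web_formula(p, n):
    """Longest-run 'prediction' exactly as the BASIC program prints it."""
    return math.log(n) / math.log(1 / p)


def refined_estimate(p, n):
    """Assumed large-n approximation for the expected longest win streak:
    log base 1/p of n*(1-p), plus gamma/ln(1/p), minus 1/2."""
    q = 1 - p
    lam = math.log(1 / p)
    return math.log(n * q) / lam + EULER_GAMMA / lam - 0.5


if __name__ == "__main__":
    for n in (2355, 115000, 139000):
        print(n, web_formula(0.9056, n), refined_estimate(0.9056, n))
```

If the assumed form is right, its values should come out much closer to the simulated averages than the web formula's do, which would suggest the overestimate comes mostly from the missing (1 - p) factor. It is still only an approximation for large numbers of games.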
So at this point, the best guess is that the probability of a win in a given game is about .9056, and that 2355 games had been played.
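For anyone without a BASIC interpreter, the simulation above can be sketched in Python roughly as follows. This is a port under my own naming, not the program that produced the tables; the `target` and `seed` parameters are additions for convenience, with `target` reporting how often the longest winning streak reaches a chosen length.

```python
import random


def simulate(p, ssize, trials=500, target=100, seed=None):
    """Return (average longest win streak, average longest losing streak,
    fraction of trials whose longest win streak is >= target)."""
    rng = random.Random(seed)
    win_total = lose_total = hits = 0
    for _ in range(trials):
        win_run = lose_run = win_max = lose_max = 0
        for _ in range(ssize):
            if rng.random() < p:          # a win extends the win streak
                win_run += 1
                lose_run = 0
                if win_run > win_max:
                    win_max = win_run
            else:                         # a loss extends the losing streak
                lose_run += 1
                win_run = 0
                if lose_run > lose_max:
                    lose_max = lose_run
        win_total += win_max
        lose_total += lose_max
        if win_max >= target:
            hits += 1
    return win_total / trials, lose_total / trials, hits / trials


if __name__ == "__main__":
    print(simulate(0.9056, 2355))
```

Since Python uses a different random number generator, individual runs will not reproduce the rows above digit for digit; only the general level of the averages should agree.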
Leaving p the same and increasing ssize, we can find how many games need to be played to expect a winning streak of 100 games:
ssize = 115000
results in an expected largest winning streak of about 100. But I've added a new column: what fraction of the time the largest streak is at least 100:
p      games   exp max win  exp max lose  fraction      web prediction
               streak       streak        at least 100
.9056  115000  99.578       4.622         .42           117.5168696365425
.9056  115000  99.994       4.616         .43           117.5168696365425
.9056  115000  99.728       4.614         .432          117.5168696365425
.9056  115000  99.694       4.614         .414          117.5168696365425
.9056  115000  100.212      4.626         .448          117.5168696365425
Note the web formula still overpredicts. More importantly, note that in only about 42% or 43% of cases does a streak of 100 appear. That's not "an even chance". This is an example of the mean not being the median.
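The fraction column can also be cross-checked without random numbers. The sketch below computes the probability that n independent games contain a run of at least k wins, using a standard recurrence on the probability of having no such run yet. It assumes, as the simulation does, that every game is won independently with the same probability p; the function name is mine.

```python
def prob_streak_at_least(p, n, k):
    """Probability that n independent games (win probability p each)
    contain a run of at least k consecutive wins. Requires k >= 1."""
    if n < k:
        return 0.0
    q = 1.0 - p
    pk = p ** k
    # no_run[m] = probability that the first m games contain no run of k wins
    no_run = [1.0] * (n + 1)
    no_run[k] = 1.0 - pk
    for m in range(k + 1, n + 1):
        # The first run of k wins ends exactly at game m when game m-k is a
        # loss, games m-k+1..m are all wins, and games 1..m-k-1 had no run.
        no_run[m] = no_run[m - 1] - q * pk * no_run[m - k - 1]
    return 1.0 - no_run[n]


if __name__ == "__main__":
    for n in (115000, 139000, 140000):
        print(n, prob_streak_at_least(0.9056, n, 100))
```

Each simulated fraction is an average over 500 trials, so it carries sampling noise of roughly 0.02, and differences of that size from the computed values are to be expected. Scanning n for the point where the result first reaches 0.5 gives the even-chance number of games directly.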
If we increase the number of games in each trial to 139,000 or 140,000, we get about an equal chance of achieving or not achieving a string of 100 wins, as given by the following reports:
.9056 139000 101.952 4.692 .498 119.4283908833486
.9056 139000 101.894 4.688 .498 119.4283908833486
.9056 139000 101.938 4.688 .488 119.4283908833486
.9056 139000 102.226 4.704 .502 119.4283908833486
.9056 139000 101.768 4.688 .492 119.4283908833486
.9056 140000 102.274 4.704 .504 119.500684802691
.9056 140000 101.962 4.688 .502 119.500684802691
.9056 140000 102.05 4.712 .494 119.500684802691
.9056 140000 102.108 4.694 .514 119.500684802691
.9056 140000 102.022 4.694 .506 119.500684802691
Again, each row represents the average of 500 trials of the given number of games. We see that the mean maximum winning streak is then about 102, a little above the 100-game target.
These 139,000 or 140,000 games start from now, since the first 2355 or so already played are known not to contain a 100-game winning streak.

Posted by Charlie
on 2008-10-26 13:40:46