The game BINGO! (US and Canada version)is played with a card on which is drawn a 5x5 grid filled with 25 random, non-repeating integers from the range of 1 to 75 inclusive (1 number per grid location). Further constraints are that the left-most column can only contain values from 1 to 15 inclusive, the next left-most column only 16 to 30 inclusive, etc., ending with the 5th (right-most) column only containing 5 random values from the range 61 to 75 inclusive. To play, numbers are "called", one at a time and randomly, with the winner being the first to have all values in any single row, column or diagonal on his/her card "called" (thus, a BINGO!).
Without referring to the many solutions available on the web (BINGO! is very popular):
1) What is the expected value of the number of numbers that must be "called" to reach a BINGO! on a single card?
2) What if you are allowed to play 5 cards (presumably all different)?
Also, usually the center square of the grid is "free", i.e. it is assumed to be called already at the beginning of the game.
3) What are the resulting values for Questions 1) and 2) in this case?
Extra hard bonus:
There are many many variations of the game that allow changes to the pattern of numbers/grid spaces that must be "called" to reach a BINGO! A "+" and an "X" are two of these. Both require 8 numbers to be called, assuming there is a free space in the middle. What are the expected values of the number of numbers "called" for each of these?
The author admits that computer solutions are very viable to solve this problem, but BINGO! existed long before computers. Any analytical attempts/solutions get bonus points!
A pure mathematical analysis is daunting to say the least (more on this later), so I did a simulation:
The first version is without a free space in the middle of the board.
As any set of numbers is as good as any other, it uses the lowest five numbers in each column and each set is numbered 1 through 5, so for example B16 is column 2, row 1 and is identified by (2,1) in the randomization of chosen balls.
DEFDBL A-Z
DIM bd(5, 5)
RANDOMIZE TIMER
FOR trial = 1 TO 10000000
good = 0
REDIM used(5, 15): ct = 0
DO
c = INT(RND(1) * 5 + 1)
r = INT(RND(1) * 15 + 1)
IF used(c, r) = 0 THEN
used(c, r) = used(c, r) + 1: ct = ct + 1
IF r <= 5 THEN
good = 0
brk = 0
FOR rw = 1 TO 5
IF used(c, rw) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN
good = 1
ELSE
brk = 0
FOR cl = 1 TO 5
IF used(cl, r) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN
good = 1
ELSE
brk = 0
IF r = c THEN
FOR rw = 1 TO 5
IF used(rw, rw) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN
good = 1
ELSEIF r = 6 - c THEN
brk = 0
FOR rw = 1 TO 5
IF used(6 - rw, rw) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN good = 1
END IF
END IF
END IF
END IF
IF good THEN
totct = totct + ct: tottr = tottr + 1
IF trial MOD 10000 = 0 THEN
IF INKEY$ > "" THEN RANDOMIZE TIMER: ELSE PRINT: PRINT: PRINT
PRINT ct; tottr,
PRINT totct / tottr, trial
END IF
EXIT DO
END IF
END IF
END IF
LOOP UNTIL good
NEXT trial
When run, the average number of balls called before Bingo! hovers around 43.95 and 43.96.
To take into consideration the free spot in the center, (3,3) is prepopulated. But to account for the fact that there are still 15 N balls, not just 14, that could be called, when the ball chosen is (3,3), its previous non-use is marked by the Used counter being 1, rather than the zero of the other positions. (Balls can be anywhere from (c,1) through (c,15) but only (c,1) through (c,5) are on the board.)
Only the beginning of the program changes, in fact, just two lines as marked by the comments below:
DEFDBL A-Z
DIM bd(5, 5)
RANDOMIZE TIMER
FOR trial = 1 TO 10000000
good = 0
REDIM used(5, 15): ct = 0
used(3, 3) = 1 ' this is mod for free space
DO
c = INT(RND(1) * 5 + 1)
r = INT(RND(1) * 15 + 1)
IF used(c, r) = 0 OR c = 3 AND r = 3 AND used(c, r) = 1 THEN ' mod for free space
used(c, r) = used(c, r) + 1: ct = ct + 1
When this is run, the average number hovers around 42.23 or 42.24, but much closer to 42.23.
Now that I have done these simulations, what do those "many solutions available on the web" have to say?
The "Wizard of Odds" says 41.36857386 for the version with the free center space. "Durango Bill"'s website agrees with the Wizard. Why the discrepancy?
One possibility is that I erred in programming. Another is that the Basic pseudo-random number generator is not random enough. Another is that there is a flaw in the Wizard's and Durango's math.
I don't see a flaw in my logic. I've tried to minimize any flaws in the RNG by throwing in manually tossed in randomizations to the RNG, and these don't seem to affect the results.
I looked over Durango Bill's explanation of the math logic he used and I see a flaw there.
His logic, in summary, is:
The probability of winning exactly at a given number of called numbers is equal to the difference between having filled in a Bingo! at some point up to and including that call (call n), and that of having completed it at some point up to and including the preceding call (n-1).
There are 2^24 = 16,777,215 possible "called positions" on a given board, from nothing filled in except the free space to all 24 called as well as the free space. Some have fewer filled-in spaces than others. Some have one or more Bingo!s (5 in a row) and some have none. Durango counts how many do for any given number of marked (covered) spaces there are on the given board. This way he calculates the probability that if there are n numbers that are actually part of the player's board, that some of them form a Bingo!.
Then, the calculation takes into consideration the probability that for any given number of called numbers, n, a given number of those are found on the player's card somewhere. As an example, if ten numbers have been called, he multiplies the probability that four of them are on the given board by the probability that having four on the board will have (in this case, actually be) a Bingo!; to this he adds the probability that exactly five of the numbers will be on the given board multiplied by the probability that five random positions will give a Bingo!. This continues for all the numbers up to ten, in this case, that might be on the player's board. All these are added up as they are mutually exclusive.
As said above, these constitute the cumulative probability up to n called numbers and their differences constitute the individual probabilities, so that when multiplied by their respective numbers and then added, the expected number of calls is reached.
The problem is that this treats every called number as equally likely to be part of a win for the player, as if it were equally likely to be anywhere on the board that was not previously filled in. But numbers N31 through N45 are each not as likely to be needed by the player as the other numbers, as the player already has a free space in column N, and a very important free space at that, as it helps in four possible Bingo!s.
So, while the free space helps the player as evidenced by my 42.23 for the free-space version vs 43.95 for the non-free-space version, it does not help as much as Durango's or Wizard's calculation would have, due to the increased likelihood of "wasted" balls referencing the N column, where only 4 are helpful out of 15, rather than 5 out of 15. No question about it: the free space helps the player--just not as much as it might. Numbers in the N column are thus more valuable than those in other columns as they are easier to make a win when present, but are slightly harder to come by.
This effect of categorizing the balls by column, rather than having any number appear anywhere on a player's card, also affects, in a more obvious manner, the independence of different cards. This is a more obvious case of Durango's miscalculations: he treats the different cards as independent: for K boards, where Cp[N] is the cumulative probability of a Bingo! on one board after N numbers have been called, he says the probability of at least one of K boards showing a win as 1 - ((1 - Cp[N])^K). This is what it would be if the wins on the different boards were independent; but they are not, even if the numbers themselves on the board are. Why is this so?
As the numbers are called randomly, on some occasions (separate games) the numbers will be spread more evenly across the columns B, I, N, G and O. In other games one or more columns will happen to have more early calls than other columns, and this will be in common for all the players. How will this affect the dependence (as opposed to independence) of the lengths of the games? The more evenly spread the column calls are, the longer it will take for a win on a given card; the more concentrated in given column(s), the fewer balls will be needed on average. And the concentration or even-spread is in common for all the players of the given game.
To take extreme examples, if the caller calls out, in repeated turn, a B number, an I number, etc.: B-I-N-G-O-B-I-N-G-O-B-I-N-G-O-..., simulation shows that it will take about 43.52 called numbers before a given card has a winner, even with the free center square. If however all the balls come from column B, the average is only 13.34. The same is true for columns I, G and O. If all the balls are from column N, it's 12.81, since you need only 4 out of the 15 to get a vertical line. When these different probabilities affect all the participants in the given game, that affects the independence of the different cards.
Simulating multiple cards takes some more complex logic, that slows down the computations, but here is the 5-card version. The array of five boards uses only boards 2 through 5, as the original logic is used for the first board:
DEFDBL A-Z
DIM bd(5, 5, 5)
RANDOMIZE TIMER
FOR trial = 1 TO 10000000
FOR bdNo = 2 TO 5 ' extra boards
FOR col = 1 TO 5
bdcontains$(bdNo, col) = ","
REDIM usedb(15)
FOR row = 1 TO 5
r = INT(15 * RND(1)) + 1
WHILE usedb(r) = 1
r = INT(15 * RND(1)) + 1
WEND
usedb(r) = 1
bd(bdNo, col, row) = r
bdcontains$(bdNo, col) = bdcontains$(bdNo, col) + LTRIM$(STR$(r)) + ","
NEXT row
NEXT col
NEXT bdNo
good = 0
REDIM used(5, 15): ct = 0
REDIM bdchk(5, 5, 5)
FOR i = 2 TO 5: bdchk(i, 3, 3) = 1: NEXT
used(3, 3) = 1 ' this is mod for free space
DO
c = INT(RND(1) * 5 + 1)
r = INT(RND(1) * 15 + 1)
IF used(c, r) = 0 OR c = 3 AND r = 3 AND used(c, r) = 1 THEN ' mod for free space
used(c, r) = used(c, r) + 1: ct = ct + 1
IF r <= 5 THEN
good = 0
brk = 0
FOR rw = 1 TO 5
IF used(c, rw) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN
good = 1
ELSE
brk = 0
FOR cl = 1 TO 5
IF used(cl, r) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN
good = 1
ELSE
brk = 0
IF r = c THEN
FOR rw = 1 TO 5
IF used(rw, rw) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN
good = 1
ELSEIF r = 6 - c THEN
brk = 0
FOR rw = 1 TO 5
IF used(6 - rw, rw) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN good = 1
END IF
END IF
END IF
END IF
END IF ' 5 or lower, for main board
FOR bdNo = 2 TO 5
IF INSTR(bdcontains$(bdNo, c), "," + LTRIM$(STR$(r)) + ",") THEN
FOR i = 1 TO 5
IF bd(bdNo, c, i) = r THEN
bdchk(bdNo, c, i) = 1
brk = 0
FOR rw = 1 TO 5
IF bdchk(bdNo, c, rw) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN
good = 1
ELSE
brk = 0
FOR cl = 1 TO 5
IF bdchk(bdNo, cl, i) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN
good = 1
ELSE
brk = 0
IF i = c THEN
FOR rw = 1 TO 5
IF bdchk(bdNo, rw, rw) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN
good = 1
ELSEIF i = 6 - c THEN
brk = 0
FOR rw = 1 TO 5
IF bdchk(bdNo, 6 - rw, rw) = 0 THEN brk = 1: EXIT FOR
NEXT
IF brk = 0 THEN good = 1
END IF
END IF
END IF
END IF
END IF
NEXT
END IF
NEXT bdNo
IF good THEN
totct = totct + ct: tottr = tottr + 1
IF trial MOD 10000 = 0 THEN
IF INKEY$ > "" THEN RANDOMIZE TIMER: ELSE PRINT: PRINT: PRINT
PRINT ct; tottr,
PRINT totct / tottr, trial
END IF
EXIT DO
END IF
END IF ' not previously used
LOOP UNTIL good
NEXT trial
The average number of calls to produce the first Bingo! when five cards are used hovers around 30.39. When the free space in the middle is eliminated, this changes to around 32.64.
Summary:
1 card 5 cards
No free space: 43.95 32.62
Free space: 42.23 30.39
|
Posted by Charlie
on 2013-03-07 22:42:19 |