All about flooble | fun stuff | Get a free chatterbox | Free JavaScript | Avatars    
perplexus dot info

Home > Probability
BINGO! (Posted on 2013-03-07) Difficulty: 3 of 5
The game BINGO! (US and Canada version)is played with a card on which is drawn a 5x5 grid filled with 25 random, non-repeating integers from the range of 1 to 75 inclusive (1 number per grid location). Further constraints are that the left-most column can only contain values from 1 to 15 inclusive, the next left-most column only 16 to 30 inclusive, etc., ending with the 5th (right-most) column only containing 5 random values from the range 61 to 75 inclusive. To play, numbers are "called", one at a time and randomly, with the winner being the first to have all values in any single row, column or diagonal on his/her card "called" (thus, a BINGO!).

Without referring to the many solutions available on the web (BINGO! is very popular):

1) What is the expected value of the number of numbers that must be "called" to reach a BINGO! on a single card?

2) What if you are allowed to play 5 cards (presumably all different)?

Also, usually the center square of the grid is "free", i.e. it is assumed to be called already at the beginning of the game.

3) What are the resulting values for Questions 1) and 2) in this case?

Extra hard bonus:

There are many many variations of the game that allow changes to the pattern of numbers/grid spaces that must be "called" to reach a BINGO! A "+" and an "X" are two of these. Both require 8 numbers to be called, assuming there is a free space in the middle. What are the expected values of the number of numbers "called" for each of these?

The author admits that computer solutions are very viable to solve this problem, but BINGO! existed long before computers. Any analytical attempts/solutions get bonus points!

No Solution Yet Submitted by Kenny M    
Rating: 4.0000 (1 votes)

Comments: ( Back to comment list | You must be logged in to post comments.)
Some Thoughts Simulation results | Comment 1 of 4

A pure mathematical analysis is daunting to say the least (more on this later), so I did a simulation:

The first version is without a free space in the middle of the board.

As any set of numbers is as good as any other, it uses the lowest five numbers in each column and each set is numbered 1 through 5, so for example B16 is column 2, row 1 and is identified by (2,1) in the randomization of chosen balls.

DEFDBL A-Z
DIM bd(5, 5)
RANDOMIZE TIMER
FOR trial = 1 TO 10000000
    good = 0
    REDIM used(5, 15): ct = 0
    DO
        c = INT(RND(1) * 5 + 1)
        r = INT(RND(1) * 15 + 1)
        IF used(c, r) = 0 THEN
            used(c, r) = used(c, r) + 1: ct = ct + 1
            IF r <= 5 THEN
                good = 0
                brk = 0
                FOR rw = 1 TO 5
                    IF used(c, rw) = 0 THEN brk = 1: EXIT FOR
                NEXT
                IF brk = 0 THEN
                    good = 1
                ELSE
                    brk = 0
                    FOR cl = 1 TO 5
                        IF used(cl, r) = 0 THEN brk = 1: EXIT FOR
                    NEXT
                    IF brk = 0 THEN
                        good = 1
                    ELSE
                        brk = 0
                        IF r = c THEN
                            FOR rw = 1 TO 5
                                IF used(rw, rw) = 0 THEN brk = 1: EXIT FOR
                            NEXT
                            IF brk = 0 THEN
                                good = 1
                            ELSEIF r = 6 - c THEN
                                brk = 0
                                FOR rw = 1 TO 5
                                    IF used(6 - rw, rw) = 0 THEN brk = 1: EXIT FOR
                                NEXT
                                IF brk = 0 THEN good = 1
                            END IF
                        END IF
                    END IF
                END IF
                IF good THEN
                    totct = totct + ct: tottr = tottr + 1
                    IF trial MOD 10000 = 0 THEN
                        IF INKEY$ > "" THEN RANDOMIZE TIMER: ELSE PRINT: PRINT: PRINT
                        PRINT ct; tottr,
                        PRINT totct / tottr, trial
                    END IF
                    EXIT DO
                END IF
            END IF
        END IF
    LOOP UNTIL good
NEXT trial

When run, the average number of balls called before Bingo! hovers around 43.95 and 43.96.

To take into consideration the free spot in the center, (3,3) is prepopulated. But to account for the fact that there are still 15 N balls, not just 14, that could be called, when the ball chosen is (3,3), its previous non-use is marked by the Used counter being 1, rather than the zero of the other positions. (Balls can be anywhere from (c,1) through (c,15) but only (c,1) through (c,5) are on the board.)

Only the beginning of the program changes, in fact, just two lines as marked by the comments below:

DEFDBL A-Z
DIM bd(5, 5)
RANDOMIZE TIMER
FOR trial = 1 TO 10000000
    good = 0
    REDIM used(5, 15): ct = 0
    used(3, 3) = 1 ' this is mod for free space
    DO
        c = INT(RND(1) * 5 + 1)
        r = INT(RND(1) * 15 + 1)
        IF used(c, r) = 0 OR c = 3 AND r = 3 AND used(c, r) = 1 THEN ' mod for free space
            used(c, r) = used(c, r) + 1: ct = ct + 1

When this is run, the average number hovers around 42.23 or 42.24, but much closer to 42.23.

Now that I have done these simulations, what do those "many solutions available on the web" have to say?

The "Wizard of Odds" says 41.36857386 for the version with the free center space. "Durango Bill"'s website agrees with the Wizard. Why the discrepancy?

One possibility is that I erred in programming. Another is that the Basic pseudo-random number generator is not random enough. Another is that there is a flaw in the Wizard's and Durango's math.

I don't see a flaw in my logic. I've tried to minimize any flaws in the RNG by throwing in manually tossed in randomizations to the RNG, and these don't seem to affect the results.

I looked over Durango Bill's explanation of the math logic he used and I see a flaw there.

His logic, in summary, is:

The probability of winning exactly at a given number of called numbers is equal to the difference between having filled in a Bingo! at some point up to and including that call (call n), and that of having completed it at some point up to and including the preceding call (n-1).

There are 2^24 = 16,777,215 possible "called positions" on a given board, from nothing filled in except the free space to all 24 called as well as the free space. Some have fewer filled-in spaces than others. Some have one or more Bingo!s (5 in a row) and some have none. Durango counts how many do for any given number of marked (covered) spaces there are on the given board. This way he calculates the probability that if there are n numbers that are actually part of the player's board, that some of them form a Bingo!.

Then, the calculation takes into consideration the probability that for any given number of called numbers, n, a given number of those are found on the player's card somewhere. As an example, if ten numbers have been called, he multiplies the probability that four of them are on the given board by the probability that having four on the board will have (in this case, actually be) a Bingo!; to this he adds the probability that exactly five of the numbers will be on the given board multiplied by the probability that five random positions will give a Bingo!. This continues for all the numbers up to ten, in this case, that might be on the player's board. All these are added up as they are mutually exclusive.

As said above, these constitute the cumulative probability up to n called numbers and their differences constitute the individual probabilities, so that when multiplied by their respective numbers and then added, the expected number of calls is reached.

The problem is that this treats every called number as equally likely to be part of a win for the player, as if it were equally likely to be anywhere on the board that was not previously filled in. But numbers N31 through N45 are each not as likely to be needed by the player as the other numbers, as the player already has a free space in column N, and a very important free space at that, as it helps in four possible Bingo!s.

So, while the free space helps the player as evidenced by my 42.23 for the free-space version vs 43.95 for the non-free-space version, it does not help as much as Durango's or Wizard's calculation would have, due to the increased likelihood of "wasted" balls referencing the N column, where only 4 are helpful out of 15, rather than 5 out of 15. No question about it: the free space helps the player--just not as much as it might. Numbers in the N column are thus more valuable than those in other columns as they are easier to make a win when present, but are slightly harder to come by.

This effect of categorizing the balls by column, rather than having any number appear anywhere on a player's card, also affects, in a more obvious manner, the independence of different cards. This is a more obvious case of Durango's miscalculations: he treats the different cards as independent: for K boards, where Cp[N] is the cumulative probability of a Bingo! on one board after N numbers have been called, he says the probability of at least one of K boards showing a win as 1 - ((1 - Cp[N])^K). This is what it would be if the wins on the different boards were independent; but they are not, even if the numbers themselves on the board are. Why is this so?

As the numbers are called randomly, on some occasions (separate games) the numbers will be spread more evenly across the columns B, I, N, G and O. In other games one or more columns will happen to have more early calls than other columns, and this will be in common for all the players. How will this affect the dependence (as opposed to independence) of the lengths of the games? The more evenly spread the column calls are, the longer it will take for a win on a given card; the more concentrated in given column(s), the fewer balls will be needed on average. And the concentration or even-spread is in common for all the players of the given game.

To take extreme examples, if the caller calls out, in repeated turn, a B number, an I number, etc.: B-I-N-G-O-B-I-N-G-O-B-I-N-G-O-..., simulation shows that it will take about 43.52 called numbers before a given card has a winner, even with the free center square. If however all the balls come from column B, the average is only 13.34. The same is true for columns I, G and O. If all the balls are from column N, it's 12.81, since you need only 4 out of the 15 to get a vertical line. When these different probabilities affect all the participants in the given game, that affects the independence of the different cards.

Simulating multiple cards takes some more complex logic, that slows down the computations, but here is the 5-card version. The array of five boards uses only boards 2 through 5, as the original logic is used for the first board:

DEFDBL A-Z
DIM bd(5, 5, 5)
RANDOMIZE TIMER
FOR trial = 1 TO 10000000
    FOR bdNo = 2 TO 5 ' extra boards
        FOR col = 1 TO 5
            bdcontains$(bdNo, col) = ","
            REDIM usedb(15)
            FOR row = 1 TO 5
                r = INT(15 * RND(1)) + 1
                WHILE usedb(r) = 1
                    r = INT(15 * RND(1)) + 1
                WEND
                usedb(r) = 1
                bd(bdNo, col, row) = r
                bdcontains$(bdNo, col) = bdcontains$(bdNo, col) + LTRIM$(STR$(r)) + ","
            NEXT row
        NEXT col
    NEXT bdNo
    good = 0
    REDIM used(5, 15): ct = 0
    REDIM bdchk(5, 5, 5)
    FOR i = 2 TO 5: bdchk(i, 3, 3) = 1: NEXT
    used(3, 3) = 1 ' this is mod for free space
    DO
        c = INT(RND(1) * 5 + 1)
        r = INT(RND(1) * 15 + 1)
        IF used(c, r) = 0 OR c = 3 AND r = 3 AND used(c, r) = 1 THEN ' mod for free space
            used(c, r) = used(c, r) + 1: ct = ct + 1
            IF r <= 5 THEN
                good = 0
                brk = 0
                FOR rw = 1 TO 5
                    IF used(c, rw) = 0 THEN brk = 1: EXIT FOR
                NEXT
                IF brk = 0 THEN
                    good = 1
                ELSE
                    brk = 0
                    FOR cl = 1 TO 5
                        IF used(cl, r) = 0 THEN brk = 1: EXIT FOR
                    NEXT
                    IF brk = 0 THEN
                        good = 1
                    ELSE
                        brk = 0
                        IF r = c THEN
                            FOR rw = 1 TO 5
                                IF used(rw, rw) = 0 THEN brk = 1: EXIT FOR
                            NEXT
                            IF brk = 0 THEN
                                good = 1
                            ELSEIF r = 6 - c THEN
                                brk = 0
                                FOR rw = 1 TO 5
                                    IF used(6 - rw, rw) = 0 THEN brk = 1: EXIT FOR
                                NEXT
                                IF brk = 0 THEN good = 1
                            END IF
                        END IF
                    END IF
                END IF
            END IF ' 5 or lower, for main board
            FOR bdNo = 2 TO 5
                IF INSTR(bdcontains$(bdNo, c), "," + LTRIM$(STR$(r)) + ",") THEN
                    FOR i = 1 TO 5
                        IF bd(bdNo, c, i) = r THEN
                            bdchk(bdNo, c, i) = 1
                            brk = 0
                            FOR rw = 1 TO 5
                                IF bdchk(bdNo, c, rw) = 0 THEN brk = 1: EXIT FOR
                            NEXT
                            IF brk = 0 THEN
                                good = 1
                            ELSE
                                brk = 0
                                FOR cl = 1 TO 5
                                    IF bdchk(bdNo, cl, i) = 0 THEN brk = 1: EXIT FOR
                                NEXT
                                IF brk = 0 THEN
                                    good = 1
                                ELSE
                                    brk = 0
                                    IF i = c THEN
                                        FOR rw = 1 TO 5
                                            IF bdchk(bdNo, rw, rw) = 0 THEN brk = 1: EXIT FOR
                                        NEXT
                                        IF brk = 0 THEN
                                            good = 1
                                        ELSEIF i = 6 - c THEN
                                            brk = 0
                                            FOR rw = 1 TO 5
                                                IF bdchk(bdNo, 6 - rw, rw) = 0 THEN brk = 1: EXIT FOR
                                            NEXT
                                            IF brk = 0 THEN good = 1
                                        END IF
                                    END IF
                                END IF
                            END IF
                        END IF

                    NEXT
                END IF
            NEXT bdNo

            IF good THEN
                totct = totct + ct: tottr = tottr + 1
                IF trial MOD 10000 = 0 THEN
                    IF INKEY$ > "" THEN RANDOMIZE TIMER: ELSE PRINT: PRINT: PRINT
                    PRINT ct; tottr,
                    PRINT totct / tottr, trial
                END IF
                EXIT DO
            END IF

        END IF ' not previously used
    LOOP UNTIL good
NEXT trial


The average number of calls to produce the first Bingo! when five cards are used hovers around 30.39. When the free space in the middle is eliminated, this changes to around 32.64.

Summary:
                  1 card    5 cards
No free space:     43.95     32.62
Free space:        42.23     30.39

  Posted by Charlie on 2013-03-07 22:42:19
Please log in:
Login:
Password:
Remember me:
Sign up! | Forgot password


Search:
Search body:
Forums (0)
Newest Problems
Random Problem
FAQ | About This Site
Site Statistics
New Comments (0)
Unsolved Problems
Top Rated Problems
This month's top
Most Commented On

Chatterbox:
Copyright © 2002 - 2024 by Animus Pactum Consulting. All rights reserved. Privacy Information