All about flooble | fun stuff | Get a free chatterbox | Free JavaScript | Avatars
 perplexus dot info

 More randomness! (Posted on 2003-06-30)
Suppose you have a function (or a magic ball) that is capable of producing a totally random integer between 1 and 5 (inclusive).

Using this, how would you generate a random number (also an integer) between 1 and 7 (inclusive)? (Note that the for the number to be random, all integers between 1 and 7 must have an equal chance of being generated)

Assume that using your 1-5 generator is pretty time-consuming, so you want to minimize the number of times you are going to use it.

 See The Solution Submitted by levik Rating: 4.3333 (6 votes)

Comments: ( Back to comment list | You must be logged in to post comments.)
 solution | Comment 6 of 16 |
The best way to consider sequences of 5 possibilities or 7 possibilities is to consider them as base-5 or base-7 numbers. In order to do so, we'll reduce the 5-possibility random digit by 1, to get 0 - 4, and after converting to base-7, increment each digit by 1 to get the desired 1 - 7.

No power of 5 is an integral multiple of a power of 7 because any number is factorable uniquely into primes. So there is always going to be some loss of information. For example, if we take two random numbers from 1-5 (or a two-digit base-5 number), to get 25 possibilities, that is not even enough for a two-digit base-7 number, so if we express the result as a base-5 number, 0-24, and convert to base-7, we'll get, in that base, a number from 0-33 (base 7). If we were to just take the last digit, 0 through 3 would be overrepresented, any such digit appearing 4 times for every 3 times that a digit such as 4 through 6 appears. So we'd have to throw away numbers greater than 26 (base 7), which is 20 in decimal, or 40 in base-5, then take the last base-7 digit.

As the rand5 function, as I'll call it, is expensive to call, we want to minimize losses where we have to throw out a fraction of such results.

Ideally, if there were no losses due to the incommensurability of base-5 and base-7 numbers, we'd get log(5)/log(7) base-7 digits for every base-5 digit. That's about .8271, or about 83% as many random digits from 0-6 (or 1-7) as from 0-4 (or 1-5). That's because a base-7 digit has more information than a base-5 number.

So there are two items we need to take into consideration in deciding how many base-5 digits we should group together: the number of base-7 digits we get for every group of n base-5 digits, and how many whole such groups we must throw away to avoid a bias in favor of some digits. We saw above that if we use just two base-5 digits, we only get one base-7 digit, but we also must throw away 4/25 or 16% of our entire throughput, so we actually get only .5*.84 = 42% as many base-7 digits as we require base-5 digits.

The following table shows the results for various choices in the grouping of base-5 digits:
```
2  1              25               7              21 0.84000 0.42000

3  2             125              49              98 0.78400 0.52267

4  3             625             343             343 0.54880 0.41160

5  4            3125            2401            2401 0.76832 0.61466

6  4           15625            2401           14406 0.92198 0.61466

7  5           78125           16807           67228 0.86052 0.61466

8  6          390625          117649          352947 0.90354 0.67766

9  7         1953125          823543         1647086 0.84331 0.65591

10  8         9765625         5764801         5764801 0.59032 0.47225

11  9        48828125        40353607        40353607 0.82644 0.67618

12  9       244140625        40353607       242121642 0.99173 0.74380

13 10      1220703125       282475249      1129900996 0.92561 0.71201

14 11      6103515625      1977326743      5931980229 0.97190 0.76363

15 12     30517578125     13841287201     27682574402 0.90710 0.72568

16 13    152587890625     96889010407     96889010407 0.63497 0.51591

17 14    762939453125    678223072849    678223072849 0.88896 0.73209

18 14   3814697265625    678223072849   3391115364245 0.88896 0.69141

19 15  19073486328125   4747561509943  18990246039772 0.99564 0.78603

20 16  95367431640625  33232930569601  66465861139202 0.69695 0.55756

21 17 476837158203125 232630513987207 465261027974414 0.97572 0.78987

```

In the various columns:

The number of base-5 digits that go in, the resulting number of base-7 digits that result, five raised to the given power, and seven raised to the power of the number of base-7 digits, followed by the largest integral multiple of that power of 7 that is not greater than the power of 5 shown. The fraction of that over the number that is the power of 5 is the portion of throughput that need not be thrown away--we just throw away the first base-7 digit--is shown next. Finally is shown the ratio of base-7 to base-5 digits multiplied by the fraction kept.

In the two-base-5-digit example, we saw we can get only one base-7 digit, as there are only 25 possible two-base-5-digit results. Then 7^1 is of course 7, but three times this (or 21) will fit into 25, so it doesn't matter we throw away the first base-7 digit (0-2). But as a result we keep only 21/25 of our throughput, or 84%, and 84% of 50% is the 42% we figured earlier.

If we choose to accumulate 21 base-5 digits, we'll get 17 base-7 digits, and as you can see from the last column, get almost 79% as many base-7 digits as base-5 go in, not too far from the theoretic 82.7% of the ideal case. Going to 23 base-5 digits would give us 78.991% instead of the 78.987 we get at 21, but accumulating such large numbers would exceed the precision available in the language I used.

The subroutine is:
```
DEFDBL A-Z

FUNCTION rand7

STATIC seed

IF seed = 0 THEN

DO

seed = 0

FOR i = 1 TO 21

seed = seed * 5 + rand5 - 1

NEXT

LOOP WHILE seed > 465261027974413#

DO WHILE seed > 232630513987206#

seed = seed - 232630513987207#

LOOP

END IF

q = INT(seed / 7)

r = seed - q * 7 + 1

seed = q

rand7 = r

END FUNCTION

```

Note the STATIC seed.

It keeps this value--the built base-5 number (actually stored in the internal binary format of the computer, but conceptually just a number)--from call to call to this function, until it has been depleted. The first DO...LOOP WHILE throws away seeds greater than 2*7^21-1 (the subtraction of one as we're zero-based). The second DO WHILE...LOOP in effect gets rid of the first base-7 digit.
 Posted by Charlie on 2003-06-30 08:52:22

 Search: Search body:
Forums (0)