In doing computer simulations, such as
the one I wrote for simulating the results of part 2 of Rumor Mill, one often uses a random number generator that's built into the computer language. These generators are based on a seed (some arbitrary number) and are mixed up at each step that calls for a random number. There are only a finite number of seeds, and each one is based mechanically on the previous, so they necessarily repeat after a while.
The seed is kept internally, away from the programmer's view, so the programmer can't ask for, say, the next random number after .753372.
If one suspects that the repetition cycle is actually occuring within the length of the run that he needs, what algorithm can you put into the program to find the period with which your results are repeating (and are therefore no longer random, or rather no longer independent trials)? Assume you do not have room to store all the numbers as they arrive, nor can you afford the time it would take to compare each new number to all the preceding numbers.
Then also, how do you determine where the repetition cycle begins (after what iteration of the loop of trials).
Let's assume that the length of your run is n trials.
Write
a separate program that initializes the generator (seeds it) the same
way your actual simulation program will initialize it, and call the
generator n times, saving only the final result. Then code a for-next
loop, iterating from 1 to n, that asks for the next random number from
the generator looking for the saved number to re-appear. If it
does, the period is simply your for-next loop counter. If it
doesn't, you can safely conclude that the period of repetition is
greater than n.
Assuming that we now know the period, p, of
the generator, I think the only way to answer the 2nd question is to
create TWO instances of the generator -- call them G1 and
G2. Initialize both with the same seed. Ask for p
random numbers from G1, and discard them. Now ask for random
numbers from both G1 and G2 and compare them, keeping count. As
soon as the two numbers from each instance are equal, the count is the
starting point of your repetition cycle.
Not all programming
languages may allow you to use separate instances of the
generator. In this case, I think it's a much longer
process, using multiple iterations storing sets of consecutive random
numbers in an array during each iteration, and re-seeding the generator
after each iteration. The number of iterations required would
depend on p, and the size of the array. I'd be interested if
someone knows how to do this part more efficiently.
Edited on February 12, 2005, 6:16 pm