In doing computer simulations, such as
the one I wrote for simulating the results of part 2 of Rumor Mill, one often uses a random number generator that's built into the computer language. These generators are based on a seed (some arbitrary number) and are mixed up at each step that calls for a random number. There are only a finite number of seeds, and each one is based mechanically on the previous, so they necessarily repeat after a while.
The seed is kept internally, away from the programmer's view, so the programmer can't ask for, say, the next random number after .753372.
If one suspects that the repetition cycle is actually occuring within the length of the run that he needs, what algorithm can you put into the program to find the period with which your results are repeating (and are therefore no longer random, or rather no longer independent trials)? Assume you do not have room to store all the numbers as they arrive, nor can you afford the time it would take to compare each new number to all the preceding numbers.
Then also, how do you determine where the repetition cycle begins (after what iteration of the loop of trials).
Obviously, it would be inefficient to attempt to store all the numbers. We can't just compare one number (or even several) since in theory, a few numbers can repeat without repeating the entire sequence.
If we create a circular buffer, we can determine where the sequence begins to repeat with a very high probability. We need to determine a number of comparisons and store all the original numbers. For example, if we wanted a relatively irregular comparison sequence, we would store the following values:
Iterations 1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024
Of course, these can be deviated independently. Whenever the initial value is encountered, we check the 2nd value. If that matches, we check the 4th after that, then the 8th, and so on. If all 11 numbers in that sequence match, it's extremely likely that the sequence has repeated. This should work even if the initial number appears in the first 1024 numbers, and it may work even if the cycle repeats before 1024 numbers.
I think even with this case that there is some infinitesimal probability that all 11 comparisons could match up without the sequence actually repeating. All we have to do to determine where the sequence repeats is keep a running counter and store only the values where iteration 1 is matched. When the chain is broken (think Weakest Link) we ignore and/or overwrite that value.
|
Posted by Eric
on 2005-02-12 15:53:50 |