How it works
        Random numbers are the basis for dealing with complex behavior in systems containing large numbers of objects.  Cryptography, games, and many statistical models in mathematics, physical sciences, and social sciences rely on random numbers.  In cryptography, for example, random number generators are often used to construct keys for encryption of data and decryption of data.  Random number generators are also used in computer games in order to better simulate the behavior of computer-controlled characters.  And, of course, random numbers are used extensively in mathematics and the sciences.  The best example is the operation of the Monte Carlo method, which draws extensively on random numbers in order to calculate precise deterministic solutions.

        However, random numbers are not always truly random.  What we would refer to as a “true” random number can only be observed in nature in the form of dice throws or radioactive decay; unfortunately, the creation of random numbers through such means is slow and impractical.  Instead, we often generate sets of numbers which are only pseudorandom, which means – in the long run and over enough trials – a pattern will eventually emerge.

Figure 1: John von Neumann

        In order to study large-scale interactions, we create sequences of numbers that approximate randomness using mathematical algorithms.  These numbers are inherently nonrandom because they are generated by deterministic mathematical processes.  John von Neumann (Fig. 1), one of the first mathematicians to extensively study random numbers, once said “Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin.”  That’s because such approaches to creating random numbers in not truly random.  Hence, they are known as pseudorandom number.

        Different pseudorandom number generators (or PRNGs) approximate different properties of random numbers; thus, any given PRNG may only be used for one specific type of problem. 

         For example, a generator that produces unpredictable but not uniformly distributed number sequences may be useful in cryptography but not in the Monte Carlo method. Now we will examine three different types of random number generators and experiment with each one.

 Activity #1: The Middle-Square method.

        The Middle-Square method –developed by von Neumann for use in high-speed calculations with the first true computer (ENIAC) – is seriously flawed.  But it provides a simple means to examine pseudorandom numbers.  The algorithm that creates these numbers is pretty simple and straightforward:

        1.     Begin with an n-digit seed number.

        2.     Square it to obtain a 2n-digit number, adding a leading zero if necessary.

        3.     Take the middle n digits as the next random number.

        4.     Repeat.

Figure 2: R.A. Fisher

        A table of random numbers would have the unique property that no matter how the sequence was selected (by row, column, diagonal or irregularly), the numbers in the sequence would always be random.  The first such table was published in 1927 and many similar tables were developed around the same time.  These first tables of random numbers were generated through a variety of ways; for example, the British mathematician L.H.C. Tippett took his table of numbers “at random” from census registers while statisticians R.A. Fisher (Fig. 2) and Francis Yates (Fig. 3) used numbers taken “randomly” from logarithm tables.

Figure 3: Francis Yates

        Most mathematicians and scientists preferred to generate their own lists of random numbers, but using more reliable methods than census figures or logarithms (which are not necessarily random).  This leads us to three approaches developed over the past fifty years.

Example:

Consider the following example of using the algorithm.

        1.     4739

        2.     (4739)2 = 22458121

        3.     22458121 → 4581

        4.     (4581)2 = 20985561

        5.     20985561 → 9855

        6.     (9855)2 = 97121025

Figure 4: Example of the Middle-Square Method

        7.     97121025 → 1210

        8.     . . . and continue (with the sequence 4739458198551210...).

a)     Now you try it.  Start with the 4-digit number 8193 and complete six (6) steps.  What are your last four (4) numbers in the sequence?

b)     Let’s try another one.  Start with the 4-digit number 1049 and complete six (6) steps.  What are your last four (4) numbers in the sequence?

In this case, you noticed one of the flaws in the Middle-Square method.  Some seed numbers will produce a string of 0s and you’ll get no more random numbers!  And that isn't all.

c)     Try the 4-digit number 7600.  Complete six (6) steps.  What happens to this sequence?

That’s right; you get the same number over and over again.  So you can see, it’s not a very good random number generator.  But it’s not all that bad. 

d)     Let’s do one more.  Start with the 6-digit number 279473 and complete six (6) steps.  What are your last four (4) numbers in the sequence? 

Activity #2: The Linear Congruence method. 

        Derrick Lehmer (Fig. 5) developed this method in 1951.  It is a much more reliable pseudorandom number generator, although it too cycles over enough time.  But it is an easy PRNG to study and worth looking at.

Figure 5: Derrick Lehmer

       A number of mathematicians were working on random number generators at the same time as Lehmer, and one of the key developments was that of figuring out how to identify whether a sequence of numbers was actually random.

        The first tests for random numbers were developed by Maurice Kendall (Fig. 6) and Bernard Babington Smith in 1938. They were based upon the chi-square test (which was formulated in order to determine whether or not experimental results matched up with predicted theoretical probabilities).

Figure 6: Maurice Kendall

        Kendall and Smith’s four tests were hypothesis tests, which took as the null hypothesis the rule that each number in a given random sequence had an equal chance of occurring.  These tests are:

1. The frequency test. This test simply checked to make sure that there were roughly the same number of 0s, 1s, 2s, 3s, etc.

2. The serial test. This test was a 2-digit version of the frequency test; comparing 2-digit combinations and their observed frequencies with the expectation that they were equally distributed.

3. The poker test. This test looked for specific sequences of five numbers at a time (11111, 11112, 11122, etc).  It’s called the poker test because it  was based on hands in the card game poker.

4. The gap test. This test studied the “distances” – or the number of numbers  found – between 0s (00 would be a distance of 0, 030 would be a distance of 1, 01740 would be a distance of 3, etc.).

        Kendall and Smith assumed that if a sequence of numbers was able to pass all these tests within a reasonable degree of significance (usually around 5%), then it was believed to be locally random.  The two mathematicians separated “local” randomness from “true” randomness by explaining that many sequences generated with true random methods might not display “local” randomness to a given degree; large sequences might actually contain rows of a single digit which would not appear to be random.  This might be “random” on the scale of the entire sequence, but not for a smaller block of numbers.

        As you can see in the previous example, the Middle-Square method fails several of these tests for specific number sequences.  However, Lehmer’s Linear Congruence method does show local randomness, as you will see after working through the next activity.

The method uses the following algorithm:

        1.     Xn+1 = (a∙Xn + b) mod c   [given a seed value of Xo and integers a, b, and c]

        2.     Let Xn = Xn+1 and repeat as necessary.

Example:

Let Xo = 7 and integers a = 1, b = 7, and c = 10.

         1.     X1 = (1∙7 + 7) mod 10 = 4

        2.     X2 = (1∙4 + 7) mod 10 = 1

        3.     X3 = (1∙1 + 7) mod 10 = 8

        4.     X4 = (1∙8 + 7) mod 10 = 5

        5.     X5 = (1∙5 + 7) mod 10 = 2

        6.     X6 = (1∙2 + 7) mod 10 = 9

         7.     X7 = (1∙8 + 7) mod 10 = 5

        8.     . . . and continue (with the sequence 7418529...).

Figure 7: Example of the Linear Congruence Method

        The reason this approach is called the linear congruence method can be readily seen by the diagram (Fig. 7).  A string of pseudorandom points, when plotted on a graph of output values vs. seed values, forms a simple linear function.

In this case, the sequence cycles after every ten terms; feel free to continue to run the algorithm to find when it repeats.

a)     Now you try it.  Let Xo = 4 and integers a = 1, b = 3, and c = 5.  Complete six (6) steps.  What are your last four (4) numbers in the sequence?

b)     Let’s try another one.  Let Xo = 12 and integers a = 3, b = 11, and c = 14.  Complete six (6) steps.  What are your last four (4) numbers in the sequence?

c)     Finally, let’s do a really nasty one.  Let Xo = 87 and integers a = 14, b = 54, and c = 39.  Complete six (6) steps.  What are your last four (4) numbers in the sequence?

Activity #3:     The Mersenne Twister.

        One of the newest PRNG is the Mersenne Twister, originally developed by mathematicians Makoto Matsumoto and Takuji Nishimura in 1997.  This PRNG is now often used in place of linear congruent generators like the example above. The generator runs faster than all but the least statistically sound pseudorandom number generators. It appears to be distributed uniformly in 623 dimensions (!!) and has passed numerous tests for randomness.

        The Mersenne Twister gets its name from its huge period of 219937–1. This number is one of the largest known Mersenne primes.  It is believed that the Mersenne Twister would probably take longer to cycle than the entire future existence of humanity (and, perhaps, the universe).  However, if you observe enough numbers generated by the Mersenne Twister, the generator would allows all future numbers to be predicted; therefore, the Mersenne Twister is not suitable in cryptography.  This illustrates the fact that no single PRNG is the best choice for all applications.

Here’s your last activity with this before moving on to several computer simulations:

        1.     Who was Marin Mersenne?  What is a Mersenne prime?

        2.     Give an example of a sequence of numbers from the Mersenne Twister.

        3.     Give an example of how the Mersenne Twister could be used.

How it works
Running the model
Try it
Find out more
Print and Email

 

 

Nome

University of Notre Dame
Notre Dame, Indiana 46556
Phone: 574-631-5000

Copyright © 2006 University of Notre Dame
Last modified: July 08, 2007 10:44 PM