|
Random numbers are the basis for dealing with complex behavior in
systems containing large numbers of objects. Cryptography, games, and many
statistical models in mathematics, physical sciences, and social sciences
rely on random numbers. In cryptography, for example,
random number
generators are often used to construct
keys for encryption of data and decryption of data. Random number
generators are also used in computer games in order to better simulate the
behavior of computer-controlled characters. And, of course, random numbers
are used extensively in mathematics and the sciences. The best example is
the operation of the Monte
Carlo
method, which draws extensively on random numbers in order to calculate
precise deterministic solutions.
However, random numbers are not
always truly random. What we would refer to as a “true” random number can
only be observed in nature in the form of dice throws or radioactive decay;
unfortunately, the creation of random numbers through such means is slow and
impractical. Instead, we often generate sets of numbers which are only
pseudorandom, which means – in the long
run and over enough trials – a pattern will eventually emerge.
 |
| Figure 1: John von Neumann |
|
In order to study large-scale
interactions, we create sequences of numbers that approximate randomness
using mathematical algorithms. These numbers are inherently nonrandom
because they are generated by
deterministic
mathematical processes. John von Neumann (Fig. 1), one of the first mathematicians
to extensively study random numbers, once said “Anyone who considers
arithmetical methods of producing random digits is, of course, in a state of
sin.” That’s because such approaches to creating random numbers in not
truly random. Hence, they are known as pseudorandom number.
Different pseudorandom number
generators (or PRNGs) approximate different properties of random numbers;
thus, any given PRNG may only be used for one specific type of problem. |
For example, a generator that produces unpredictable but not uniformly
distributed number sequences may be useful in cryptography but not in the
Monte Carlo method. Now we will examine three
different types of random number generators and experiment with each one.
Activity #1: The
Middle-Square method.
The Middle-Square method
–developed by von Neumann for use in high-speed calculations with the first
true computer (ENIAC) – is seriously flawed. But it provides a simple means
to examine pseudorandom numbers. The algorithm that creates these numbers
is pretty simple and straightforward:
1.
Begin with an n-digit
seed number.
2. Square it to obtain a 2n-digit
number, adding a leading zero if necessary.
3. Take the middle n
digits as the next random number.
4. Repeat.
 |
| Figure 2: R.A. Fisher |
|
A table of random numbers would
have the unique property that no matter how the sequence was selected (by
row, column, diagonal or irregularly), the numbers in the sequence would
always be random. The first such table was published in 1927 and many
similar tables were developed around the same time. These first tables of
random numbers were generated through a variety of ways; for example, the
British mathematician L.H.C. Tippett took his table of numbers “at random”
from census registers while statisticians R.A. Fisher (Fig. 2) and
Francis Yates (Fig. 3) used numbers taken “randomly” from logarithm
tables. |
 |
| Figure 3: Francis Yates |
|
Most
mathematicians and scientists preferred to generate their own lists of
random numbers, but using more reliable methods than census figures or
logarithms (which are not necessarily random). This leads us to three
approaches developed over the past fifty years.
Example:
Consider the following example of
using the algorithm.
|
1. 4739 2. (4739)2 =
22458121
3. 22458121 → 4581
4. (4581)2 =
20985561
5. 20985561 → 9855
6. (9855)2 =
97121025
|
 |
| Figure 4: Example of the Middle-Square Method |
|
7. 97121025 → 1210
8. . . . and continue
(with the sequence 4739458198551210...).
a) Now you try it. Start with the
4-digit number 8193 and complete six (6) steps. What are your last four (4)
numbers in the sequence?
b) Let’s try another one. Start
with the 4-digit number 1049 and complete six (6) steps. What are your last
four (4) numbers in the sequence?
In this case, you noticed one of the
flaws in the Middle-Square method. Some seed numbers will produce a string
of 0s and you’ll get no more random numbers! And that isn't all.
c) Try the 4-digit number 7600.
Complete six (6) steps. What happens to this sequence?
That’s right; you get the same number
over and over again. So you can see, it’s not a very good random number
generator. But it’s not all that bad.
d) Let’s do one more. Start with
the 6-digit number 279473 and complete six (6) steps. What are your last
four (4) numbers in the sequence?
Activity #2: The Linear
Congruence method.
Derrick Lehmer
(Fig. 5) developed this
method in 1951. It is a much more reliable pseudorandom number generator,
although it too cycles over enough time. But it is an easy PRNG to study
and worth looking at.
 |
| Figure 5:
Derrick Lehmer |
|
A number of
mathematicians were working on random number generators at the same
time as Lehmer, and one of the key developments was that of figuring
out how to identify whether a sequence of numbers was actually
random.
The first tests for
random numbers were developed by Maurice Kendall (Fig. 6) and
Bernard Babington Smith in 1938. They were based upon the chi-square
test (which was formulated in order to determine whether or not
experimental results matched up with predicted theoretical
probabilities). |
 |
| Figure 6: Maurice Kendall |
|
Kendall and Smith’s four tests were
hypothesis tests, which took as the
null
hypothesis the rule that each number in a given random
sequence had an equal chance of occurring. These tests are:
1. The frequency
test. This test simply checked to make sure that there were
roughly the same number of 0s, 1s, 2s, 3s, etc.
2. The serial test.
This test was a 2-digit version of the frequency test; comparing
2-digit combinations and their observed frequencies with the expectation
that they were equally distributed.
3. The poker test. This
test looked for specific sequences of five numbers at a time (11111,
11112, 11122, etc). It’s called the poker test because it was based on
hands in the card game poker.
4. The gap test. This
test studied the “distances” – or the number of numbers found – between
0s (00 would be a distance of 0, 030 would be a distance of 1, 01740
would be a distance of 3, etc.).
Kendall and Smith assumed that if a sequence of
numbers was able to pass all these tests within a reasonable degree of
significance (usually around 5%), then it was believed to be
locally random. The two mathematicians separated “local”
randomness from “true” randomness by explaining that many sequences
generated with true random methods might not display “local” randomness to a
given degree; large sequences might actually contain rows of a single digit
which would not appear to be random. This might be “random” on the scale of
the entire sequence, but not for a smaller block of numbers.
As you can see in the previous example, the
Middle-Square method fails several of these tests for specific
number sequences. However, Lehmer’s Linear Congruence method
does show local randomness, as you will see after working through the next
activity.
The method uses the following
algorithm:
1. Xn+1 = (a∙Xn
+ b) mod c [given a seed value of Xo and integers
a, b, and c]
2. Let Xn = Xn+1
and repeat as necessary.
Example:
Let Xo = 7 and integers
a = 1, b = 7, and c = 10.
|
1. X1 = (1∙7 +
7) mod 10 = 4 2. X2 = (1∙4 +
7) mod 10 = 1
3. X3 = (1∙1 +
7) mod 10 = 8
4.
X4 = (1∙8 +
7) mod 10 = 5
5. X5 = (1∙5 +
7) mod 10 = 2
6. X6 = (1∙2 +
7) mod 10 = 9
7. X7 = (1∙8 +
7) mod 10 = 5
8. . . . and continue
(with the sequence 7418529...).
|
 |
| Figure 7: Example of the
Linear Congruence Method |
|
The reason this approach is called the linear
congruence method can be readily seen by the diagram (Fig. 7). A
string of pseudorandom points, when plotted on a graph of output values vs.
seed values, forms a simple linear function.
In this case, the sequence cycles after every ten terms; feel free to
continue to run the algorithm to find when it repeats.
a) Now you try it. Let Xo
= 4 and integers a = 1, b = 3, and c = 5. Complete six
(6) steps. What are your last four (4) numbers in the sequence?
b) Let’s try another one. Let Xo
= 12 and integers a = 3, b = 11, and c = 14. Complete
six (6) steps. What are your last four (4) numbers in the sequence?
c) Finally, let’s do a really
nasty one. Let Xo = 87 and integers a = 14, b =
54, and c = 39. Complete six (6) steps. What are your last four (4)
numbers in the sequence?
Activity #3: The Mersenne
Twister.
One of the newest PRNG is the
Mersenne Twister, originally developed by mathematicians Makoto Matsumoto
and Takuji Nishimura in 1997. This PRNG is now often used in place of
linear congruent generators like the example
above. The generator runs faster than all but the least statistically sound
pseudorandom number generators. It appears to be distributed uniformly in
623 dimensions (!!) and has passed numerous tests for randomness.
The Mersenne Twister gets its
name from its huge period of 219937–1. This number is one of the
largest known Mersenne primes. It is believed that the Mersenne Twister would
probably take longer to cycle than the entire future existence of humanity
(and, perhaps, the universe). However, if you observe enough numbers
generated by the Mersenne Twister, the generator would allows all future
numbers to be predicted; therefore, the Mersenne Twister is not suitable in
cryptography. This illustrates the fact that no single PRNG is the best
choice for all applications.
Here’s your last activity with this
before moving on to several computer simulations:
1. Who was Marin Mersenne?
What is a Mersenne prime?
2. Give an example of a
sequence of numbers from the Mersenne Twister.
3. Give an example of how
the Mersenne Twister could be used.
|