Pseudorandom number generator
A pseudorandom number generator (PRNG) is an algorithm that generates a sequence of numbers whose elements appear approximately independent of one another.
The outputs of pseudorandom number generators are not random; they only approximate some of the properties of random numbers. John von Neumann observed in 1951 that "Anyone who considers arithmetical methods of producing random digits is, of course, in a state of sin." (For attempts to generate "truly random" numbers, see the article on hardware random number generators.) Nevertheless, pseudorandom numbers are a critical part of modern computing, from cryptography to the Monte Carlo method for simulating physical systems. Careful mathematical analysis is required to ensure that the generated numbers are sufficiently "random"; as Robert R. Coveyou of Oak Ridge National Laboratory once remarked, "The generation of random numbers is too important to be left to chance."
Most such algorithms attempt to produce samples that are uniformly distributed. Common classes of algorithms are linear congruential generators, lagged Fibonacci generators, linear feedback shift registers and generalised feedback shift registers.
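As an illustration of the simplest of these classes, a linear congruential generator can be sketched in a few lines of Python. The function names are hypothetical, and the default constants (multiplier 1664525, increment 1013904223, modulus 2^32) are one commonly published parameter set chosen here as an assumption; the article itself does not prescribe any parameters.

```python
from itertools import islice


def lcg(seed, a=1664525, c=1013904223, m=2**32):
    """Yield an endless stream of integers in [0, m) using x -> (a*x + c) mod m.

    The default constants are an illustrative assumption, not a recommendation.
    """
    state = seed % m
    while True:
        state = (a * state + c) % m
        yield state


def lcg_uniform(seed, a=1664525, c=1013904223, m=2**32):
    """Scale the integer stream to floats in [0, 1), approximating a uniform sample."""
    for x in lcg(seed, a, c, m):
        yield x / m


if __name__ == "__main__":
    # Print the first five scaled values for an arbitrary seed.
    print(list(islice(lcg_uniform(42), 5)))
```

Dividing by the modulus is the usual way to turn the integer state into an approximately uniform value on the unit interval.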
More recent algorithms include Blum Blum Shub and the Mersenne Twister.
Because a PRNG is a deterministic algorithm, its output has certain properties that a true random sequence would not exhibit. One of these is guaranteed periodicity: if the generator uses only a fixed amount of memory, then after a sufficient number of iterations it must return to an internal state it has already visited, after which it repeats forever. A generator that is not periodic can be designed, but its memory requirements would slowly grow as it ran. In addition, a PRNG can be started from an arbitrary starting point, or seed state, and will always produce an identical sequence from that point on.
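Both properties can be observed directly on a deliberately tiny generator. The sketch below assumes a toy linear congruential step with modulus 16 (the constants and function names are hypothetical and chosen only so the whole state space can be enumerated); it records when each state was first seen and stops at the first repeat.

```python
def step(state, a=5, c=3, m=16):
    """One update of a toy generator with only m possible internal states."""
    return (a * state + c) % m


def sequence(seed, count, a=5, c=3, m=16):
    """Return the first `count` states produced from `seed`."""
    out, state = [], seed
    for _ in range(count):
        state = step(state, a, c, m)
        out.append(state)
    return out


def find_period(seed, a=5, c=3, m=16):
    """Iterate until a state repeats.

    Returns (number of steps before the cycle is entered, cycle length).
    """
    first_seen = {}
    state, n = seed, 0
    while state not in first_seen:
        first_seen[state] = n
        state = step(state, a, c, m)
        n += 1
    start = first_seen[state]
    return start, n - start


if __name__ == "__main__":
    print(find_period(0))                      # cycle through all 16 states
    print(sequence(7, 8) == sequence(7, 8))    # same seed, same sequence
```

With these toy constants the generator visits all 16 states before repeating; other constants can give a much shorter cycle, or a run of states that is visited once before the cycle begins.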
In practice, many PRNGs exhibit artifacts that cause them to fail statistical tests for randomness. These include:
- Shorter than expected periods for some seed states (see Linear Congruential Generators)
- Poor dimensional distribution (see Linear Congruential Generators)
- Lack of independence between successive values (see Linear Congruential Generators)
- Some bits being more random than others (see Linear Congruential Generators)
- Lack of uniformity
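The "some bits are more random than others" artifact is easy to reproduce. The sketch below assumes a linear congruential generator with a power-of-two modulus and odd multiplier and increment (the specific constants and the function name are illustrative assumptions); under those conditions the lowest bit simply alternates, and the lowest two bits repeat with period at most four.

```python
def lcg_states(seed, count, a=1664525, c=1013904223, m=2**32):
    """Return `count` successive states of x -> (a*x + c) mod m."""
    state = seed % m
    out = []
    for _ in range(count):
        state = (a * state + c) % m
        out.append(state)
    return out


values = lcg_states(seed=12345, count=16)

# With a and c both odd and m a power of two, the parity flips on every step.
low_bits = [v & 1 for v in values]

# The lowest two bits depend only on the previous lowest two bits,
# so they cycle with period at most 4.
low_two_bits = [v & 3 for v in values]

if __name__ == "__main__":
    print(low_bits)
    print(low_two_bits)
```

This is why implementations built on such generators commonly return only the high-order bits of the state.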
The Mersenne Twister algorithm, invented by Makoto Matsumoto and Takuji Nishimura in 1997, avoids most of these problems. It has a colossal period of 2^19937 - 1 iterations, is proven to be equidistributed in 623 dimensions (for 32-bit values), and runs faster than all but the least statistically desirable generators. It is increasingly accepted as the random number generator of choice for statistical simulations and generative modeling.
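For experimentation, the Mersenne Twister is readily available: the standard `random` module of Python documents it as its core generator. The short sketch below uses only that module; the seed value is arbitrary. It shows two generators seeded identically producing the same 32-bit words, and inspects the size of the internal state.

```python
import random

# Two independent generator objects started from the same seed.
rng_a = random.Random(2024)
rng_b = random.Random(2024)

words_a = [rng_a.getrandbits(32) for _ in range(5)]
words_b = [rng_b.getrandbits(32) for _ in range(5)]

# getstate() returns (version, internal_state, gauss_next); in CPython the
# internal state holds 624 32-bit words plus one position index.
version, internal_state, _ = rng_a.getstate()

if __name__ == "__main__":
    print(words_a == words_b)
    print(len(internal_state))
```

The 624 words of 32 bits each account for the 19937-bit state behind the period quoted above (624 x 32 = 19968, of which 31 bits are unused).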
However, the output of the Mersenne Twister can be analyzed efficiently and recognized as non-random (see the Berlekamp-Massey algorithm). A PRNG that appears to resist such analysis is called a cryptographically secure PRNG (CSPRNG).
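A concrete way to see why the Mersenne Twister is unsuitable for cryptography is a state-recovery sketch. Note that this is not the Berlekamp-Massey approach mentioned above but a simpler, well-known technique: each 32-bit output is an invertible "tempering" of one state word, so 624 consecutive outputs reveal the whole state. The code assumes CPython's `random` module and its state tuple layout (version 3, 624 words plus an index), which is an implementation detail; the helper names are hypothetical.

```python
import random


def undo_right_shift_xor(y, shift):
    """Invert y = x ^ (x >> shift) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def undo_left_shift_xor(y, shift, mask):
    """Invert y = x ^ ((x << shift) & mask) for a 32-bit x."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x


def untemper(y):
    """Reverse the four MT19937 tempering steps, last step first."""
    y = undo_right_shift_xor(y, 18)
    y = undo_left_shift_xor(y, 15, 0xEFC60000)
    y = undo_left_shift_xor(y, 7, 0x9D2C5680)
    y = undo_right_shift_xor(y, 11)
    return y


def clone_from_outputs(outputs):
    """Build a generator that continues the stream seen in 624 32-bit outputs."""
    if len(outputs) != 624:
        raise ValueError("exactly 624 consecutive 32-bit outputs are required")
    # Index 624 tells the generator that every word has been consumed.
    state = tuple(untemper(v) for v in outputs) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))  # layout is a CPython assumption
    return clone


victim = random.Random(1234)                      # seed unknown to the observer
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_from_outputs(observed)

predicted = [clone.getrandbits(32) for _ in range(10)]
actual = [victim.getrandbits(32) for _ in range(10)]

if __name__ == "__main__":
    print(predicted == actual)
```

An observer who never learns the seed can therefore predict every later output, which is exactly the property a CSPRNG is meant to rule out.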
There are a number of examples of CSPRNGs. Blum Blum Shub has the strongest security proofs, though it is slow. Most stream ciphers work by generating a pseudorandom stream of bits that is XORed with the message; this stream can serve as a good CSPRNG (though not always: see RC4). A secure block cipher can also be converted into a CSPRNG by running it in counter mode: choose an arbitrary key, then encrypt 0, then 1, then 2, and so on. The counter can also be started at an arbitrary number other than zero. The period is 2^n for an n-bit block cipher. A cryptographically secure hash of a counter might also act as a good CSPRNG in some cases. If the counter is an arbitrary-precision integer (bignum), the generator never has to wrap around and so need not be periodic, at the cost of slowly growing memory. Finally, there are PRNGs that have been designed specifically to be cryptographically secure. One example is ISAAC (based on a variant of RC4), which is fast, has an expected cycle length of 2^8295, and for which no successful attack in a reasonable time has yet been found.
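The hash-of-a-counter construction can be sketched with the Python standard library. The version below is an assumption-laden illustration rather than a vetted design: it uses a keyed hash (HMAC-SHA-256) instead of a bare hash, encodes the counter as a variable-length big-endian integer so that it can grow without bound, and uses a hypothetical class name. Its security rests entirely on the key being secret and unpredictable.

```python
import hashlib
import hmac


class HashCounterGenerator:
    """Illustrative generator: block i is HMAC-SHA-256(key, encoding of i)."""

    def __init__(self, key: bytes, counter: int = 0):
        if counter < 0:
            raise ValueError("counter must be non-negative")
        self._key = key
        self._counter = counter  # a Python int, so it never wraps around

    def next_block(self) -> bytes:
        """Return the next 32-byte block and advance the counter."""
        # Minimal big-endian encoding; zero is encoded as a single zero byte.
        length = max(1, (self._counter.bit_length() + 7) // 8)
        message = self._counter.to_bytes(length, "big")
        self._counter += 1
        return hmac.new(self._key, message, hashlib.sha256).digest()

    def random_bytes(self, n: int) -> bytes:
        """Return n bytes, discarding any unused tail of the last block."""
        out = bytearray()
        while len(out) < n:
            out += self.next_block()
        return bytes(out[:n])


if __name__ == "__main__":
    gen = HashCounterGenerator(key=b"example key, not secret")
    print(gen.random_bytes(16).hex())
```

Starting the counter at a value other than zero, as the text describes, only requires passing a different `counter` argument. For real applications, an operating-system source such as Python's `secrets` module is the safer choice.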
References:
- Donald E. Knuth, The Art of Computer Programming, Volume 2: Seminumerical Algorithms, 3rd edition (Addison-Wesley, Boston, 1998).
- The GNU Scientific Library, http://www.gnu.org/software/gsl/. A free (GPL) C library that includes a number of PRNG algorithms.