Institute of Mathematical Statistics Statistical Science

Pseudorandom Numbers
Author(s): Jeffrey C. Lagarias

Source: Statistical Science, Vol. 8, No. 1, Report from the Committee on Applied and
Theoretical Statistics of the National Research Council on Probability and Algorithms
(Feb., 1993), pp. 31-39
Published by: Institute of Mathematical Statistics
Stable URL: https://www.jstor.org/stable/2246038
Accessed: 11-03-2020 12:49 UTC
REFERENCES
Linked references are available on JSTOR for this article:
https://www.jstor.org/stable/2246038?seq=1&cid=pdf-reference#references_tab_contents
You may need to log in to JSTOR to access the linked references.
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide
range of content in a trusted digital archive. We use information technology and tools to increase productivity and
facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org.
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
https://about.jstor.org/terms
Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and

extend access to Statistical Science
This content downloaded from 58.27.226.232 on Wed, 11 Mar 2020 12:49:01 UTC
All use subject to https://about.jstor.org/terms
Statistical Sciernce
1993, Vol. 8, No. 1, 31-39
Pseudorandom Numbers
Jeffrey C. Lagarias
Abstract. This article surveys the problem of generating pseudorandom

numbers and lists many of the known constructions of pseudorandom
bits. It outlines the subject of computational information theory. In this
theory the fundamental object is a secure pseudorandom bit generator.
Such generators are not theoretically proved to exist, although functions
are known that appear to possess the required properties. In any case,
pseudorandom number generators are known that work reasonably well
in practice.
Key words and phrases: Computational information theory, cryptogra-

phy, data encryption standard, discrete exponentiation, multiplicative
congruential generator, one-way function, private key cryptosystem,
pseudorandom numbers, RSA public key cryptosystem.
1. INTRODUCTION It is possible to simulate samples of any reasonable

A basic ingredient needed in any algorithm using distribution using as input a sequence of i.i.d. 0-1
randomization is a source of "random" bits. In practice, valued random variables (Devroye, 1986; and Knuth
in probabilistic algorithms or Monte Carlo simulations, and Yao, 1976). Hence the problem of constructing
one uses instead "random-looking" bits. A pseudoran- pseudorandom numbers is in principle reducible to that
dom bit generator is a deterministic method to produce of constructing pseudorandom bits.
from a small set of "random" bits called the seed a Section 2 describes explicitly some pseudorandom
larger set of random-looking bits called pseudorandom bit generators, as well as some general principles for
bits. constructing pseudorandom bit generators. Most of
There are several reasons for using pseudorandom these generators have underlying group-theoretic or
number-theoretic structure.
bits. First, truly random bits are hard to come by.
Physical sources of supposedly random bits, which Section 3 describes the subject of computational
rely on chaotic, dissipative processes such as varactor information theory and indicates its connection to cryp-
diodes, typically produce correlated bits rather than tography. The basic object in this theory is the concept
independent sequences of bits. Such sources also proof a secure pseudorandom bit generator, which was
duce bits rather slowly (Schuster, 1988). Hence one proposed by Blum and Micali (1984) and Yao (1982).
would like to conserve the number of random bits The basic properties characterizing a secure pseudoran-
needed in a computation. Second, the deterministic dom bit generator are "randomness-increasing" and
character of pseudorandom bit sequences permits the "computationally unpredictable." Recently obtained re-
easy reproducibility of computations. A third reason sults are that if one of the following objects exists,
then they all exist:
arises from cryptography: the existence of secure pseu-
dorandom bit generators is essentially equivalent to 1. a secure pseudorandom bit generator;
the existence of secure private-key cryptosystems. 2. a one-way function; and
For Monte Carlo simulations, one often wants pseu- 3. a secure (block-type) private-key cryptosystem.
dorandom numbers, which are numbers simulating
either independent draws from a fixed probability dis- The central unsolved question is whether any of these
tribution on the real line R or more generally numbers objects exist. However, good candidates are known for
simulating samples from a stationary random process. one-way functions. A major difficulty in settling the
existence problem for this theory is summarized in the
following heuristic.
Jeffrey C. Lagarias is a Distinguished Member of Tech- UNPREDICTABILITY PARADOX. If a deterministic

nical Staff, Math Sciences Research Center, AT&T Bell function is unpredictable, then it is difficult to prove
Laboratories, 600 Mountain Avenue, Murray Hil4 New anything about it; in particular, it is difficult to prove
Jersey 07974. that it is unpredictable.
31
32 J. C. LAGARIAS
Section 4 surveys results on the statistical perfor- in T( IA I) steps and outputs A has length greater than
mance of various known pseudorandom number gener- IA l. This notion of incompressibility is effectively com-
ators. In particular, a pseudorandom bit generator putable. One can similarly define incompressibility of
called the RSA generator produces secure pseudoran- ensembles of strings S with respect to time complexity
dom bits (in the sense of Section 2) if the problem of classes such as PTIME. In fact, the accepting function
factoring certain integers is computationally difficult. for membership in a "hard" set in a time-complexity
A good general reference for pseudorandom number class 3 behaves "randomly" when viewed as input to a
generators is Knuth (1981, Chapter 3). Turing machine that has a more restricted amount of
time available. This idea goes back at least to Meyer
and McCreight (1971).
2. EXPLICIT CONSTRUCTIONS OF PSEUDORANDOM
The three principles all involve some notion of func-
BIT GENERATORS
tional composition or of iteration of a function. Indeed,
Where might pseudorandom number generators running a computation of a Turing machine can be
come from? There are several general principles useful viewed as a kind of functional composition with feed-
in constructing deterministic sequences exhibiting back. The first two principles directly lead to can-
quasi-random behavior: expansiveness, nonlinearity didates for number-theoretic pseudorandom number
and computational complexity. generators (see below). The computational complexity
Expansiveness is a concept arising from dynamical principle leads to pseudorandom number generators
systems. A flow ft on a metric space is expansive (or having provably good properties with respect to some
hyperbolic) at a point x E M if nearby trajectories level of computational resources, but most of these are
diverge (locally) at an exponential rate from each other, not of use in practice because one must use more than
that is, Ift(x) - ft(y)I >- IIx - yI, where A > 1, providedthe allowed bound on computation to find them in the
Ix - yI < c, and for 0 c t < to. Such flows exhibit sensi- first place. Finally, we note that there are very close
tive dependence on initial conditions and are numeri- relations between expansiveness properties of map-
cally unstable to compute. A discrete version of this pings and the Kolmogorov-complexity of descriptions
phenomenon for maps on the interval I = [0, 1] is a of their trajectories (Brudno, 1982).
map T:I I such that IT'(x)I > 1 for almost all x E A variety of functions have been proposed for use in
[0, 1]; for example, T(x) = Ox (mod 1), where 0 > 1. In pseudorandom bit generators. We give several exam-
particular, it requires knowledge of at least the first ples below. Most of these functions are number-theo-
O(n) bits of T W>(x) to determine if x E [0, 1/2) or x E retic, and nearly all of them have an underlying group
[1/2, 1) as n -E xo. structure.
Nonlinearity provides a second source of determin-
EXAMPLE 1 (Multiplicative congruential generator).
istic randomness. Functional compositions of linear
The generator is defined by
maps are themselves linear, but nonlinear polynomial
maps, when composed, can yield exponentially growing Xn+1--aXn + b (mod M),
nonlinearity; for example, if f(x) = x2 and f(')(x) =
where 0C Xn M - 1. Here (a, b, M) are the parame-
f ( ri-1(x)), then f(n)(X) = X2n. Even the simple nonlinear
ters describing the generator, and xo is the seed. This
map f(x) = ax(l - x) for a e [0, 41, which maps [0, 11was one of the first proposed pseudorandom number
into [0, 1], exhibits extremely complicated dynamics
generators. Generators of this form are widely used in
under iteration (Collet and Eckmann, 1980).
practice in Monte Carlo methods, taking xJIM to sim-
Computational complexity is a third source of de-
ulte uniform draws on [0, 1] (Knuth, 1981; Marsaglia,
terministic randomness. Kolmogorov (1965), Chaitin
1968; Rubinstein, 1982). More generally, one can con-
(1966) and Martin-Lof (1966) defined a finite string
sider polynomial recurrences (mod M).
A = a, ... ak of bits to be Kolmogorov-random if the
length of the shortest input to a fixed universal Turing EXAMPLE 2 (Power generator). The generator is de-
machine 3 that halts and outputs A is of length greater fined by
than or equal to k - co for a constant cO. This notion of
Xn+1 - (xn) (mod N).
randomness is that of computational incompressibility.
Unfortunately, it is an undecidable problem to tell Here (d, N) are parameters describing the generator,
whether a given string A is Kolmogorov-random. How- and xo is the seed.
ever, one obtains weaker notions of computational An important special case of the power generator
incompressibility by restricting the amount of compu- occurs when N = PlP2 is a product of two distinct odd
tation allowed in attempting to compress the string primes. The first case occurs when (d, p,(N)) = 1, where
A. For example, given a nondecreasing time-counting
(N) = #(ZINZ)* = (pi - l)(p2 - 1)
function T such as T(x) = 5x3, a string A is
T-incompressible if the shortest input to 5 that halts is Euler's totient function that counts the number of
PSEUDORANDOM NUMBERS 33
multiplicatively invertible elements (mod N). Then the Zn= H(xn).

map x -- xd (mod N)-, is one-to-one on (ZINZ)*, and
The traditional use of hash functions is for data
this operation is the encryption operation of the RSA
compression, in which case 1 is taken much smaller
public-key cryptosystem (Rivest, Shamir and Adle-
than k. Sets of good hash functions play an important
man, 1978), where (d, N) are publicly known. We call
role in the theory of pseudorandom generators, where
this special case an RSA generator.
they are used to convert highly nonuniform probability
EXAMPLE 3 (Discrete exponential generator). The distributions on k inputs into nearly uniform distribu-
generator is defined by tions on 1 inputs; see, for instance, Goldreich, Kraw-
czyk and Luby (1988).
Xn+1 3 gxn (mod N).
A very special case of this construction is the trunca-
Here (g, N) are parameters describing the generator, tion operator
and xo is the seed.
Tj,l(x) = [21jxJ (mod 21),
A special case of importance occurs when N is an
odd prime p and g is a primitive root (mod p). Then which extracts from a k-bit string x the substring of 1
the problem of recovering xn given (Xn+ , g, p) is the bits starting j bits from its right end. In its most
discrete logarithm problem, an apparently hard num- extreme form, T(x) _ x (mod 2) extracts the least sig-
ber-theoretic problem (Odlyzko, 1985). The discrete ex- nificant bit. The truncation operation was originally
ponentiation operation (mod p) was suggested for suggested by Knuth to be applied to linear congru-
cryptographic use in the key exchange scheme of Diffie ential generators to cut off half their bits (saving the
and Hellman (1976). A key exchange scheme is a high-order bits) as a way of increasing apparent ran-
method for two parties to agree on a secret key using domness. This idea was used in a number of schemes
an insecure channel. Blum and Micali (1984) gave a for encrypting passwords to computer systems. How-
method for generating a set of pseudorandom bits ever, it is insecure. Reeds (1979) successfully cryptana-
whose security rests on the difficulty of solving the lyzed one such system.
discrete logarithm problem.
CONSTRUCTION 2 (Composition). Let {xn4 and {yn4 be
EXAMPLE 4 (Kneading map). Consider a bivariate sequences of k-bit strings, let *: {O, 1}' X {O, 1}k
transformation {O, 1}1 be a binary operation, and set
(Xn+i ,Yn+i) (Yn, Xn + f (yn, Zn)), Zn = Xn * Yn.
where f is a fixed bivariate function, usually taken to The simplest case occurs with k = 1, where * is the
be nonlinear. The function f(,.) determines the genera- XOR operation
tor, and (xo, yo) and the family {zn constitute the seed.
Znx= xn + Yn (mod 2),
One often takesZn z= K for all integers n, for a fixed
K. This mapping has the feature that it is one-to-one, where Zn, xn, and Yn are viewed as k-dimensional vectors
with inverse over GF(2). More generally, whenever the operation
table of * on {O, 1}k X {O, l}k forms a Latin square (but
(Xn, Yn) (Yn+- f(Xn+i , Zn), Xn+). is not necessarily either commutative or associative),
One can generalize this construction to take x, y and

then the * operation has certain randomness-increasing
f(.,.) to be vector-valued. The data encryption standard properties when the elements {xn} and {yn} are indepen-
(DES) cryptosystem is composed of 16 iterations of
dent draws from fixed distributions Q and Q on {O, l}k
(Marsaglia, 1985).
(vector-valued) maps of this type, where (xo, yo) are the
There is a constant search for new and better prac-
data to be encrypted, all zi = K compose the key, and
f is a specific nonlinear function representable as a
tical pseudorandom number generators. Recently,
polynomial in several variables.
Marsaglia and Zaman (1991) proposed some simple
Other examples include the 1/P-generator and the
generators that have extremely long periods; whether
square generator studied in Blum, Blum and Shub they have good statistical properties remains to be
determined.
(1986) and cellular automata generators proposed by
Wolfram (1986).
In addition to these examples, more complicated 3. COMPUTATIONAL INFORMATION THEORY
pseudorandom number generators can be built out of AND CRYPTOGRAPHY
them using the following two mixing constructions.
In the last few years, there has been extensive develop-
CONSTRUCTION 1 (Hashing). If {xn} are binary strings ment of a theoretical basis of secure pseudorandom
of k bits and H: {f,}kl {O, 1}' is a fixed function, called number generators. This area is called computational
in this context a hash function, then define {z'} by information theory.
34 J. C. LAGARIAS
Computational information theory represents a The fundamental principle is that a definition of

blending of information theory and computational pseudorandom numbers is always relative to the use
complexity theory (Yao, 1988). The basic notion taken to which the pseudorandom numbers are to be put. This
from information theory is entropy, a measure of infor- use is to simulate the target probability distribution to
mation. Here, "information" equals "randomness." In- within a specified degree of approximation.
formation theory was developed by Shannon in the We measure the degree of approximation using sta-
1940s, and a relationship between computational com- tistical tests. Let P be the source probability distribu-
plexity and information theory was first observed by tion, G(P) the probability distribution generated by the
Kolmogorov. Both Shannon and Kolmogorov assumed pseudorandom number generator G and Q the target
in defining "amount of information" that unbounded probability distribution to be approximated by G(P). A
computational resources were available to extract that statistic is any deterministic function a(x) of a sample x
information from the data. In contrast, computational drawn from a distribution, and a statistical test a con-
information theory assumes that the amount of compu- sists of computation of a statistic a drawn from G(P).
tational resources is bounded by a polynomial in the We say that distributions Pi and P2 on the same sample
size of the input data. space 8 are c-indistinguishable using the statistic a
Here we indicate the ideas behind the theory, assum- on 8 provided that the expected values of a(x) drawn
ing some familiarity with the basic concepts of com- from Pi and P2, respectively, agree to within the toler-
putational complexity theory for uniform models of ance level E; that is,
computation (Turing machines) and for nonuniform
models of computation (Boolean circuits). Turing ma-
IE[u(x):xePi] -E[a(x):xeP2] .
chines and polynomial-time computability are dis- Given a collection 3 = {(ai, ei)} of statistical tests ai
cussed in Garey and Johnson (1979), and circuit with corresponding tolerance levels ci, we say that G
complexity is described in Savage (1976) and Karp and is a 3-pseudorandom number generator from source P
Lipton (1982). to target Q provided that G(P) is ei-indistinguishable
What is a pseudorandom number generator? The from Q for all the statistical tests a1 drawn from 3.
basic idea of pseudorandomness is to take a small To make these ideas clearer, we allow as source distri-
amount of randomness described by a seed drawn from bution the uniform distribution Uk on the set {O, 1}k of
a given source probability distribution and to produce binary strings of length k, and for target distributions
deterministically from the seed a larger number of we allow either Ue for some e, or else U([O, 119), which
apparently random bits that simulate a sample drawn is the distribution of Q independent draws from the
from another target probability distribution on a range uniform distribution on [0, 11.
space. That is, it is randomness-increasing. As an example, consider the linear congruential gen-
The amount of randomness in a probability distribu- eratorxn+1 axn + b(modM)withM= 2k - 1, where
tion is measured by its-binary entropy (or information), the seed xO is drawn from 0 c xo c 2 k - 1 and is
which for a discrete probability distribution P is viewed as an element of {O,i}k drawn with distribution
Uk. We use this generator to produce a vector of Q
H(P) = Pp(x) log2p(x),
x iterates, G(xo) = (xlIM, x21M, . .. , xlM) and view G(Uk)
as approximating the distribution U([O, ly). Various
where x runs over the atoms of P. In particular,
statistics that one might consider for y = (yl, . .. , y)
H(Uk) = k; in the sample space 8 = [0, 1Je are
that is, "k coin ffips give k bits of information." The

notion of randomness-increasing is impossible in classi- 1. A verage: aA(Y) = ei=1 yi
cal information theory because any deterministic map- 2. Maximum: aM(y) = max{yi: 1 c i c Q}
ping G applied to a discrete probability distribution P 3. Maximum run: aR(y) = max{j: yi < yi+i
never increases entropy; that is,
< . . . < yj+j
H(G(P)) < H(P). for some i}
However, this may be possible when computing power 4. mth autocorrelation: am(y) = 1 _ lf-n-YiYi+m.
is limited. Indeed, what may happen is that G(P) may
approximate a target distribution Q having a much It is now a purely mathematical problem to decide
higher entropy so well that, within the limits of com- what e-tolerance level can be achieved for each of these
puting power available, one cannot tell the distribu- statistics, viewing G(Uk) as approximating U([O, i]l).
tions G(P) and Q apart. If H(Q) is much larger than Various results for such statistics for linear congru-
H(P), then we can say that G is computationally ential generators can be found in Niederreiter (1978).
rand omness-increasing. The statististical tests applied to pseudorandom num-
ber generators for use in Monte Carlo simulations are pseudorandom bit generator (U-PRG) in which PSIZE
generally simple; for- cryptographic applications, in Condition 2 is replaced by PTIME, where a family
psuedorandom number generators must pass a far T = {Ckl of circuits is in PTIME if there is a polyno-
more stringent battery of statistical tests. mial-time algorithm for computing Ck, given lk as in-
Now we are ready to give a precise definition of a put. This condition may be rephrased as, "' passes all
secure pseudorandom bit generator. To achieve insensi- polynomial time statistical tests." Since PTIME c
tivity to the computational model, this definition, due PSIZE, an N-PRG is always a U-PRG.
to Yao (1982), is an asymptotic one. Instead of a single These definitions of N-PRG and U-PRG form the
generator G of fixed size, we consider an ensemble basis of a nice theory, whose purpose is to reduce
S = {Gk: k 2 1} of generators with the existence question for apparently very complicated
objects (N-PRGs) to the existence question for simpler,
Gk {O, 1 }k* {O, 1 }f(k)
more plausible objects. As an example, we have:
that have the property of being polynomial length-
THEOREM 1. If there exists an N-PRG with f(x) =
increasing; that is
k + 1, then there exists an N-PRG with f(k) = ktm + m
k + 1 c f(k) c ktm + m for each m 2 2. (Similarly for U-PRGs.)
for some fixed integer m > 1. We say that S is a That is, if there is an PRG that can inflate k random
nonuniform secure pseudorandom bit generator (N- bits to k + 1 pseudorandom bits, this is already suffi-
PRG) if: cient to construct a PRG that inflates k random bits
to any polynomial number of pseudorandom bits. A
1. there is a (deterministic) polynomial-time algo-
proof appears in Boppana and Hirschfeld (1989).
rithm that computes all Gk(x), given (k, x) as input
One disadvantage of these definitions (N-PRG, etc.)
with x E {O, 1}k, and
at present is that no such generators have been proved
2. for any PSIZE family Tf of Boolean circuit statis-
to exist. In fact, proving that they exist is at least as
tical tests, Gk(Uk) is polynomially indistinguish-
hard a problem as settling the famous P * NP ques-
able by Tf from Uf(k); that is, for all m > 1, the
tion of theoretical computer science. On the positive
distribution Gk(Uk) is k-m-indistinguishable by Tf
side, however, Theorem 4 below suggests that the RSA
from Uf(k) for all sufficiently large k.
generator may well be a U-PRG.
Condition 1 says that S is easy to compute. Condition Yao (1982) provided the basic insight that the nature
2 requires more explanation. A Boolean circuit Ce is of secure pseudorandomness (N-PRG) is the computa-
made of and, or and not gates having e inputs and tional unpredictability of successive bits produced by
computes a Boolean function Ce: {O, 1}t -- {O, 1}. A fami- a pseudorandom bit generator. The notion of computa-
ly ii = {Ck : k > 1} of Boolean circuits is of polynomial tional unpredictability is captured by the concept of a
size (abbreviated Tf E PSIZE) if there is some fixed m nonuniform next-bit test, as follows. Let S = {Gk} be a
such that the number of gates in Ck is ktm + m, for collection of mappings Gk : {O, l}k -- {o, l}f(k), where f(k)
all m. The Boolean function Ce(x) computed by Ce with is polynomial in k. A predicting collection is a doubly
e = f(k) provides a statistical test for comparing Gk(Uk) indexed family of Boolean circuits (e = {Ci,k: 1 C i <
and Ut. Then Condition 2 asserts that f(k) - 1, k > 1} computing functions 64ik {O, l}i -
{O, 1}, where the circuit Cik is viewed as predicting the
E[Ck(x): x E Gk(Uk)]- E[Ck(x) : x E Uf(k)] <k-m (i + 1)st bit of a sequence in {O, l}f(k) given the first i
bits. The predicting collection is polynomial size if
for all k > co(g, m). there is some m > 2 such that each CQk has at most
ktm + m gates. Let pEk denote the probability that the
The PSIZE measure of computational complexity is output of 64k applied to a bit sequence (xi, . . . , xi)
nonuniform in that no relation between the circuits Ck drawn from Gk(Uk) iS xi+l. We say that S passes the
for different k is required. In particular, the family Tf nonuniform next-bit test if for each polynomial-size
might be noncomputable (nonrecursive). Condition 2 predicting collection (e and each m > 2, for all suffi-
may be colloquially rephrased as, "' passes all polyno- ciently large k,
mial size statistical tests."
This definition implies that N-PRGs must have a one-
2 km+m
c 1 ~~1
way property: Given the output y = Gk(x) E {O, l}f(k),
one cannot recover the string x E {O, l}k using a PSIZE
holds for 1 c i c f(k) - 1. This test says that xi+1
family of circuits for a fraction of the inputs exceeding
cannot be accurately guessed using PSIZE circuit fami-
1I(km + m) for any fixed m. Otherwise, it would fail the
lies with xi, . .. , xi as input.
statistical test, "Is there a seed x such that Gk(x) = y?"
We also consider the weaker notion of uniform secure THEOREM 2 (Yao, 1982). The following are equivalent:
36 J. C. LAGARIAS
1. A collection S = {Gk} passes the nonuniform 2. There exists a nonuniform one-way collection {ff}
next-bit test. of functions.
2. A collection S = {Gkj passes all polynomial-size
The implication (1) > (2) is easy, because an
statistical tests.
N-PRG 9 = {Gk} is a one-way collection. To pro
In this theorem, there is no restriction on the sizes (1), it suffices by Theorem 1 to enlarge a set of erandom
of circuits needed to compute the functions Gk; for bits to obtain e + 1 pseudorandom bits, for an infinite
example, they could be of exponential size in k. The set of f. We may reduce to the case that the function
direction (2) > (1) is fairly easy to prove because the fk is length-preserving by padding its input bits if
next-bit test is essentially a family of PSIZE statistical necessary. The first key idea, due to Goldreich and
tests. The direction (1) > (2) is harder. A proof of this Levin (1989), is that the Boolean inner product
result is given in Boppana and Hirschfeld (1989). k
B(x, r) = xiri (mod 2)

COROLLARY 1. A polynomial-time computable collec- i=l
tion S = {Gk} that passes the nonuniform next-bit test

is a hard-core predicate for all length-preserving one-
is an N-PRG.
way families f: {O, 1 }k ->{o, 1 }k. That is, the probabflity
distribution of the (2k + 1)-bit strings (f(x), r, B(x, r))
Pseudorandom bit generators must have a very
for (x, r) drawn uniformly from {O, 1}2k is polynomially
strong one-way property: from the output of such a
indistinguishable from (f(x), r, 1), where 15 is a truly
generator, it must be computationally infeasible to
random bit drawn independent of x and r. We are now
extract any information at all about its input, for
essentially all inputs. One-way functions are required
done if f is a permutation, because then (f(x), r,,1)
would be uniformly distributed on 2k + 1 bits. The
only to satisfy a weaker one-way property: they are
difficult case occurs when f is not one-to-one, in which
easy to compute but difficult to invert on a nonnegligi-
case the set Sx = {y: f(y) = f(x)} may be large. In
ble fraction of their instances.
A formal definition of one-way function is as follows: this case, f(x) contains only about k - log2 ISxI bits o
information, and so the entropy of (f(x), r, /) may be
a nonuniform one-way collection {ff} consists of a family
smaller than 2k, and we gain no computational entropy.
of functions fk : {O, lIk v. {f, 1}p(), where k < p(k) <
The key idea of Impagliazzo, Levin and Luby (1989)
k'n + m for some fixed m, such that:
is to add in log2ISxl extra bits of information from x
1. the ensemble {ff} is uniformly polynomial-time by hashing x with a random hash function h: {O, l}k '
computable, and {O, l}Isxl. If one can do this, then the distribution of (f(x),
2. for each PSIZE circuit family {Ck} with Ck r, h, h(x), B(x, r)) will be polynomially indistinguishable
{O, l}p(k) -. {O, l}k, the functions {ff} are hard
from to(f(x), r, h, h(x), 15), and one extra pseudorandom
invert on a nonnqegligible fraction 1/(kc + bit
c) of
will have been created. The nonuniform model of
their inputs, where c is a fixed positive constant computation is used in determining the number log2 ISx
depending on {Ck}. That is, for x drawn from the of bits to hash in the hash function for x.
uniform distribution on {O, } k, Hastad (1990) proved that a uniform version of Theo-
rem 3 holds; that is, one may replace "nonuniform" by
P [fk(X) = fk (Ck (fk(x)))] "uniform" in (1) and (2).
is at most 1 - (kc + c)-1 for all large enough k Finally, secure PRGs can be used to construct secure
(depending on {Ck}). private-key cryptosystems and vice versa. A private-
key cryptosystem uses a (private) key exchanged by
The one-way property embodied in this definition is some secure means between two users, the possession
weaker than that required of an N-PRG in several of which enables them both to encrypt and decrypt
ways: a function fk can have an extremely nonuniform messages sent between them. Encrypted messages
probability distribution on its range {o, l}P(k), it is re- should be unreadable to anyone else. They should ap-
quired to be hard to invert only on a possibly small pear random to any unauthorized receiver, and ideally
fraction 1I(kc + c) of its inputs, and even on those no statistical information at all should be extractable
inputs where it is hard to invert, some partial informa- from the encrypted message. This is seldom achieved
tion about its inverse may be easy to obtain. in actual cryptosystems, and statistical methods are
Impagliazzo, Levin and Luby (1989) showed that one one of the staples of cryptanalysis. Absolute security
can bootstrap a nonuniform one-way collection into an can always be achieved by using a key of the same
N-PRG. length as the totality of encrypted messages to be
exchanged (the "one-time pad"). How much security
THEOREM 3. The following are equivalent:
is possible when the key is to be shorter than the
1. There exists a nonuniform polynomial pseudoran- messages to be encrypted? An analogy between
dom bit generator (N-PRG). pseudorandom number generation and private-key
cryptosystems is apparent: the key supplies absolutely E(y) ye:= x (mod N).
random bits from which pseudorandom bits (the en-
Suppose one is given a 0 - 1-valued oracle function
crypted message) are created. This analogy can be
O(x) for which
made precise. For a statement of the result, see Laga-
rias (1990, Section 6); some related results appear in O(x) - y (mod 2)
Impagliazzo and Luby (1989).
Rompel (1990) showed how to construct a secure holds for d + 1I(km + m)) N of all inputs x E [O, N - 11,
authentication scheme based on a one-way function. where k = log2 N. Then there is a probabilistic polyno-
Finally, we note that probabilistic algorithms play mial time algorithm that makes 0(kcom) oracle calls,
a fundamental role in the theoretical foundations of uses 0(kc0m) "random" bits, halts in 0(kc0m) steps and
modern cryptography, as indicated in Goldwasser and for any x finds y with probability > 1 - 1/N, where
Micali (1984). the probability is taken over all sequences of "random"
bits.
4. PERFORMANCE OF CERTAIN PSEUDORANDOM This algorithm involves several clever ideas. The
BIT GENERATORS first of these is due to Ben-Or, Chor and Shamir (1983),
We now turn to results on the performance of various using the fact that the encryption function E(*) re-
generators; see Knuth (1981) for general background. spects the group law on (ZINZ)*; that is,
Linear congruential pseudorandom number genera-
E(ay) = E(a)E(y).
tors have been extensively studied, and many of their
statistical properties carefully analyzed. With proper They suppose that one is given an oracle 0 that
choice of parameters, they can produce sequences {Xk} perfectly recognizes membership in an interval I =
such that {xk(mod M): 1 c k c n } is close to uniformly [-5N12, JNI2] for fixed positive a c 1/2; that is,
distributed on [1, Ml. Marsaglia (1968) was the first to
notice that these generators have certain undesirable o,x)=f', ify EI,
correlation properties among several consecutive iter-
, ifyCI.
ates (xn, Xn+l, ... ., xn+k). Extensive analysis is given in
Niederreiter (1978). These generators are apparently Define [xJN to be the least nonnegative residue (mod
unsuitable for cryptographic use (Frieze et al., 1988). N), and define absN(x) to be the distance of the least
The group-theoretic structure of certain other gen- (positive or negative) residue from 0 so that
erators can be used to prove "almost-everywhere
hardness" results for such generators, including the
following one. abs,v(x) = fxI O x<NI2,
U- [x1Nv N12 < x < N.
RSA BIT GENERATOR. Given k 2 2 and m 2 1, select
odd primes pi and P2 uniformly from the range 2k <Let parN(x) be the parity of absN(x). The clever idea is
pi < 2k+1, and form N = PlP2. Select e uniformly from
to generate "random" a, b and attempt to compute the
[1, N1 subject to (e, (N)p) = 1. Set gcd of [ayJv and [bylv using a modified binary gcd
algorithm due to Brent and Kung (1983) and Purdy
Xn+1 -(Xn)e (mod N),
(1983). This algorithm computes the greatest common
and let the bit Zn+i be given by divisor (r, s) by noting that it is equal to 2 (rl2, s12) if
r, s are both even, to (rl2, s) or (r, s12) if one is even and
Zn+1 Xn+1 (mod 2).
to '/2(r + s), '/2(r - s) otherwise. It is easy to check that
Then {Zn: 1 _ n ktm + m} are the pseudorandom bits max(QrI, Is I) is nonincreasing at each iteration and that
generated from the seed xO of length 2k bits. it decreases by at least a factor of 3/4 every two
iterations, so this algorithm halts in at most 2 log4/3 N
We consider the problem of predicting the next bit
iterations. The Ben-Or, Chor and Shamir algorithm
Zn+1 of the RSA generator, given {z1, . . ., Zn}. Alexi et draws pairs (a, b) of "random" numbers uniformly on
al. (1988) showed that getting any information at all
[0, N]. Such a pair will be called "good" (for the unknown
about Zn+l is as hard as the (apparently difficult) prob-
y) if [ayJN and [by] Nboth fall in I. For a good pair the
lem of factoring N. Their argument shows how an
oracle 0 can be used to determine correctly the parity
oracle that guesses the next bit with a probability
of [ay]N and [bylN, even though y is unknown. We use
slightly greater than 1/2 can be used in an explicit
the fact that if w E I is even, then w12 E I, whereas if
fashion to construct an inversion algorithm.
it is odd, then w12 E [N12 - I1/2,N12 + 111/21. Hence
THEOREM 4. Let (e, N) be a fixed pair, where for a good pair, one has (a/2)y E I if [ayJN is even and
(e, (N)) = 1 and N = PlP2 is odd, with the associated (a/2)y ? I if [ay]N is odd. This holds similarly for [bylv.
power generator Thus using the group law
38 J. C. LAGARIAS
REFERENCES
parN([ayIN) = 1 - O(E[(y)y])
ALEXI, W., CHOR, B., GOLDREICH, 0. and SCHNORR, C. P. (1988).
RSA and Rabin functions: Certain parts are as hard as the
whole. SIAMJ. Comput. 17 194-209.
= 1 -0 i(E ]x)
BEN-OR, M., CHOR, B. and SHAMIR, A. (1983). On the crypto-
graphic security of single RSA bits. In Proceedings of the
is calculable. Consequently, one can compute one step 15th Annual Symposium on Theory of Computing 421-430
ACM Press, New York.
of the modified binary gcd algorithm to ([ayJN, [byIN)
BLUM, L., BLUM, M. and SHUB, M. (1986). A simple unpredictable
and obtain new values ([a'yJN, [b'y]N), with (a', b') pseudorandom number operator. SIAMJ. Comput. 15 364-
known. Now (a', b') is still a good pair, so we can 383.
continue to follow the modified binary gcd algorithm BLUM, M. and MICALI, S. (1984). How to generate cryptographi-
perfectly and eventually determine [1yJN = gcd ([ay]N, cally strong versions of pseudorandom bits. SIAM J. Com-
put. 13 850-864.
[byIN), where 1 is known. We say that the algorithm
BOPPANA, R. and HIRSCHFELD, R. (1989). Pseudorandom genera-
succeeds if [1yIN = 1, because in that case y i'- 1 (mod tors and complexity classes. In Randomness and Computa-
N) is found. We verify success in this case by checking tion (S. Micali, ed.) 1-26. JAI Press, Greenwich, CT.
that E[1-1' = x. In all other cases, when (a, b) is not BRENT, R. P. and KUNG, H. T. (1983). Systolic VLSI arrays for
good or when it is good and the gcd is not 1, we carry linear time gcd computations. In VLSI 83, IFIP (F. Anceau
and D. J. Aas, eds.) 145-154. North-Holland, Amsterdam.
out the above procedure for 2 1og4,3 N iterations and
BRUDNO, A. A. (1982). Entropy and the complexity of the trajecto-
check the final iterate 1 to see whether E[1-11 = x. ries of a dynamical system. Trans. Moscow Math. Soc. 44
This procedure is a probabilistic polynomial-time al- 127-151.
gorithm. The pair (a, b) is good with probability at least CHAITIN, G. J. (1966). On the length of programs for computing
52. Also, because the distributions of [ayIN, E[bYN are finite binary sequences. J. Assoc. Comput. Mach. 13 547-
569.
uniform on [- 5N12, 5N121, and because the probability
COLLET, P. and ECKMANN, J.-P. (1980). Iterated Maps on the
that gcd(r, s) = 1 for such draws approaches ir2/6 as
Interval as Dynamical Systems. Birkhauser, Boston.
III0 oo, it exceeds a positive constant a for all values DEVROYE, L. (1986). Nonuniform Random Variate Generation.
of {III}. Thus the success probability for any draw (a, b) Springer, New York.
is 2 62a, and the expected time until the algorithm DIFFIE, W. and HELLMAN, M. (1976). New directions in cryptogra-
phy. IEEE Trans. Inform. Theory 22 644-654.
succeeds is polynomial in log N.
FEIGENBAUM, J. (1993). Probabilistic algorithms for defeating
One immediately sees that this algorithm still works adversaries. Statist. Sci. 8 26-30.
if the oracle 0I is not perfect but is accurate with FRIEZE, A., HASTAD, J., KANNAN, R., LAGARIAS, J. C. and
probability of error less than (4 log413N)-l. SHAMIR, A. (1988). Reconstructing truncated integer vari-
Subsequent work of several authors showed how an ables satisfying linear congruences. SIAM J. Comput. 17
262-280.
oracle having a 1/2 + 1/(log N)k advantage in guessing
GAREY, M. and JOHNSON, D. (1979). Computers and Intractability:
parity of the smallest RSA bit can be used to simulate
A Guide to the Theory of NP-Completeness. W. H. Freeman,
a much more accurate oracle and then to simulate an San Francisco.
oracle such as OI above with sufficient accuracy to GOLDREICH, O., KRAWCZYK, H. and LUBY, M. (1988). On the
obtain Theorem 4 (Alexi et al., 1988). existence of pseudorandom generators. In Proceedings of
the 29th Annual Symposium on Foundations of Computer
Theorem 4 shows that if the factoring problem is
Science 12-24. IEEE Computer Society Press, Los Alamitos,
difficult, then recovering any information at all about CA.
Zn+1 given {z1, ... ., Zn is hard. In particular, the bits GOLDREICH, 0. and LEVIN, L. (1989). A hard-core predicate for
{Zk} would then be computationally indistinguishable all one-way functions. In Proceedings of the 21st Annual
from i.i.d. coin ffips. Symposium on Theory of Computing 25-32. ACM Press,
New York.
One can extract several pseudorandom bits from
GOLDWASSER, S. and MICALI, S. (1984). Probabilistic encryption.
each of the iterates xn of the RSA generator. The J. Comput. System Sci. 28 270-299.
strongest result to date on this problem can be found HXSTAD, J. (1990). Pseudo-random generators under uniform
in Schrift and Shamir (1990). assumptions. In Proceedings of the 22nd Annual Symposium
The use of the group structure on (ZINZ)* to "ran- on Theory of Computing 395-404. ACM Press, New York.
IMPAGLIAZZO, R., LEVIN, L. and LUBY, M. (1989). Pseudorandom
domize" is an example of the concept of an instance-
number generation from one-way functions. In Proceedings
hiding scheme discussed by Feigenbaum in this issue. of the 21st Annual Symposium on Theory of Computing 12-
24. ACM Press, New York.
IMPAGLIAZZO, R. and LUBY, M. (1989). One-way functions are
essential for complexity-based cryptography. In Proceedings
ACKNOWLEDGMENT of the 30th Annual Symposium on Foundations of Computer
Science 230-235. IEEE Computer Society Press, Los Ala-
This article is an abridged and revised version of mitos, CA.
Lagarias (1990), published by the American Mathemat- KARP, R. and LIPTON, R. (1982). Turing machines that take
ical Society. advice. Enseign. Math. 28 191-209.
KNUTH, D. (1981). The Art of Computer Programming 2 Seminu- PURDY, G. B. (1983). A carry-free algorithm for finding the great-
merical Algorithms 2nd. ed. Addison-Wesley, Reading, MA. est common divisor of two integers. Comput. Math. Appl. 9
KNUTH, D. and YAO, A. C. (1976). The complexity of nonuniform 311-316.
random number generation. In Algorithms and Complexity REEDS, J. (1979). Cracking a multiplicative congruential en-
(J. F. Traub, ed.) 357-428. Academic, New York. cryption algorithm. In Information Linkage between Applied
KOLMOGOROV, A. N. (1965). Three approaches to the concept of Mathematics and Industry (P. C. Wong, ed.) 467-472. Aca-
"the amount of information." Problems Inform. Transmission demic, New York.
1 3-13. RIVEST, R., SHAMIR, A. and ADLEMAN, L. (1978). On digital
LAGARIAS, J. C. (1990). Pseudorandom number generators in signatures and public key cryptosystems. Comm. ACM 21
number theory and cryptography. In Cryptology and Compu- 120-126.
tational Number Theory (C. Pomerance, ed.) 115-143. Amer. ROMPEL, J. (1990). One-way functions are necessary and sufficient
Math. Soc., Providence, RI. for secure signatures. In Proceedings of the 22nd Annual
MARSAGLIA, G. (1968). Random numbers fall mainly in the planes. Symposium on Theory of Computing 387-394. ACM Press,
Proc. Nat. Acad. Sci. USA. 61 25-28. New York.
MARSAGLIA, G. (1985). A current view of random number genera- RUBINSTEIN, R. Y. (1982). Simulation and the Monte Carlo
tors. In Computer Science and Statistics: Sixteenth Sympo- Method. Wiley, New York.
sium on the Interface (L. Billard, ed.) 3-11. North-Holland, SAVAGE, J. (1976). The Complexity of Computing. Wiley, New
Amsterdam. York.
MARSAGLIA, G. and ZAMAN, A. (1991). A new class of random SCHRIFT, A. and SHAMIR, A. (1990). The discrete log is very
number generators. Ann. Appl. Probab. 1 462-480. discreet. In Proceedings of the 22nd Annual Symposium on
MARTIN-LOF, P. (1966). The definition of random sequences. In- Theory of Computing 405-415. ACM Press, New York.
form. and Control 9 619-682. SCHUSTER, H. G. (1988). Deterministic Chaos. VCH Verlags-
MEYER, A.R. and MCCREIGHT, E. M. (1971). Computationally gesellschaft GmbH, Weinheim, Germany.
complex and pseudorandom zero-one valued functions. In WOLFRAM, S. (1986). Random sequence generation by cellular
Theory of Machines and Computations (Z. Kohlavi and A. automata. Adv. in Appl. Math. 7 123-169.
Paz, eds.) 19-42. Academic, New York. YAO, A. (1982). Theory and applications of trapdoor functions
NIEDERREITER, H. (1978). Quasi-Monte Carlo methods and pseu- (extended abstract). In Proceedings of the 23rd Annual Sym-
dorandom numbers. Bull. Amer. Math. Soc. 84 957-1041. posium on Foundations of Computer Science 80-91. IEEE
ODLYZKO, A. M. (1985). Discrete logarithms in finite fields and Computer Society Press, Los Alamitos, CA.
their cryptographic significance. In Advances in Cryptology: YAO, A. (1988). Computational information theory. In Complexity
Eurocrypt '84. Lecture Notes in Comput. Sci. 209 224-316. in Information Theory (Y. Abu-Mostafa, ed.) 1-15. Springer,
Springer, New York. New York.

Institute of Mathematical Statistics Statistical Science

Uploaded by

Copyright:

Available Formats

You might also like

Institute of Mathematical Statistics Statistical Science

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Institute of Mathematical Statistics Statistical Science

Uploaded by

Copyright:

Available Formats

Pseudorandom Numbers

Author(s): Jeffrey C. Lagarias

Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and

Abstract. This article surveys the problem of generating pseudorandom

Key words and phrases: Computational information theory, cryptogra-

1. INTRODUCTION It is possible to simulate samples of any reasonable

Jeffrey C. Lagarias is a Distinguished Member of Tech- UNPREDICTABILITY PARADOX. If a deterministic

multiplicatively invertible elements (mod N). Then the Zn= H(xn).

(Xn+i ,Yn+i) (Yn, Xn + f (yn, Zn)), Zn = Xn * Yn.

One can generalize this construction to take x, y and

Computational information theory represents a The fundamental principle is that a definition of

that is, "k coin ffips give k bits of information." The

H(G(P)) < H(P). for some i}

B(x, r) = xiri (mod 2)

tion S = {Gk} that passes the nonuniform next-bit test

You might also like