The.Algorithm.Design.Manual.Springer-Verlag.1998
The.Algorithm.Design.Manual.Springer-Verlag.1998 The.Algorithm.Design.Manual.Springer-Verlag.1998
Random Number Generation an Internet password scheme was recently invalidated with the discovery that its keys were produced using a random number generator of such small period that brute-force search quickly exhausted all possible passwords. The accuracy of simulations is regularly compromised or invalidated by poor random number generation. Bottom line: This is an area where people shouldn't mess around, but they do. Issues to think about include: ● Should my program use the same ``random'' numbers each time it runs? - A poker game that deals you the exact same hand each time you play quickly loses interest. One common solution is to use the lower-order bits of the machine clock as a seed or source for random numbers, so that each time the program runs it does something different. Such methods are perhaps adequate for games, but not for serious simulations. There are liable to be periodicities in the distribution of random numbers whenever calls are made in a loop. Also, debugging is seriously complicated by the fact that the results are not repeatable. If the program crashes, you cannot go back and discover why. One possible compromise is to use a deterministic pseudorandom number generator, but write the current seed to a file between runs. During debugging, this file can be overwritten with a fixed initial value or seed. ● How good is my compiler's built-in random number generator? - If you need uniformly generated random numbers, and you are not going to bet the farm on the accuracy of your simulation, my recommendation is simply to use what your compiler provides. Your best opportunity to mess it up is with a bad choice of starting seed, so read the manual for its recommendations. If you are going to bet the farm on the quality of your simulation, you had better test your random number generator. Be aware that it is very difficult to eyeball the results and decide whether the output is really random. This is because people have very skewed ideas of how random sources should behave and often see patterns that don't really exist. To evaluate a random number generator, several different tests should be used and the statistical significance of the results established. Such tests are implemented in plab and DIEHARD (discussed below) and explained in [Knu81]. ● What if I have to implement my own random number generator? - The algorithm of choice is the linear congruential generator. It is fast, simple, and (if instantiated with the right constants) gives reasonable pseudorandom numbers. The nth random number is a function of the (n-1)st random number: In theory, linear congruential generators work the same way roulette wheels do. The long path of the ball around and around the wheel (captured by ) ends in one of a relatively small number of bins, the choice of which is extremely sensitive to the length of the path (captured by the truncation of the ). A substantial theory has been developed to select the constants a, c, m, and . The period length file:///E|/BOOK/BOOK4/NODE142.HTM (2 of 5) [19/1/2003 1:30:27]
Random Number Generation is largely a function of the modulus m, which is typically constrained by the word length of the machine. A presumably safe choice for a 32-bit machine would be , a = 1366, c=150889, and m=714025. Don't get creative and change any of the constants, unless you use the theory or run tests to determine the quality of the resulting sequence. ● What if I don't want large, uniformly distributed random integers? - The linear congruential generator produces a uniformly distributed sequence of large integers, which can be scaled to produce other uniform distributions. For uniformly distributed real numbers between 0 and 1, use . Note that 1 cannot be realized this way, although 0 can. If you want uniformly distributed integers between l and h, use . Generating random numbers according to a given nonuniform distribution can be a tricky business. The most reliable way to do this correctly is the acceptance-rejection method. Suppose we bound the desired probability distribution function or geometric region to sample from a box and then select a random point p from the box. This point can be selected by p by generating the x and y coordinates independently, at random. If this p is within the distribution, or region, we can return p as selected at random. If p is in the portion of the box outside the region of interest, we throw it away and repeat with another random point. Essentially, we throw darts at random and report those that hit the target. This method is correct, but it can be slow. If the volume of the region of interest is small relative to that of the box, most of our darts will miss the target. Efficient generators for Gaussian and other special distributions are described in the references and implementations below. Be cautious about inventing your own technique, however, since it can be tricky to obtain the right probability distribution. For example, an incorrect way to select points uniformly from a circle of radius r would be to generate polar coordinates and select an angle from 0 to and a displacement between 0 and r, both uniformly at random. In such a scheme, half the generated points will lie within r/2 of the radius, when only one-fourth of them should be! This is a substantial enough difference to seriously skew the results, while being subtle enough that it might escape detection. ● How long should I run my Monte Carlo simulation to get the best results? - It makes sense that the longer you run a simulation, the more accurately the results will approximate the limiting distribution, thus increasing accuracy. However, this is only true until you exceed the period, or cycle length, of your random number generator. At that point, your sequence of random numbers repeats itself, and further runs generate no additional information. Check the period length of your generator before you jack up the length of your simulation. You are liable to be very surprised by what you learn. Implementations: An excellent WWW page on random number generation and stochastic simulation is available at http://random.mat.sbg.ac.at/others/. It includes pointers to papers and literally dozens of implementations of random number generators. From there are accessible pLab [Lee94] and DIEHARD, systems for testing the quality of random number generators. file:///E|/BOOK/BOOK4/NODE142.HTM (3 of 5) [19/1/2003 1:30:27]
- Page 349 and 350: Dictionaries use a function that ma
- Page 351 and 352: Dictionaries Implementation-oriente
- Page 353 and 354: Priority Queues ● Besides access
- Page 355 and 356: Priority Queues Fibonacci heaps [FT
- Page 357 and 358: Suffix Trees and Arrays Figure: A t
- Page 359 and 360: Suffix Trees and Arrays [GBY91]. Se
- Page 361 and 362: Graph Data Structures algorithms).
- Page 363 and 364: Graph Data Structures including the
- Page 365 and 366: Set Data Structures Next: Kd-Trees
- Page 367 and 368: Set Data Structures parent pointers
- Page 369 and 370: Kd-Trees Next: Numerical Problems U
- Page 371 and 372: Kd-Trees about p. Say we are lookin
- Page 373 and 374: Numerical Problems Next: Solving Li
- Page 375 and 376: Numerical Problems Mon Jun 2 23:33:
- Page 377 and 378: Solving Linear Equations algorithm
- Page 379 and 380: Solving Linear Equations Matrix inv
- Page 381 and 382: Bandwidth Reduction images near eac
- Page 383 and 384: Matrix Multiplication Next: Determi
- Page 385 and 386: Matrix Multiplication The linear al
- Page 387 and 388: Determinants and Permanents Next: C
- Page 389 and 390: Determinants and Permanents exposit
- Page 391 and 392: Constrained and Unconstrained Optim
- Page 393 and 394: Constrained and Unconstrained Optim
- Page 395 and 396: Linear Programming variable assignm
- Page 397 and 398: Linear Programming The book [MW93]
- Page 399: Random Number Generation Next: Fact
- Page 403 and 404: Random Number Generation Related Pr
- Page 405 and 406: Factoring and Primality Testing The
- Page 407 and 408: Factoring and Primality Testing Alg
- Page 409 and 410: Arbitrary-Precision Arithmetic If y
- Page 411 and 412: Arbitrary-Precision Arithmetic PARI
- Page 413 and 414: Knapsack Problem Next: Discrete Fou
- Page 415 and 416: Knapsack Problem is a subset of S'
- Page 417 and 418: Discrete Fourier Transform Next: Co
- Page 419 and 420: Discrete Fourier Transform an algor
- Page 421 and 422: Combinatorial Problems Next: Sortin
- Page 423 and 424: Sorting Next: Searching Up: Combina
- Page 425 and 426: Sorting The simplest approach to ex
- Page 427 and 428: Sorting operations, implying an sor
- Page 429 and 430: Searching large performance improve
- Page 431 and 432: Searching Notes: Mehlhorn and Tsaka
- Page 433 and 434: Median and Selection followed by fi
- Page 435 and 436: Generating Permutations Next: Gener
- Page 437 and 438: Generating Permutations The rank/un
- Page 439 and 440: Generating Permutations generating
- Page 441 and 442: Generating Subsets look right when
- Page 443 and 444: Generating Subsets above for detail
- Page 445 and 446: Generating Partitions Although the
- Page 447 and 448: Generating Partitions Two related c
- Page 449 and 450: Generating Graphs generate: ● Do
Random Number Generation<br />
an Internet password scheme was recently invalidated with the discovery that its keys were produced<br />
using a random number generator of such small period that brute-force search quickly exhausted all<br />
possible passwords. <strong>The</strong> accuracy of simulations is regularly compromised or invalidated by poor<br />
random number generation. Bottom line: This is an area where people shouldn't mess around, but they<br />
do. Issues to think about include:<br />
● Should my program use the same ``random'' numbers each time it runs? - A poker game that<br />
deals you the exact same hand each time you play quickly loses interest. One common solution is<br />
to use the lower-order bits of the machine clock as a seed or source for random numbers, so that<br />
each time the program runs it does something different.<br />
Such methods are perhaps adequate for games, but not for serious simulations. <strong>The</strong>re are liable to<br />
be periodicities in the distribution of random numbers whenever calls are made in a loop. Also,<br />
debugging is seriously complicated by the fact that the results are not repeatable. If the program<br />
crashes, you cannot go back and discover why. One possible compromise is to use a deterministic<br />
pseudorandom number generator, but write the current seed to a file between runs. During<br />
debugging, this file can be overwritten with a fixed initial value or seed.<br />
● How good is my compiler's built-in random number generator? - If you need uniformly generated<br />
random numbers, and you are not going to bet the farm on the accuracy of your simulation, my<br />
recommendation is simply to use what your compiler provides. Your best opportunity to mess it<br />
up is with a bad choice of starting seed, so read the manual for its recommendations.<br />
If you are going to bet the farm on the quality of your simulation, you had better test your random<br />
number generator. Be aware that it is very difficult to eyeball the results and decide whether the<br />
output is really random. This is because people have very skewed ideas of how random sources<br />
should behave and often see patterns that don't really exist. To evaluate a random number<br />
generator, several different tests should be used and the statistical significance of the results<br />
established. Such tests are implemented in plab and DIEHARD (discussed below) and explained<br />
in [Knu81].<br />
● What if I have to implement my own random number generator? - <strong>The</strong> algorithm of choice is the<br />
linear congruential generator. It is fast, simple, and (if instantiated with the right constants)<br />
gives reasonable pseudorandom numbers. <strong>The</strong> nth random number is a function of the (n-1)st<br />
random number:<br />
In theory, linear congruential generators work the same way roulette wheels do. <strong>The</strong> long path of<br />
the ball around and around the wheel (captured by ) ends in one of a relatively small<br />
number of bins, the choice of which is extremely sensitive to the length of the path (captured by<br />
the truncation of the ).<br />
A substantial theory has been developed to select the constants a, c, m, and . <strong>The</strong> period length<br />
file:///E|/BOOK/BOOK4/NODE142.HTM (2 of 5) [19/1/2003 1:30:27]