Prime Numbers
9.2 Enhancements to modular arithmetic

Algorithm 9.2.5 (Montgomery product). This algorithm returns M(c, d) for integers 0 ≤ c, d < N, with N odd, and $R = 2^s > N$.

1. [Montgomery mod function M]
   M(c, d) {
      x = cd;
      z = y/R;                // With y obtained from x as in Theorem 9.2.1.
2. [Adjust result]
      if(z ≥ N) z = z − N;
      return z;
   }

The [Adjust result] step in this algorithm always works because cd < RN by hypothesis (as c, d < N < R), so a single conditional subtraction suffices. The only importance of the choice that R be a power of two is that fast arithmetic may be employed in the evaluation of z = y/R.

Algorithm 9.2.6 (Montgomery powering). This algorithm returns $x^y \bmod N$, for 0 ≤ x < N, y > 0, and R chosen as in Algorithm 9.2.5. We denote by $(y_0, \ldots, y_{D-1})$ the binary bits of y.

1. [Initialize]
   x = (xR) mod N;            // Via some divide/mod method.
   p = R mod N;               // Via some divide/mod method.
2. [Power ladder]
   for(D − 1 ≥ j ≥ 0) {
      p = M(p, p);            // Via Algorithm 9.2.5.
      if(y_j == 1) p = M(p, x);
   }                          // Now p ≡ x^y R (mod N), the Montgomery form of x^y.
3. [Final extraction of power]
   return M(p, 1);

Later in this chapter we shall have more to say about general power ladders; the ladder here is exhibited primarily to show how one may call the M() function to advantage.

The speed enhancements of an eventual powering routine all center on the M() function, in particular on the computation of z = y/R. We have noted that to get z, two multiplies are required, as in equation (9.7). But the story does not end here; in fact, the complexity of the Montgomery mod operation can be brought (asymptotically, for large N) down to that of one size-N multiply. (To state it another way, the composite operation M(x ∗ y) asymptotically requires two size-N multiplies, which can be thought of as one for the “∗” operation and one for the reduction M itself.) The details of the optimizations are intricate, involving various manifestations of the inner multiply loops of the M() function [Koç et al. 1996], [Bosselaers et al. 1994].
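Algorithms 9.2.5 and 9.2.6 can be sketched in Python as follows. This is a minimal illustration, not an optimized implementation. It assumes that the quantity y of Theorem 9.2.1 is the standard Montgomery expression y = x + N((xN') mod R) with N' = (−N^{-1}) mod R; the theorem itself lies outside this excerpt, so that formula, along with the names `montgomery_setup`, `N_prime`, and `xbar`, is an assumption made for the example.

```python
def montgomery_setup(N, s):
    """Return (R, N_prime) with R = 2**s > N and N_prime = (-N^{-1}) mod R.

    Helper name and signature are hypothetical; requires Python 3.8+
    for the modular inverse via pow(N, -1, R).
    """
    if N % 2 == 0 or (1 << s) <= N:
        raise ValueError("need N odd and R = 2**s > N")
    R = 1 << s
    N_prime = (-pow(N, -1, R)) % R
    return R, N_prime


def M(c, d, N, s, N_prime):
    """Montgomery product: c*d*R^{-1} mod N, for 0 <= c, d < N."""
    mask = (1 << s) - 1                  # reduction mod R is a bit mask
    x = c * d
    y = x + N * ((x * N_prime) & mask)   # assumed form of y; divisible by R
    z = y >> s                           # z = y/R is a right shift
    if z >= N:                           # [Adjust result]
        z -= N
    return z


def montgomery_pow(x, y, N, s):
    """Return x**y mod N for 0 <= x < N, y > 0, via the power ladder."""
    R, N_prime = montgomery_setup(N, s)
    xbar = (x * R) % N                   # Montgomery form of x
    p = R % N                            # Montgomery form of 1
    for j in reversed(range(y.bit_length())):   # bits y_{D-1}, ..., y_0
        p = M(p, p, N, s, N_prime)
        if (y >> j) & 1:
            p = M(p, xbar, N, s, N_prime)
    return M(p, 1, N, s, N_prime)        # strip the factor R
```

For instance, `montgomery_pow(3, 5, 7, 3)` uses R = 8 and agrees with Python's built-in `pow(3, 5, 7)`. Only the two lines of [Initialize] use a genuine mod operation; every step of the ladder uses a mask and a shift instead.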
The intricacy of these optimizations stems at least in part from a wasted operation in equation (9.7): the right-shifting effectively destroys some of the bits generated by the two multiplies. We shall see this shifting phenomenon again in the next section. In actual program implementations of Montgomery’s scheme, one can assign a word-size base $B = 2^b$, so that a convenient value $R = B^k$ may be used, whence the z value in Algorithm 9.2.5 can be obtained by looping k times and doing arithmetic (mod B) that is particularly convenient for the machine. Explicit word-oriented loops that achieve the optimal asymptotic complexity are laid out nicely in [Menezes et al. 1997].

9.2.2 Newton methods

We have seen in Section 9.1 that the div operation may be effected via additions, subtractions, and bit-shifts, although, as we have also seen, the algorithm can be bested by moving away from the binary paradigm into the domain of general base representations. Then we saw that the technique of Montgomery mod gives us an asymptotically efficient means for powering with respect to a fixed modulus. It is interesting, perhaps at first surprising, that general div and mod may be effected via multiplications alone; that is, even the small div operations attendant to optimized div methods are obviated, as are the special precomputations of the Montgomery method.

One approach to such a general div and mod scheme is to realize that the classical Newton method for solving equations may be applied to the problem of reciprocation. Let us start with reciprocation in the domain of real numbers. If one is to solve f(x) = 0, one proceeds with an (adroit) initial guess for x, call this guess $x_0$, and iterates

$$x_{n+1} = x_n - \frac{f(x_n)}{f'(x_n)}, \qquad (9.9)$$

for n = 0, 1, 2, ..., whence, if the initial guess $x_0$ is good enough, the sequence $(x_n)$ converges to the desired solution. So to reciprocate a real number a > 0, one is trying to solve 1/x − a = 0, so that an appropriate iteration would be

$$x_{n+1} = 2x_n - a x_n^2. \qquad (9.10)$$

Assuming that this Newton iteration for reciprocals is successful (see Exercise 9.13), we see that the real number 1/a can be obtained to arbitrary accuracy with multiplies alone.
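Iteration (9.10) can be tried directly. The sketch below is illustrative only; the function name `newton_reciprocal` and its parameters are assumptions made for the example. It relies on the identity $1 - a x_{n+1} = (1 - a x_n)^2$, which follows by substituting (9.10), so the relative error squares at every step and the iteration converges exactly when $0 < a x_0 < 2$.

```python
from fractions import Fraction


def newton_reciprocal(a, x0, steps):
    """Approximate 1/a by iterating x <- 2x - a*x*x (equation 9.10).

    Uses only multiplications and subtractions; x0 should satisfy
    0 < a*x0 < 2 for the iteration to converge.
    """
    x = x0
    for _ in range(steps):
        x = 2 * x - a * x * x
    return x


# Exact arithmetic makes the error squaring visible:
# with a = 3, x0 = 1/4 the initial error is 1 - a*x0 = 1/4.
a, x0 = Fraction(3), Fraction(1, 4)
errors = [1 - a * newton_reciprocal(a, x0, n) for n in range(4)]
# errors are 1/4, (1/4)**2, (1/4)**4, (1/4)**8
```

With floating-point inputs, `newton_reciprocal(3.0, 0.25, 6)` is already accurate to about machine precision, which illustrates why the number of correct digits roughly doubles per iteration.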
To calculate a general real division b/a, one simply multiplies b by the reciprocal 1/a, so that general division in real numbers can be done in this way via multiplies alone.

But can the Newton method be applied to the problem of integer div? Indeed it can, provided that we proceed with care in the definition of a generalized reciprocal for integer division. We first introduce a function B(N), defined for nonnegative integers N as the number of bits in the binary representation of N, except that B(0) = 0. Thus, B(1) = 1, B(2) = B(3) = 2, and so on. Next we establish a generalized reciprocal; instead of reciprocals 1/a for real a, we consider a generalized reciprocal of integer N as the integer part of an appropriate large power of 2 divided by N.

Definition 9.2.7. The generalized reciprocal R(N) is defined for positive integers N as $\lfloor 4^{B(N-1)}/N \rfloor$.
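The two definitions can be checked numerically with a short reference sketch. Note that it evaluates R(N) with Python's built-in integer division, so it is useful only for verifying values and is not the multiply-only method this section is working toward. The name `generalized_reciprocal` is an assumption made for the example.

```python
def B(N):
    """Number of bits in the binary representation of N, with B(0) = 0."""
    if N < 0:
        raise ValueError("B(N) is defined for nonnegative integers only")
    return N.bit_length()       # int.bit_length() already returns 0 for 0


def generalized_reciprocal(N):
    """R(N) = floor(4**B(N-1) / N) for positive integers N (Definition 9.2.7)."""
    if N <= 0:
        raise ValueError("R(N) is defined for positive integers only")
    return 4 ** B(N - 1) // N   # direct division, for reference only


# Small values: R(1) = 1, R(2) = 2, R(3) = 5, R(4) = 4, R(5) = 12.
values = [generalized_reciprocal(N) for N in range(1, 6)]
```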