Prime Numbers


9.7 Exercises

… where “do” simply means one repeats what is in the braces for some appropriate total iteration count. Note that the duplication of the y iteration is intentional! Show that this scheme formally generates the binomial series of $\sqrt{1+a}$ via the variable x. How many correct terms obtain after k iterations of the “do” loop? Next, calculate some real-valued square roots in this way, noting the important restriction that |a| cannot be too large, lest divergence occur (the formal correctness of the resulting series in powers of a does not, of course, automatically guarantee convergence). Then, consider this question: Can one use these ideas to create an algorithm for extracting integer square roots? This could be a replacement for Algorithm 9.2.11; the latter, we note, does involve explicit division. On this question it may be helpful to consider, for given n to be square-rooted, such relations as $\sqrt{n/4^q} = 2^{-q}\sqrt{n}$ or some similar construct, to keep convergence under control. Incidentally, it is of interest that the standard, real-domain Newton iteration for the inverse square root automatically has division-free form, yet we appear to be compelled to invoke such as the above coupled-variable expedient for a positive fractional power.

9.15. The Cullen numbers are $C_n = n2^n + 1$. Write a Montgomery powering program specifically tailored to find composite Cullen numbers, via relations such as $2^{C_n - 1} \equiv 1 \pmod{C_n}$. For example, within the powering algorithm for modulus $N = C_{245}$ you would be taking, say, $R = 2^{253}$ so that $R > N$. You could observe, for example, that $C_{141}$ is a base-2 pseudoprime in this way (it is actually a prime). A much larger example of a Cullen prime is Wilfrid Keller's $C_{18496}$. For more on Cullen numbers see Exercise 1.83.

9.16. Say that we wish to evaluate 1/3 using the Newton reciprocation of the text (among real numbers, so that the result will be 0.3333...). For initial guess $x_0 = 1/2$, prove that for positive n the n-th iterate is in fact $x_n = (2^{2^n} - 1)/(3 \cdot 2^{2^n})$, in this way revealing the quadratic-convergence property of a successful Newton loop. The fact that a closed-form expression can even be given for the Newton iterates is interesting in itself. Such closed forms are rare—can you find any others?

9.17. Work out the asymptotic complexity of Algorithm 9.2.8, in terms of a size-N multiply, and assuming all the shifting enhancements discussed in the text. Then give the asymptotic complexity of the composite operation $(xy) \bmod N$, for $0 \le x, y < N$, in the case that the generalized reciprocal is not yet known. What is the complexity for $(xy) \bmod N$ if the reciprocal is known? (This should be asymptotically the same as the composite Montgomery operation $(xy) \bmod N$ if one ignores the precomputations attendant to the latter.) Incidentally, in actual programs that invoke the Newton–Barrett ideas, one can place within the general mod routine a check to see whether the reciprocal is known, and if it is not, then the generalized reciprocal algorithm is invoked, and so on.
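As a quick illustration of the closing remark in the first exercise above, the real-domain Newton iteration for the inverse square root is indeed division-free; here is a minimal Python sketch (the iteration count and the guess y = 1 are ad hoc choices, adequate only when a is near 1, in keeping with the convergence warning above):

def inv_sqrt(a, iterations=6):
    # Newton iteration for f(y) = 1/y^2 - a, namely y <- y*(3 - a*y^2)/2.
    # The loop body involves no division: the halving is a constant
    # scaling, a mere shift in fixed-point arithmetic.
    y = 1.0  # crude initial guess; adequate when a is near 1
    for _ in range(iterations):
        y = y * (1.5 - 0.5 * a * y * y)
    return y

# A division-free square root then follows from sqrt(a) = a * (1/sqrt(a)):
# inv_sqrt(1.21) * 1.21 evaluates to approximately 1.1.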
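For Exercise 9.15, the tailored Montgomery powering is the whole point, but the target congruence itself is easy to sanity-check; this sketch substitutes Python's built-in modular powering for the Montgomery machinery, so it tests the relation without the special reduction the exercise asks for:

def cullen_fails_base2_test(n):
    # Cullen number C_n = n*2^n + 1; returns True when the Fermat
    # congruence 2^(C_n - 1) == 1 (mod C_n) fails, proving C_n composite.
    C = n * (1 << n) + 1
    return pow(2, C - 1, C) != 1

# cullen_fails_base2_test(141) returns False: C_141 passes as a base-2
# pseudoprime, consistent with the text (it is in fact prime).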
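The closed form claimed in Exercise 9.16 is easy to check with exact rationals before attempting the proof; this sketch runs the Newton reciprocation x <- x(2 - 3x) from x_0 = 1/2 and compares each iterate against the formula:

from fractions import Fraction

def check_closed_form(k):
    # Newton reciprocation for 1/3, verifying the claim
    # x_n = (2^(2^n) - 1) / (3 * 2^(2^n)) for n = 1..k.
    x = Fraction(1, 2)
    for n in range(1, k + 1):
        x = x * (2 - 3 * x)
        p = 1 << (1 << n)  # 2^(2^n)
        assert x == Fraction(p - 1, 3 * p)
    return x

# check_closed_form(4) returns Fraction(21845, 65536), which is
# (2^16 - 1)/(3 * 2^16) in lowest terms, about 0.33333.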
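In connection with Exercise 9.17's question about $(xy) \bmod N$ when the reciprocal is known, here is a minimal Barrett-style sketch. The shift conventions below are one common realization of the Newton–Barrett idea and are an assumption here, not a transcription of Algorithm 9.2.8: precompute mu = floor(4^k / N) once, after which each mod costs two multiplies plus at most two corrective subtractions.

def barrett_setup(N):
    # Precompute the generalized reciprocal mu = floor(4^k / N),
    # where k is the bit length of N.
    k = N.bit_length()
    return k, (1 << (2 * k)) // N

def mulmod(x, y, N, k, mu):
    # (x*y) mod N for 0 <= x, y < N, using the precomputed reciprocal.
    t = x * y
    q = ((t >> (k - 1)) * mu) >> (k + 1)  # quotient estimate, low by at most 2
    r = t - q * N
    while r >= N:                         # hence at most two corrections
        r -= N
    return r

k, mu = barrett_setup(1000003)
assert mulmod(123456, 654321, 1000003, k, mu) == (123456 * 654321) % 1000003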


9.18. Work out the asymptotic complexity of Algorithm 9.2.13 for given x, N in terms of a count of multiplications by integers c of various sizes. For example, assuming some grammar-school variant for multiplication, the bit-complexity of an operation yc would be $O(\ln y \ln c)$. Answer the interesting question: At what size of |c| (compared to $N = 2^q + c$) is the special-form reduction under discussion about as wasteful as some other prevailing schemes (such as long division, or the Newton–Barrett variants) for the mod operation? Incidentally, the most useful domain of applicability of the method is the case that c is one machine word in size.
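A minimal sketch of the kind of special-form reduction at issue, assuming $N = 2^q + c$ with c positive and small relative to $2^q$ (the one-machine-word case mentioned above); this illustrates the folding idea behind Algorithm 9.2.13 rather than transcribing it:

def mod_special_form(x, q, c):
    # Reduce x modulo N = 2^q + c, using 2^q == -c (mod N):
    # writing x = hi*2^q + lo gives x == lo - hi*c (mod N).
    N = (1 << q) + c
    while x.bit_length() > q:
        hi, lo = x >> q, x & ((1 << q) - 1)
        x = lo - hi * c   # each pass shrinks |x| by roughly q - lg(c) bits
    while x < 0:          # final corrections; only a few when c is small
        x += N
    while x >= N:
        x -= N
    return x

# Illustrative modulus N = 2^89 + 3:
q, c = 89, 3
x = 12345678901234567890123456789012345
assert mod_special_form(x, q, c) == x % ((1 << q) + c)

Note that the cost is dominated by the multiplications hi*c, which is exactly why a count of multiplications by c, as in the exercise, is the right complexity currency.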

9.19. Simplify Algorithm 9.4.2 in the case that one does not need an extended solution ax + by = g, but needs only the inverse itself. (That is, not all the machinations of the algorithm are really required.)
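One way the simplification can look (a sketch of the standard device, not necessarily the reduction of Algorithm 9.4.2 itself): carry only the single Bézout coefficient that the inverse requires, never forming the other one.

def inverse_mod(a, n):
    # Extended-Euclid loop maintaining the invariant a*t == r (mod n);
    # the coefficient of n in ax + by = g is never computed.
    t, t_next = 0, 1
    r, r_next = n, a % n
    while r_next != 0:
        q = r // r_next
        t, t_next = t_next, t - q * t_next
        r, r_next = r_next, r - q * r_next
    if r != 1:
        raise ValueError("a is not invertible modulo n")
    return t % n

assert (inverse_mod(17, 101) * 17) % 101 == 1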

9.20. Implement the recursive gcd Algorithm 9.4.6. (Or, implement the newer Algorithm 9.4.7; see the next paragraph.) Optimize the breakover parameters lim and prec for maximum speed in the calculation of rgcd(x, y) for each of x, y of various (approximately equal) sizes. You should be able to see rgcd() outperforming cgcd() in the region of, very roughly speaking, thousands of bits. (Note: Our display of Algorithm 9.4.6 is done in such a way that if the usual rules of global variables, such as matrix G, and variables local to procedures, such as the variables x, y in hgcd() and so on, are followed in the computer language, then transcription from our notation to a working program should not be too tedious.)

As for Algorithm 9.4.7, the reader should find that different optimization issues accrue. For example, we found that Algorithm 9.4.6 typically runs faster if there is no good way to perform such operations as trailing-zero detection and bit-shifting on huge numbers. On the other hand, when such expedients are efficient for the programmer, the newer Algorithm 9.4.7 should dominate.
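To make the "expedients" concrete, here is a minimal binary-gcd sketch (cgcd-flavored, not the recursive Algorithm 9.4.6) whose inner loop consists entirely of the trailing-zero detection and bit-shifting just discussed:

def binary_gcd(x, y):
    # gcd via shifts and subtractions only; (v & -v).bit_length() - 1
    # counts trailing zeros, the very operation whose cost on huge
    # integers is at issue above.
    if x == 0:
        return y
    if y == 0:
        return x
    common = ((x | y) & -(x | y)).bit_length() - 1  # shared power of 2
    x >>= (x & -x).bit_length() - 1                 # make x odd
    while y:
        y >>= (y & -y).bit_length() - 1             # make y odd
        if x > y:
            x, y = y, x
        y -= x                                      # difference is even
    return x << common

assert binary_gcd(2**40 * 35, 2**20 * 21) == 2**20 * 7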

9.21. Prove that Algorithm 9.2.10 works. Furthermore, work out a version that uses the shift-splitting idea embodied in the relation (9.12) and comments following. A good source for loop constructs in this regard is [Menezes et al. 1997].

Also, investigate the conjecture in [Oki 2003] that one may more tightly assign $s = 2B(N - 1)$ in Algorithm 9.2.10.
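For experimenting with such proofs and conjectures, a small integer-Newton sketch is convenient. Everything concrete below, the target floor(4^B / N) with B the bit length of N, the initial guess, and the terminal correction, is an assumption in the general spirit of Algorithm 9.2.10, not a transcription of it:

def generalized_reciprocal(N):
    # Integer Newton iteration s <- 2s - floor(N*s^2 / 4^B), climbing
    # toward floor(4^B / N) from below, then an exact-floor correction.
    B = N.bit_length()
    four_B = 1 << (2 * B)
    s = 1 << B                    # below the target, within a factor of 2
    while True:
        s_next = 2 * s - (N * s * s) // four_B
        if s_next <= s:           # progress stalled: nearly converged
            break
        s = s_next
    while (s + 1) * N <= four_B:  # correction loops run only O(1) times
        s += 1
    while s * N > four_B:
        s -= 1
    return s

assert generalized_reciprocal(1000003) == (1 << 40) // 1000003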

9.22. Prove that Algorithm 9.2.11 works. It helps to observe that x is definitely decreasing during the iteration loop. Then prove the $O(\ln \ln N)$ estimate for the number of steps to terminate. Then invoke the idea of changing precision at every step, to show that the bit-complexity of a properly tuned algorithm can be brought down to $O(\ln^2 N)$. Many of these ideas date back to the treatment in [Alt 1979].
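For reference while writing such a proof, here is the classical integer Newton square-root loop in minimal form (a standard rendering, assumed to match Algorithm 9.2.11 in spirit); the strict decrease of x noted above is exactly what the termination test watches for:

def isqrt(n):
    # Integer Newton iteration for floor(sqrt(n)), n >= 1. The initial
    # guess exceeds sqrt(n); x then decreases strictly until the first
    # non-decrease, at which point x = floor(sqrt(n)).
    x = 1 << ((n.bit_length() + 1) // 2)  # 2^ceil(b/2) >= sqrt(n)
    while True:
        y = (x + n // x) // 2
        if y >= x:
            return x
        x = y

assert isqrt(10**20) == 10**10
assert isqrt(10**20 - 1) == 10**10 - 1

Changing the working precision at each step, as the exercise suggests, amounts to performing the early divisions on suitably truncated operands, which is where the $O(\ln^2 N)$ bound comes from.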
