Prime Numbers

For a discussion of the conjugate gradient method and the Lanczos method, see [Odlyzko 1985]. For a study of the Lanczos method in a theoretical setting see [Teitelbaum 1998]. For some practical improvements to the Lanczos method see [Montgomery 1995].

6.1.4 Large prime variations

As discussed above and in Section 3.2.5, sieving is a very cheap operation. Unlike trial division, which takes time proportional to the number of trial divisors, that is, one "unit" of time per prime used as a trial, sieving takes less and less time per prime sieved as the prime modulus grows. In fact, the time spent per sieve location, on average, for each prime modulus p is proportional to 1/p. However, there are hidden costs for increasing the list of primes p with which we sieve. One is that it is unlikely we can fit the entire sieve array into memory on a computer, so we segment it. If a prime p exceeds the length of this part of the sieve, we have to spend a unit of time per segment to see whether this prime will "hit" something or not. Thus, once the prime exceeds this threshold, the 1/p "philosophy" of the sieve is left behind, and we spend essentially the same time for each of these larger primes: sieving begins to resemble trial division. Another hidden cost is perhaps not so hidden at all. When we turn to the linear-algebra stage of the algorithm, the matrix will be that much bigger if more primes are used. Suppose we are using 10^6 primes, a number that is not inconceivable for the sieving stage. The matrix, if encoded as a binary (0,1) matrix, would have 10^12 bits. Indeed, this would be a large object on which to carry out linear algebra! In fact, some of the linear-algebra routines that will be used (see Section 6.1.3) involve a sparse encoding of the matrix, namely, a listing of where the 1's appear, since almost all of the entries are 0's. Nevertheless, space for the matrix is a worrisome concern, and it puts a limit on the size of the smoothness bound we take.

The analysis in Section 6.1.1 indicates a third reason for not taking the smoothness bound too large; namely, it would increase the number of reports necessary to find a linear dependency. Somehow, though, this reason is specious. If there is already a dependency around with a subset of our data, having more data should not destroy this, but just make it a bit harder to find, perhaps. So we should not take an overshooting of the smoothness bound as a serious handicap if we can handle the two difficulties mentioned in the above paragraph.

In its simplest form, the large-prime variation allows us a cheap way to somewhat increase our smoothness bound, by giving us for free many numbers that are almost B-smooth, but fail because they have one larger prime factor. This larger prime could be taken in the interval (B, B^2]. It should be noted from the very start that allowing for numbers that are B-smooth except for having one prime factor in the interval (B, B^2] is not the same as taking B^2-smooth numbers. With B about L(n)^{1/2}, as suggested in Section 6.1.1, a typical B^2-smooth number near n^{1/2+ε} in fact has many prime factors in the interval (B, B^2], not just one.
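To make the bookkeeping concrete, here is a minimal sketch of how a sieve report might be classified under the large-prime variation. It is not code from the text: the function name and data layout are illustrative assumptions. It relies on the observation, explained in the next paragraph, that once every prime up to B has been divided out, a cofactor larger than 1 but at most B^2 must itself be prime.

```python
def classify_report(value, factor_base, B):
    """Divide out all primes <= B and decide whether the report is useful.

    Returns ("full", exponents)          if value is B-smooth,
            ("partial", exponents, P)    if exactly one prime P in (B, B^2] remains,
            None                         otherwise (discard).
    """
    exponents = {}
    cofactor = value
    for p in factor_base:                 # all primes up to B
        while cofactor % p == 0:
            cofactor //= p
            exponents[p] = exponents.get(p, 0) + 1

    if cofactor == 1:
        return ("full", exponents)        # fully B-smooth relation
    # Every prime <= B has been removed, so a cofactor in (1, B^2]
    # must itself be prime -- no primality test is needed.
    if cofactor <= B * B:
        return ("partial", exponents, cofactor)
    return None                           # more than one large prime left


# Example with B = 10 and factor base {2, 3, 5, 7}:
# 1248 = 2^5 * 3 * 13 has one large prime 13 in (10, 100], so it is a partial.
print(classify_report(1248, [2, 3, 5, 7], 10))
```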

Be that as it may, the large-prime variation does give us something that we did not have before. By allowing sieve reports of numbers that are close to the threshold for B-smoothness, but not quite there, we can discover numbers that have one slightly larger prime. In fact, if a number has all the primes up to B removed from its prime factorization, and the resulting number is smaller than B^2, but larger than 1, then the resulting number must be a prime. It is this idea that is at work in the large-prime variation. Our sieve is not perfect, since we are using approximate logarithms and perhaps not sieving with small primes (see Section 3.2.5), but the added grayness does not matter much in the mass of numbers being considered. Some numbers with a large prime factor that might have been reported are possibly passed over, and some numbers are reported that should not have been, but neither problem is of great consequence.

So if we can obtain these numbers with a large prime factor for free, how then can we process them in the linear-algebra stage of the algorithm? In fact, we should not view the numbers with a large prime as having longer exponent vectors, since this could cause our matrix to be too large. There is a very cheap way to process these large-prime reports. Simply sort them on the value of the large prime factor. If any large prime appears just once in the sorted list, then this number cannot possibly be used to make a square for us, so it is discarded. Say we have k reports with the same large prime: x_i^2 − n = y_i P, for i = 1, 2, ..., k. Then

(x_1 x_i)^2 ≡ y_1 y_i P^2 (mod n), for i = 2, ..., k.

So when k ≥ 2 we can use the exponent vectors for the k − 1 numbers y_1 y_i, since the contribution of P^2 to the exponent vector, once it is reduced mod 2, is 0. That is, duplicate large primes lead to exponent vectors on the primes up to B. Since it is very fast to sort a list, the creation of these new exponent vectors is like a gift from heaven.

There is one penalty to using these new exponent vectors, though it has not proved to be a big one. The exponent vector for a y_1 y_i as above is usually not as sparse as an exponent vector for a fully smooth report. Thus, the matrix techniques that take advantage of sparseness are somewhat hobbled. Again, this penalty is not severe, and every important implementation of the QS method uses the large-prime variation.

One might wonder how likely it is to have a pair of large primes matching. That is, when we sort our list, could it be that there are very few matches, and that almost everything is discarded because it appears just once? The birthday paradox from probability theory suggests that matches will not be uncommon, once one has plenty of large-prime reports. In fact, the experience that factorers have is that the importance of the large-prime reports is nil near the beginning of the run, because there are very few matches, but as the data set gets larger, the effect of the birthday paradox begins, and the matches for the large primes blossom and become a significant source of rows for the final matrix.
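As a rough illustration of the sorting-and-pairing step just described, the following sketch groups partial relations by their large prime P, discards singletons, and pairs each remaining report with the first one sharing its P, reducing the combined exponent vector mod 2 so that P^2 drops out. The names and the data layout (each partial stored as (x, exponent_vector, P)) are assumptions for the example, not the book's notation.

```python
from collections import defaultdict

def combine_partials(partials, n):
    """Turn partial relations sharing a large prime into full matrix rows.

    partials: list of (x, exponents, P) with x^2 - n = y * P and
              exponents the factorization of y over the primes up to B.
    Returns a list of (x_1 * x_i mod n, combined exponent vector mod 2),
    one row for each duplicate of a given large prime P.
    """
    by_prime = defaultdict(list)
    for x, vec, P in partials:
        by_prime[P].append((x, vec))      # grouping replaces an explicit sort

    rows = []
    for P, group in by_prime.items():
        if len(group) < 2:                # a lone large prime can never help
            continue
        x1, v1 = group[0]
        for xi, vi in group[1:]:
            # (x1*xi)^2 = y1*yi*P^2 (mod n); P^2 vanishes mod 2, so the row
            # involves only the primes up to B.
            combined = {p: (v1.get(p, 0) + vi.get(p, 0)) % 2
                        for p in set(v1) | set(vi)}
            rows.append(((x1 * xi) % n, combined))
    return rows
```

As the text notes, these combined rows are typically denser than rows coming from fully smooth reports, since two exponent vectors are merged into one.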
