Prime Numbers
270 Chapter 6 SUBEXPONENTIAL FACTORING ALGORITHMS

For a discussion of the conjugate gradient method and the Lanczos method, see [Odlyzko 1985]. For a study of the Lanczos method in a theoretical setting see [Teitelbaum 1998]. For some practical improvements to the Lanczos method see [Montgomery 1995].

6.1.4 Large prime variations

As discussed above and in Section 3.2.5, sieving is a very cheap operation. Unlike trial division, which takes time proportional to the number of trial divisors, that is, one “unit” of time per prime used as a trial, sieving takes less and less time per prime sieved as the prime modulus grows. In fact, the time spent per sieve location, on average, for each prime modulus p is proportional to 1/p. However, there are hidden costs for increasing the list of primes p with which we sieve. One is that it is unlikely we can fit the entire sieve array into memory on a computer, so we segment it. If a prime p exceeds the length of this part of the sieve, we have to spend a unit of time per segment to see whether this prime will “hit” something or not. Thus, once the prime exceeds this threshold, the 1/p “philosophy” of the sieve is left behind, and we spend essentially the same time for each of these larger primes: sieving begins to resemble trial division.

Another hidden cost is perhaps not so hidden at all. When we turn to the linear-algebra stage of the algorithm, the matrix will be that much bigger if more primes are used. Suppose we are using 10^6 primes, a number that is not inconceivable for the sieving stage. The matrix, if encoded as a binary (0,1) matrix, would have 10^12 bits. Indeed, this would be a large object on which to carry out linear algebra! In fact, some of the linear algebra routines that will be used (see Section 6.1.3) involve a sparse encoding of the matrix, namely, a listing of where the 1’s appear, since almost all of the entries are 0’s.
Nevertheless, space for the matrix is a worrisome concern, and it puts a limit on the size of the smoothness bound we take. The analysis in Section 6.1.1 indicates a third reason for not taking the smoothness bound too large; namely, it would increase the number of reports necessary to find a linear dependency. Somehow, though, this reason is specious. If there is already a dependency within a subset of our data, having more data should not destroy it, but just make it a bit harder to find, perhaps. So we should not take an overshooting of the smoothness bound as a serious handicap if we can handle the two difficulties mentioned in the above paragraph.

In its simplest form, the large-prime variation allows us a cheap way to somewhat increase our smoothness bound, by giving us for free many numbers that are almost B-smooth, but fail because they have one larger prime factor. This larger prime could be taken in the interval (B, B^2]. It should be noted from the very start that allowing for numbers that are B-smooth except for having one prime factor in the interval (B, B^2] is not the same as taking B^2-smooth numbers. With B about L(n)^(1/2), as suggested in Section 6.1.1, a typical B^2-smooth number near n^(1/2+ε) in fact has many prime factors in the interval (B, B^2], not just one.
6.1 The quadratic sieve factorization method 271

Be that as it may, the large-prime variation does give us something that we did not have before. By allowing sieve reports of numbers that are close to the threshold for B-smoothness, but not quite there, we can discover numbers that have one slightly larger prime factor. In fact, if a number has all the primes up to B removed from its prime factorization, and the resulting number is smaller than B^2, but larger than 1, then the resulting number must be a prime. It is this idea that is at work in the large-prime variation. Our sieve is not perfect, since we are using approximate logarithms and perhaps not sieving with small primes (see Section 3.2.5), but the added grayness does not matter much in the mass of numbers being considered. Some numbers with a large prime factor that might have been reported are possibly passed over, and some numbers are reported that should not have been, but neither problem is of great consequence.

So if we can obtain these numbers with a large prime factor for free, how then can we process them in the linear-algebra stage of the algorithm? In fact, we should not view the numbers with a large prime as having longer exponent vectors, since this could cause our matrix to be too large. There is a very cheap way to process these large-prime reports. Simply sort them on the value of the large prime factor. If any large prime appears just once in the sorted list, then this number cannot possibly be used to make a square for us, so it is discarded. Say we have k reports with the same large prime: x_i^2 − n = y_i P, for i = 1, 2, ..., k. Then

    (x_1 x_i)^2 ≡ y_1 y_i P^2 (mod n), for i = 2, ..., k.

So when k ≥ 2 we can use the exponent vectors for the k − 1 numbers y_1 y_i, since the contribution of P^2 to the exponent vector, once it is reduced mod 2, is 0. That is, duplicate large primes lead to exponent vectors on the primes up to B.
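The sort-and-pair step above can be sketched as follows. The data layout assumed here (a tuple of x, the exponent dictionary of the B-smooth part y, and the large prime P) is an illustrative choice, and this is a sketch rather than a production QS routine.

```python
# Group partial relations on the large prime P; for each P seen at
# least twice, combine each later report with the first.  The factor
# P^2 in (x_1 x_i)^2 ≡ y_1 y_i P^2 (mod n) vanishes mod 2, so the
# combined exponent vector involves only primes up to B.
from collections import defaultdict

def combine_partials(partials):
    """partials: list of (x, y_exponents, P), where y_exponents maps
    each prime <= B to its exponent in y, and P is the single large
    prime in (B, B^2].  Returns (x1 * xi, merged exponents) pairs,
    with P itself discarded."""
    by_prime = defaultdict(list)
    for x, ys, P in partials:
        by_prime[P].append((x, ys))
    combined = []
    for P, reports in by_prime.items():
        if len(reports) < 2:
            continue  # a large prime seen once can never help form a square
        x1, y1 = reports[0]
        for xi, yi in reports[1:]:
            merged = dict(y1)
            for p, e in yi.items():
                merged[p] = merged.get(p, 0) + e
            combined.append((x1 * xi, merged))
    return combined

# Two reports sharing the large prime 101 combine into one usable
# relation; the lone report with large prime 103 is discarded.
partials = [(10, {2: 1, 3: 2}, 101), (14, {2: 3, 7: 1}, 101), (15, {5: 1}, 103)]
print(combine_partials(partials))  # [(140, {2: 4, 3: 2, 7: 1})]
```

In a real implementation one sorts the reports on P as the text describes; a hash grouping as above does the same job and keeps the sketch short.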
Since it is very fast to sort a list, the creation of these new exponent vectors is like a gift from heaven. There is one penalty to using these new exponent vectors, though it has not proved to be a big one. The exponent vector for a y_1 y_i as above is usually not as sparse as an exponent vector for a fully smooth report. Thus, the matrix techniques that take advantage of sparseness are somewhat hobbled. Again, this penalty is not severe, and every important implementation of the QS method uses the large-prime variation.

One might wonder how likely it is to have a pair of large primes matching. That is, when we sort our list, could it be that there are very few matches, and that almost everything is discarded because it appears just once? The birthday paradox from probability theory suggests that matches will not be uncommon once one has plenty of large-prime reports. In fact, the experience of factorers is that the importance of the large-prime reports is nil near the beginning of the run, because there are very few matches; but as the data set gets larger, the effect of the birthday paradox begins, and the matches for the large primes blossom into a significant source of rows for the final matrix.
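A rough simulation illustrates this birthday effect. Here the large primes are drawn uniformly from a fixed pool, which is a simplification (in practice the smaller large primes occur more often, which only increases the match rate), and the pool size and report counts are arbitrary illustrative choices. With m reports drawn from a pool of N values, the expected number of paired reports grows roughly like m^2/(2N), so matches are scarce early in the run and plentiful later.

```python
# Toy birthday-paradox simulation: draw "large primes" uniformly from
# a fixed pool and count reports that match an earlier report.  The
# pool size and report counts are arbitrary illustrative choices.
import random

def matched_reports(num_reports, pool_size, seed=1):
    rng = random.Random(seed)  # fixed seed for reproducibility
    seen = {}
    matches = 0
    for _ in range(num_reports):
        p = rng.randrange(pool_size)
        seen[p] = seen.get(p, 0) + 1
        if seen[p] >= 2:
            matches += 1  # this report pairs with an earlier one
    return matches

pool = 10**6
for m in (1000, 10000, 100000):
    print(m, matched_reports(m, pool))
```

Running this, a tenfold increase in the number of reports yields roughly a hundredfold increase in matches, which is exactly the "blossoming" described above.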