Prime Numbers

9.5 Large-integer multiplication

whence a final set of row FFTs gives

$$T = \begin{pmatrix} X_0 & X_2 \\ X_1 & X_3 \end{pmatrix},$$

where $X_k = \sum_j x_j g^{-jk}$ are the usual DFT components, and we note that the final form here for $T$ is again in columnwise order.

Incidentally, if one wonders how this differs from a two-dimensional FFT such as an FFT in the field of image processing, the answer is simple: this four-step (or six-step, if pre- and post-transposes are invoked to start with and end up with standard row-ordering) format involves that internal "twist," or phase factor, in step [Transpose and twist the matrix]. A two-dimensional FFT does not involve the phase-factor twisting step; instead, one simply takes FFTs of all rows in place, then all columns in place.
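The decomposition described above can be made concrete in a few lines. The following is a minimal NumPy sketch of a four-step FFT in this spirit; the index conventions ($j = j_1 n_2 + j_2$, $k = k_1 + k_2 n_1$) are one particular choice and are not necessarily those of Algorithm 9.5.7:

```python
import numpy as np

def four_step_fft(x, n1, n2):
    """Four-step FFT sketch for length N = n1*n2.

    Index conventions (an illustrative choice): input index j = j1*n2 + j2,
    output index k = k1 + k2*n1; the returned matrix T satisfies
    T[k2, k1] = X[k1 + k2*n1], i.e. the transform is in columnwise order.
    """
    N = n1 * n2
    A = x.reshape(n1, n2)            # A[j1, j2] = x[j1*n2 + j2]
    B = np.fft.fft(A, axis=0)        # step 1: length-n1 FFTs down the columns
    w = np.exp(-2j * np.pi / N)
    # step 2: the internal "twist" by the phase factor w^(k1*j2)
    B = B * w ** np.outer(np.arange(n1), np.arange(n2))
    C = B.T                          # step 3: transpose
    T = np.fft.fft(C, axis=0)        # step 4: length-n2 FFTs down the columns
    return T                         # columnwise-ordered transform
```

Omitting the twist in step 2 would yield exactly the two-dimensional FFT mentioned above, which is the whole point of the contrast drawn in the text.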

Of course, with respect to repeated applications of Algorithm 9.5.7 the efficient option is simply this: Always store signals and their transforms in the columnwise format. Furthermore, one can establish a rule that for signal lengths $N = 2^n$, we factor into matrix dimensions as $W = H = \sqrt{N} = 2^{n/2}$ for $n$ even, but $W = 2H = 2^{(n+1)/2}$ for $n$ odd. Then the matrix is square or almost square. Furthermore, for the inverse FFT, in which everything proceeds as above but with FFT$^{-1}$ calls and the twisting phase uses $g^{+jk}$, with a final division by $N$, one can conveniently assume that the width and height for this inverse case satisfy $W' = H'$ or $H' = 2W'$, so that in such problems as convolution the output matrix of the forward FFT is what is expected for the inverse FFT, even when said matrix is nonsquare. Actually, for convolutions per se there are other interesting optimizations due to J. Papadopoulos, such as the use of DIF/DIT frameworks and bit-scrambled powers of $g$; and a very fast large-FFT implementation of Mayer, in which one never transposes, using instead a fast, memory-efficient columnwise FFT stage; see [Crandall et al. 1999].
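The dimension rule just stated is easy to encode. A small helper (the function name is ours, for illustration):

```python
def fft_dims(n):
    """Matrix dimensions (W, H) for a length-2^n signal, per the rule
    in the text: W = H = 2^(n/2) for n even, W = 2H = 2^((n+1)/2) for n odd,
    so the matrix is square or almost square."""
    H = 1 << (n // 2)        # 2^(floor(n/2))
    W = 1 << ((n + 1) // 2)  # 2^(ceil(n/2))
    return W, H
```

For example, `fft_dims(20)` gives a square 1024 x 1024 factorization, while `fft_dims(21)` gives the almost-square 2048 x 1024.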

One interesting byproduct of this approach is that one is moved to study the basic problem of matrix transposition. The treatment in [Bailey 1989] gives an interesting small example of the algorithm in [Fraser 1976] for efficient transposition of a stored matrix, while the paper [Van Loan 1992, p. 138] indicates how active, really, is the ongoing study of fast transpose. Such an algorithm has applications in other aspects of large-integer arithmetic; see, for example, Section 9.5.7.
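Fraser's algorithm addresses transposition of matrices too large for fast memory; as a much simpler illustration of why access order matters at all, here is a cache-blocked transpose of a flat-stored matrix (this sketch is ours, not the cited algorithm):

```python
def blocked_transpose(a, rows, cols, bs=32):
    """Transpose a rows-by-cols matrix stored row-major in the flat list a,
    visiting it in bs-by-bs tiles so that reads and writes both stay within
    small blocks (the cache-friendliness motivating fast-transpose work)."""
    t = [0] * (rows * cols)
    for i0 in range(0, rows, bs):
        for j0 in range(0, cols, bs):
            for i in range(i0, min(i0 + bs, rows)):
                for j in range(j0, min(j0 + bs, cols)):
                    t[j * rows + i] = a[i * cols + j]
    return t
```

A naive double loop computes the same permutation; the tiling changes only the order of memory accesses, which is precisely what matters for large matrices.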

We next turn to a development that has enjoyed accelerated importance since its discovery by pioneers A. Dutt and V. Rokhlin. A core result in their seminal paper [Dutt and Rokhlin 1993] involves a length-$D$, nonuniform FFT of the type

$$X_k = \sum_{j=0}^{D-1} x_j e^{-2\pi i k \omega_j / D}, \qquad (9.23)$$

where all we know a priori about the (possibly nonuniform) frequencies $\omega_j$ is that they all lie in $[0, D)$. This form for $X_k$ is to be compared with the
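The sum (9.23) can of course be evaluated directly in $O(D^2)$ operations; the Dutt–Rokhlin contribution is to approximate it much faster. A direct-evaluation sketch, useful as a correctness reference:

```python
import cmath

def nonuniform_dft(x, omega):
    """Direct O(D^2) evaluation of (9.23):
    X_k = sum_j x_j * exp(-2*pi*i*k*omega_j/D), with each omega_j in [0, D).
    When omega_j = j this reduces to the ordinary length-D DFT."""
    D = len(x)
    return [sum(xj * cmath.exp(-2j * cmath.pi * k * wj / D)
                for xj, wj in zip(x, omega))
            for k in range(D)]
```

Note that for the uniform choice $\omega_j = j$ this is exactly the usual DFT, which is the comparison the text is about to draw.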
