The Fast Fourier Transform; Short-time Fourier transform

The Fast Fourier Transform; Short-time Fourier transform The Fast Fourier Transform; Short-time Fourier transform

ee.columbia.edu
from ee.columbia.edu More from this publisher

ELEN E4810: Digital Signal Processing<br />

Topic 10:<br />

<strong>The</strong> <strong>Fast</strong> <strong>Fourier</strong> <strong>Transform</strong><br />

1. Calculation of the DFT<br />

2. <strong>The</strong> <strong>Fast</strong> <strong>Fourier</strong> <strong>Transform</strong> algorithm<br />

3. <strong>Short</strong>-Time <strong>Fourier</strong> <strong>Transform</strong><br />

Dan Ellis 2012-11-28<br />

1


1. Calculation of the DFT<br />

Filter design so far has been oriented to<br />

<strong>time</strong>-domain processing - cheaper!<br />

But: frequency-domain processing<br />

makes some problems very simple:<br />

X[k] <strong>Fourier</strong> domain Y[k]<br />

x[n] DFT IDFT y[n]<br />

processing<br />

use all of x[n], or use short-<strong>time</strong> windows<br />

Need an efficient way to calculate DFT<br />

Dan Ellis 2012-11-28<br />

2


Computational Complexity<br />

N1<br />

<br />

n=0<br />

X[k] = x[n]W N<br />

N complex multiplies<br />

+ N-1 complex adds per point (k)<br />

× N points (k = 0.. N-1)<br />

cpx mult: (a+jb)(c+jd) = ac - bd + j(ad + bc)<br />

= 4 real mults + 2 real adds<br />

cpx add = 2 real adds<br />

N points: 4N 2 real mults, 4N 2-2N real adds<br />

Dan Ellis 2012-11-28<br />

4<br />

kn


Goertzel’s Algorithm<br />

x e [n]<br />

x e [N] = 0<br />

Now:<br />

i.e.<br />

where<br />

+<br />

W N -k<br />

N1<br />

<br />

=0<br />

kN<br />

X[ k]<br />

= x[ ]WN<br />

= W N<br />

Dan Ellis 2012-11-28<br />

5<br />

k<br />

x[ ]<br />

WN X[ k]<br />

= y [ k N]<br />

y [ k n]<br />

= x [ e n]<br />

hk [n]<br />

z -1<br />

<br />

y k [n]<br />

y k [-1] = 0<br />

y k [N] = X[k]<br />

( )<br />

k N<br />

x e [n] = {<br />

looks like a<br />

convolution<br />

x[n] 0 ≤ n < N<br />

0 n = N<br />

h k [n] = { W N -kn n ≥ 0<br />

0 n < 0


H z<br />

( ) =<br />

Goertzel’s Algorithm<br />

Separate ‘filters’ for each X[k]<br />

can calculate for just a few values of k<br />

No large buffer, no coefficient table<br />

Same complexity for full X[k]<br />

(4N 2 mults, 4N 2 - 2N adds)<br />

but: can halve multiplies by making the<br />

denominator real:<br />

1<br />

1 W N<br />

k z 1 =<br />

1 W N<br />

k z 1<br />

1 2cos 2k<br />

N z1 + z 2<br />

evaluate only<br />

for last step<br />

Dan Ellis 2012-11-28<br />

6<br />

2 real mults<br />

per step


2. <strong>Fast</strong> <strong>Fourier</strong> <strong>Transform</strong> FFT<br />

Reduce complexity of DFT<br />

from O(N 2) to O(N·logN)<br />

grows more slowly with larger N<br />

Works by decomposing large DFT into<br />

several stages of smaller DFTs<br />

Often provided as a highly optimized<br />

library<br />

Dan Ellis 2012-11-28<br />

7


Decimation in Time (DIT) FFT<br />

Can rearrange DFT formula in 2 halves:<br />

N1<br />

<br />

X[ k]<br />

= x n<br />

k = 0.. N-1<br />

Arrange<br />

terms<br />

in pairs...<br />

n=0<br />

N<br />

2 1<br />

<br />

[ ] W N<br />

= x 2m<br />

m=0<br />

N<br />

2 1<br />

nk<br />

( [ ]<br />

2mk<br />

WN + x[ 2m +1]<br />

( 2m+1)k<br />

W ) N<br />

N<br />

21 <br />

Group terms<br />

from each<br />

pair<br />

= x[ 2m]<br />

mk k<br />

WN + WN x[ 2m +1]<br />

mk<br />

WN<br />

2<br />

2<br />

m=0<br />

m=0<br />

X0 [ N/2 ] X1 [ N/2 ]<br />

N/2 pt DFT of x for even n N/2 pt DFT of x for odd n<br />

Dan Ellis 2012-11-28<br />

8


One-Stage DIT Flowgraph<br />

X[ k]<br />

= X [ 0 k ] N + WN Even<br />

points<br />

from<br />

x[n]<br />

Odd<br />

points<br />

from<br />

x[n]<br />

x[0]<br />

x[2]<br />

x[4]<br />

x[6]<br />

x[1]<br />

x[3]<br />

x[5]<br />

x[7]<br />

2<br />

DFT N2<br />

DFT N2<br />

[ ]<br />

k<br />

X1 k N<br />

2<br />

X 0[0]<br />

X 0[1]<br />

X 0 [2]<br />

X 0 [3]<br />

X 1 [0]<br />

X 1[1]<br />

X 1[2]<br />

X 1 [3]<br />

W N 0<br />

W N 1<br />

W N 2<br />

W N 3<br />

W N 4<br />

W N 5<br />

W N 6<br />

W N 7<br />

Classic FFT structure<br />

“twiddle factors”:<br />

always apply to<br />

odd-terms output<br />

NOT mirror-image<br />

X[0]<br />

X[1]<br />

X[2]<br />

X[3]<br />

X[4]<br />

X[5]<br />

X[6]<br />

X[7]<br />

Dan Ellis 2012-11-28<br />

10<br />

Same as<br />

X[0..3]<br />

except for<br />

factors on<br />

X 1 [•]<br />

terms


Multiple DIT Stages<br />

If decomposing one DFTN into two<br />

smaller DFTN/2 ’s speeds things up ...<br />

Why not further divide into DFTN/4 ’s ?<br />

i.e. X[ k]<br />

= X [ 0 k ] N + W k [ N X1 k ] N<br />

make:<br />

0 ≤ k < N<br />

Similarly,<br />

X [ 0 k]<br />

= X00 k N<br />

Dan Ellis 2012-11-28<br />

11<br />

2<br />

[ ]<br />

0 ≤ k < N/2<br />

N/4-pt DFT of even points<br />

in even subset of x[n]<br />

X [ 1 k]<br />

= X10 k N<br />

4<br />

2<br />

[ ]<br />

k<br />

+ WN X01 k N<br />

2<br />

4<br />

N/4-pt DFT of odd points<br />

from even subset<br />

[ ]<br />

4<br />

[ ]<br />

k<br />

+ WN X11 k N<br />

2<br />

4


Two-Stage DIT Flowgraph<br />

x[0]<br />

x[4]<br />

x[2]<br />

x[6]<br />

x[1]<br />

x[5]<br />

x[3]<br />

x[7]<br />

different from before X same as before<br />

0[0]<br />

DFT W 0<br />

N/2<br />

N4 X00 X0 [1] W 0<br />

N<br />

X0[2] W 1<br />

N<br />

DFTN4 X01 X0[3] W 2<br />

N<br />

W 3<br />

W 3<br />

N/2<br />

N<br />

DFT N4<br />

DFT N4<br />

X 10<br />

X 11<br />

W 0<br />

N/2<br />

W 3<br />

N/2<br />

X 1[0]<br />

X 1[1]<br />

X 1 [2]<br />

X 1 [3]<br />

W N 4<br />

W N 5<br />

W N 6<br />

W N 7<br />

X[0]<br />

X[1]<br />

X[2]<br />

X[3]<br />

X[4]<br />

X[5]<br />

X[6]<br />

X[7]<br />

Dan Ellis 2012-11-28<br />

12


Multi-stage DIT FFT<br />

Can keep doing this until we get down<br />

to 2-pt DFTs:<br />

DFT 2<br />

X[0] = x[0] + x[1]<br />

X[1] = x[0] - x[1]<br />

≡<br />

“butterfly” element<br />

1 = W 2 0<br />

-1 = W 2 1<br />

→ N = 2 M-pt DFT reduces to M stages of<br />

twiddle factors & summation<br />

(O(N 2) part vanishes)<br />

→ real mults < M·4N , real adds < 2M·2N<br />

→ complexity ~ O(N·M) = O(N·log 2 N)<br />

Dan Ellis 2012-11-28<br />

13


FFT Implementation Details<br />

Basic butterfly (at any stage):<br />

XX0 [r]<br />

•<br />

XX1 [r]<br />

Can simplify:<br />

X X0 [r]<br />

XX1 [r]<br />

W r<br />

N -1<br />

W N r<br />

W N r+N/2<br />

XX [r]<br />

•<br />

2 cpx mults<br />

•<br />

XX [r+N/2]<br />

X X [r]<br />

r+<br />

WN N j<br />

2 = e<br />

2 ( r+N 2 )<br />

N<br />

2N /2<br />

j2r j<br />

= e N e N<br />

r<br />

= WN just one cpx mult!<br />

XX [r+N/2]<br />

i.e. SUB rather than ADD<br />

Dan Ellis 2012-11-28<br />

14


it-reversed indexing<br />

8-pt DIT FFT Flowgraph<br />

000<br />

100<br />

010<br />

110<br />

001<br />

101<br />

011<br />

111<br />

x[0]<br />

x[4]<br />

x[2]<br />

x[6]<br />

x[1]<br />

x[5]<br />

x[3]<br />

x[7]<br />

W 4<br />

W 4<br />

-1’s absorbed into summation nodes<br />

W N 0 disappears<br />

-<br />

-<br />

-<br />

-<br />

W8 W8 W8 ‘in-place’ algorithm: sequential stages<br />

Dan Ellis 2012-11-28<br />

15<br />

-<br />

-<br />

-<br />

-<br />

2<br />

3<br />

-<br />

-<br />

-<br />

-<br />

X[0]<br />

X[1]<br />

X[2]<br />

X[3]<br />

X[4]<br />

X[5]<br />

X[6]<br />

X[7]


FFT for Other Values of N<br />

Having N = 2 M meant we could divide<br />

each stage into 2 halves = “radix-2 FFT”<br />

Same approach works for:<br />

N = 3 M radix-3<br />

N = 4 M radix-4 - more optimized radix-2<br />

etc...<br />

Composite N = a·b·c·d → mixed radix<br />

(different N/r point FFTs at each stage)<br />

.. or just zero-pad to make N = 2 M<br />

Dan Ellis 2012-11-28<br />

16<br />

M


x n<br />

[ ] = 1 N<br />

Inverse FFT<br />

Recall IDFT:<br />

Thus:<br />

N1<br />

x[n] = 1<br />

N<br />

( ) *<br />

N1<br />

<br />

k=0<br />

only differences<br />

from forward DFT<br />

nk<br />

X[k]W N<br />

Nx * [n] = <br />

nk<br />

X[k]W N = X * [k]W N<br />

k=0<br />

Forward DFT of x′[n] = X *[k]| k=n<br />

i.e. <strong>time</strong> sequence made from spectrum<br />

N1<br />

<br />

k=0<br />

Hence, use FFT to calculate IFFT:<br />

N 1<br />

k=0<br />

X * [<br />

nk<br />

k]WN<br />

*<br />

Re{X[k]}<br />

Im{X[k]}<br />

pure real flowgraph<br />

Re Re<br />

1/N<br />

DFT<br />

Dan Ellis 2012-11-28<br />

17<br />

-1<br />

Im<br />

Im<br />

-1/N<br />

nk<br />

Re{x[n]}<br />

Im{x[n]}


3. <strong>Short</strong>-Time<br />

<strong>Fourier</strong> <strong>Transform</strong> (STFT)<br />

<strong>Fourier</strong> <strong>Transform</strong> (e.g. DTFT) gives<br />

spectrum of an entire sequence:<br />

How to see a <strong>time</strong>-varying spectrum?<br />

e.g. slow AM of a sinusoid carrier:<br />

x[ n]<br />

= 1 cos 2n <br />

cos 0n N <br />

-2<br />

0 200 400 600 800 1000<br />

Dan Ellis 2012-11-28<br />

19<br />

2<br />

1<br />

0<br />

-1<br />

x[n]<br />

n


<strong>Fourier</strong> <strong>Transform</strong> of AM Sine<br />

Spectrum of<br />

whole sequence<br />

indicates<br />

modulation<br />

indirectly...<br />

... as<br />

cancellation<br />

between<br />

closelytuned<br />

sines<br />

600<br />

400<br />

200<br />

Nsin2ºkn<br />

N<br />

-Nsin2º(k-1)n<br />

2 N<br />

-Nsin2º(k+1)n<br />

2 N<br />

X[k]<br />

0<br />

0 0.02 0.04 0.06 0.08<br />

Dan Ellis 2012-11-28<br />

20<br />

1<br />

0.5<br />

0<br />

-0.5<br />

-1<br />

1<br />

0.5<br />

0<br />

-0.5<br />

-1<br />

1<br />

0.5<br />

0<br />

-0.5<br />

N<br />

N/2<br />

2cAcB<br />

= cA+B<br />

+cA-B<br />

<br />

k/(N/2)<br />

-1<br />

0 128 256 384 512 640 768 896<br />

M


<strong>Fourier</strong> <strong>Transform</strong> of AM Sine<br />

Some<strong>time</strong>s we’d rather separate<br />

modulation and carrier:<br />

x[n] = A[n]cos! 0 n<br />

A[n] varies on a<br />

different (slower) <strong>time</strong>scale<br />

One approach:<br />

chop x[n] into short sub-sequences ..<br />

.. where slow modulator is ~ constant<br />

DFT spectrum of pieces → show variation<br />

Dan Ellis 2012-11-28<br />

21<br />

! 0<br />

A[n]<br />

!


FT of <strong>Short</strong> Segments<br />

Break up x[n] into successive, shorter<br />

chunks of length NFT , then DFT each:<br />

2<br />

1<br />

0<br />

-1<br />

-2<br />

0 128 256 384 512 640 768 896 1024 = N<br />

100<br />

50<br />

x[n]<br />

N FT<br />

= N/8<br />

x0 [n]<br />

X 0 [k]<br />

0<br />

0 64<br />

Shows amplitude modulation<br />

of ! 0 energy<br />

k<br />

x 1 [n]<br />

X 1 [k]<br />

x 2 [n]<br />

X 2 [k]<br />

x 3 [n]<br />

X 3 [k]<br />

x 4 [n]<br />

X 4 [k]<br />

k 0 = 0 · N FT<br />

2<br />

x 5 [n]<br />

X 5 [k]<br />

x 6 [n]<br />

X 6 [k]<br />

x 7 [n]<br />

X 7 [k]<br />

Dan Ellis 2012-11-28<br />

22<br />

k<br />

n


<strong>The</strong> Spectrogram<br />

Plot successive DFTs in <strong>time</strong>-frequency:<br />

X i[k]<br />

X[k,n]<br />

k<br />

X 0 [k] X 1 [k] X 2 [k] X 3 [k] X 4 [k] X 5 [k] X 6 [k] X 7 [k]<br />

15<br />

10<br />

5<br />

0<br />

0<br />

k k<br />

128 256 384 512 640 768 896 1024<br />

<strong>time</strong> hopsize (between successive frames)<br />

= 128 points<br />

This image is called the Spectrogram<br />

Dan Ellis 2012-11-28<br />

23<br />

n<br />

120<br />

100<br />

80<br />

60<br />

40<br />

20<br />

0


<strong>Short</strong>-Time <strong>Fourier</strong> <strong>Transform</strong><br />

Spectrogram = STFT magnitude<br />

plotted on <strong>time</strong>-frequency plane<br />

STFT is (DFT form):<br />

frequency<br />

index<br />

X[ k,n ] 0 = x[ n0 + n]<br />

w[n] e<br />

<strong>time</strong><br />

index<br />

N FT 1<br />

<br />

n=0<br />

N FT points of x<br />

starting at n<br />

j 2kn<br />

N FT<br />

window DFT<br />

kernel<br />

intensity as a function of <strong>time</strong> & frequency<br />

Dan Ellis 2012-11-28<br />

24


DTFT<br />

form of<br />

STFT<br />

STFT Window Shape<br />

w[n] provides ‘<strong>time</strong> localization’ of STFT<br />

e.g. rectangular<br />

w[n]<br />

selects x[n], n 0 ≤ n < n 0 +N W<br />

But: resulting spectrum has same<br />

problems as windowing for FIR design:<br />

X e j ( ,n ) 0 = DTFT{ x[ n0 + n]<br />

w[ n]<br />

}<br />

= e jn <br />

<br />

0 j<br />

X( j( )<br />

e )W( e )d<br />

<br />

spectrum of short-<strong>time</strong> window<br />

is convolved with (twisted) parent spectrum<br />

Dan Ellis 2012-11-28<br />

25<br />

n


STFT Window Shape<br />

e.g. if x[n] is a pure sinusoid,<br />

X(e<br />

<br />

j) W(ej) blurring (mainlobe)<br />

+ ghosting (sidelobes)<br />

Hence, use tapered window for w[n]<br />

e.g. Hamming<br />

w[ n]<br />

=<br />

0.54 + 0.46 cos(2 n<br />

2M +1 )<br />

<br />

<br />

w[n] W(e j)<br />

Dan Ellis 2012-11-28<br />

26<br />

<br />

<br />

sidelobes<br />

< -40 dB<br />

n<br />

<br />

-10 -5 0 5 10


STFT Window Length<br />

Can illustrate <strong>time</strong>-frequency tradeoff<br />

on the <strong>time</strong>-frequency plane:<br />

k<br />

250<br />

200<br />

150<br />

100<br />

50<br />

0<br />

1<br />

0.5<br />

0<br />

0 100 200 300<br />

Alternate tilings<br />

of <strong>time</strong>-freq:<br />

n<br />

disks show ‘blurring’<br />

due to window length;<br />

area of disk is constant<br />

→ Uncertainty principle:<br />

±f·±t ≥ k<br />

half-length window → half as many DFT samples<br />

Dan Ellis 2012-11-28<br />

28


freq / Hz<br />

freq / Hz<br />

0.1<br />

0<br />

4000<br />

3000<br />

2000<br />

1000<br />

4000<br />

3000<br />

2000<br />

1000<br />

Spectrograms of Real Sounds<br />

0<br />

2.35 2.4 2.45 2.5 2.55 2.6<br />

0<br />

0 0.5 1 1.5 2 2.5<br />

<strong>time</strong> / s<br />

<strong>time</strong> / s<br />

<strong>time</strong>-domain<br />

successive<br />

short<br />

DFTs<br />

individual t-f<br />

cells merge<br />

into continuous<br />

image<br />

Dan Ellis 2012-11-28<br />

29<br />

10<br />

0<br />

-10<br />

-20<br />

-30<br />

-40<br />

-50<br />

intensity / dB


Window = 256 pt<br />

Narrowband<br />

Window = 48 pt<br />

Wideband<br />

Narrowband vs. Wideband<br />

Effect of varying window length:<br />

freq / Hz<br />

freq / Hz<br />

0.2<br />

0<br />

4000<br />

3000<br />

2000<br />

1000<br />

0<br />

4000<br />

3000<br />

2000<br />

1000<br />

0<br />

1.4 1.6 1.8 2 2.2 2.4 2.6<br />

<strong>time</strong> / s<br />

Dan Ellis 2012-11-28<br />

30<br />

10<br />

0<br />

-10<br />

-20<br />

-30<br />

-40<br />

-50<br />

level<br />

/ dB<br />

M


Frequency<br />

8000<br />

6000<br />

4000<br />

2000<br />

Spectrogram in Matlab<br />

>> [d,sr]=wavread(’mpgr1_sx419.wav');<br />

>> Nw=256;<br />

>> specgram(d,Nw,sr)<br />

>> caxis([-80 0])<br />

>> colorbar<br />

0<br />

(hann) window length<br />

actual sampling rate<br />

(to label <strong>time</strong> axis)<br />

0.5 1 1.5<br />

Time<br />

2 2.5 3<br />

Dan Ellis 2012-11-28<br />

31<br />

0<br />

-20<br />

-40<br />

-60<br />

-80<br />

dB


just one freq.<br />

STFT as a Filterbank<br />

Consider one ‘row’ of STFT:<br />

Dan Ellis<br />

X k n 0<br />

where<br />

N1<br />

<br />

n=0<br />

N1<br />

[ ] = x[ n0 + n]<br />

w[ n]<br />

e<br />

( )<br />

<br />

= h [ k m]x<br />

n0 m<br />

m=0<br />

2012-11-28<br />

[ ]<br />

h k n<br />

[ ] = w n<br />

[ ] e<br />

j 2kn<br />

N<br />

j 2kn<br />

N<br />

Each STFT row is output of a filter<br />

(subsampled by the STFT hop size)<br />

Im{x[n]}<br />

1<br />

0<br />

-1<br />

-60<br />

convolution<br />

with<br />

complex IR<br />

-40<br />

n<br />

-20<br />

32<br />

0 -1<br />

0<br />

1<br />

Re{x[n]}


STFT as a Filterbank<br />

If<br />

then<br />

h [ k n]<br />

= w[ ( )n]<br />

e<br />

Hk e j<br />

( ) = W e <br />

j 2kn<br />

N<br />

( ) j 2k<br />

( ( N ) ) shift-in-!<br />

Each STFT row is the same bandpass<br />

response defined by W(ej!), frequency-shifted to a given DFT bin:<br />

W(ej)<br />

<br />

H1(ej) H2(ej) • • •<br />

A bank of identical,<br />

frequency-shifted<br />

bandpass filters:<br />

“filterbank”<br />

Dan Ellis 2012-11-28<br />

33


STFT Analysis-Synthesis<br />

IDFT of STFT frames can reconstruct<br />

(part of) original waveform<br />

e.g. if<br />

then<br />

X[ k,n ] 0 = DFT x[ n0 + n]<br />

w[ n]<br />

{ } = x[ n0 + n]<br />

w n<br />

IDFT X[ k,n ] 0<br />

{ }<br />

Can shift by n0 , combine, to get x[n]: ^<br />

^x[n]<br />

[ ]<br />

Could divide by w[n-n 0 ] to recover x[n]...<br />

Dan Ellis 2012-11-28<br />

34<br />

n0<br />

x[n]·w[n-n 0]<br />

n


STFT Analysis-Synthesis<br />

Dividing by small values of w[n] is bad<br />

Prefer to<br />

overlap windows:<br />

i.e. sample X[k,n 0 ]<br />

at n 0 = r·H where H = N/2 (for example)<br />

<strong>The</strong>n<br />

^x[n]<br />

hopsize window length<br />

x ˆ [ n]<br />

= x[ n]w[<br />

n rH]<br />

r<br />

= x[ n]<br />

if w[ n rH]<br />

=1<br />

Dan Ellis 2012-11-28<br />

35<br />

r<br />

x[n]·w[n-r·H]<br />

n


STFT Analysis-Synthesis<br />

Hann or Hamming windows<br />

with 50% overlap<br />

1<br />

sum to constant<br />

0.54 + 0.46cos(2 n N )<br />

( )<br />

N n 2<br />

N )<br />

( ) =1.08 0.4<br />

0.2<br />

0<br />

0 20 40 60 80<br />

+ 0.54 + 0.46cos(2<br />

Can modify individual frames of X[k,n]<br />

and then reconstruct<br />

complex, <strong>time</strong>-varying modifications<br />

tapered overlap makes things OK<br />

Dan Ellis 2012-11-28<br />

36<br />

0.8<br />

0.6<br />

w[n] + w[n-N/2]<br />

w[n] w[n-N/2]<br />

n


STFT Analysis-Synthesis<br />

e.g. Noise reduction:<br />

STFT of<br />

original speech<br />

Speech corrupted<br />

by white noise<br />

Energy threshold<br />

mask<br />

k<br />

100 200 300<br />

Dan Ellis 2012-11-28<br />

37<br />

120<br />

100<br />

80<br />

60<br />

40<br />

20<br />

r<br />

M

Hooray! Your file is uploaded and ready to be published.

Saved successfully!

Ooh no, something went wrong!