18.07.2013 Views

Time Domain Methods in Speech Processing General Synthesis ...


Short-Time Energy, Magnitude, ZC

Summary of Simple Time Domain Measures

[Block diagram: x(n) → Linear Filter → T[ ] → Lowpass Filter, w(n) → $Q_{\hat{n}}$; intermediate signal labels: $\hat{x}(n)$, T[x(n)]]

$$Q_{\hat{n}} = \sum_{m=-\infty}^{\infty} T(x[m])\, w[\hat{n}-m]$$

1. Energy:

$$E_{\hat{n}} = \sum_{m=\hat{n}-L+1}^{\hat{n}} x^{2}[m]\, w[\hat{n}-m]$$

• can downsample $E_{\hat{n}}$ at a rate commensurate with the window bandwidth

2. Magnitude:

$$M_{\hat{n}} = \sum_{m=\hat{n}-L+1}^{\hat{n}} \lvert x[m] \rvert\, w[\hat{n}-m]$$

3. Zero Crossing Rate:

$$Z_{\hat{n}} = z_1 = \frac{1}{2L} \sum_{m=\hat{n}-L+1}^{\hat{n}} \bigl\lvert \operatorname{sgn}(x[m]) - \operatorname{sgn}(x[m-1]) \bigr\rvert$$

where

$$\operatorname{sgn}(x[m]) = \begin{cases} 1, & x[m] \ge 0 \\ -1, & x[m] < 0 \end{cases}$$
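The three measures summarized above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function name `short_time_measures`, the zero-padding outside the signal, and the default rectangular window are assumptions made for the example, not part of the slides.

```python
def sgn(v):
    """Sign convention used in the text: +1 for v >= 0, -1 for v < 0."""
    return 1 if v >= 0 else -1


def short_time_measures(x, n_hat, L, w=None):
    """Return (energy, magnitude, zero-crossing rate) at analysis time n_hat.

    w[j] weights the sample x[n_hat - j] for j = 0..L-1. A rectangular
    window (all ones) is assumed when w is None. Samples outside the
    signal are treated as zero (an assumption of this sketch).
    """
    if w is None:
        w = [1.0] * L

    def sample(m):
        return x[m] if 0 <= m < len(x) else 0.0

    energy = magnitude = crossings = 0.0
    for m in range(n_hat - L + 1, n_hat + 1):
        weight = w[n_hat - m]
        energy += sample(m) ** 2 * weight
        magnitude += abs(sample(m)) * weight
        # The zero-crossing sum is unweighted; the 1/(2L) factor below
        # plays the role of a rectangular window.
        crossings += abs(sgn(sample(m)) - sgn(sample(m - 1)))
    return energy, magnitude, crossings / (2 * L)


# An alternating sequence changes sign at every sample.
x = [1, -1, 1, -1, 1]
E, M, Z = short_time_measures(x, n_hat=4, L=4)
print(E, M, Z)
```

Each sign change contributes 2 to the sum, which is why the normalization is 1/(2L): the result is the number of zero crossings per sample within the window.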

Periodic Signals

• for a periodic signal with period P we have (at least in theory) Φ[P] = Φ[0], so the period can be estimated as the location of the first maximum of Φ[k] at a non-zero lag k

– this means that the autocorrelation function is a good candidate for speech pitch detection algorithms

– it also means that we need a good way of measuring the short-time autocorrelation function for speech signals
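As a rough illustration of the idea, the sketch below estimates the period of a synthetic periodic signal by searching for the largest autocorrelation value over a range of candidate lags. The test signal, the lag range 20 to 60, and the function names are assumptions chosen for the example; a practical pitch detector would need considerably more care.

```python
import math


def autocorr(x, k):
    """Autocorrelation of a finite segment at lag k >= 0."""
    return sum(x[m] * x[m + k] for m in range(len(x) - k))


def estimate_period(x, k_min, k_max):
    """Return the lag in [k_min, k_max] with the largest autocorrelation."""
    return max(range(k_min, k_max + 1), key=lambda k: autocorr(x, k))


# Synthetic periodic signal: fundamental plus second harmonic, period 40.
P = 40
x = [math.sin(2 * math.pi * n / P) + 0.5 * math.sin(4 * math.pi * n / P)
     for n in range(400)]

period = estimate_period(x, k_min=20, k_max=60)
print(period)
```

Restricting the search to a plausible lag range excludes the trivial maximum at k = 0. Because the segment is finite, fewer products enter the sum as k grows, so peaks at multiples of the period taper off rather than staying equal to Φ[0].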


Issues in ZC Rate Computation

• for zero crossing rate to be accurate, need zero DC in signal => need to remove offsets, hum, noise => use bandpass filter to eliminate DC and hum

• can quantize the signal to 1-bit for computation of ZC rate

• can apply the concept of ZC rate to bandpass filtered speech to give a ‘crude’ spectral estimate in narrow bands of speech (kind of gives an estimate of the strongest frequency in each narrow band of speech)
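The first two points can be illustrated with a small sketch. Note the substitution: instead of the bandpass filter the slide recommends, this example simply subtracts the segment mean, which removes a constant offset but not hum. The test signal and function names are likewise assumptions for illustration.

```python
import math


def sgn(v):
    """+1 for v >= 0, -1 for v < 0."""
    return 1 if v >= 0 else -1


def zero_crossing_rate(x):
    """Zero crossings per sample over the adjacent pairs of the segment."""
    pairs = len(x) - 1
    total = sum(abs(sgn(x[m]) - sgn(x[m - 1])) for m in range(1, len(x)))
    return total / (2 * pairs)


# Sinusoid with 20 samples per cycle: two crossings per cycle, 0.1 per sample.
clean = [math.sin(2 * math.pi * n / 20 + 0.1) for n in range(201)]

# A DC offset larger than the amplitude hides every zero crossing.
offset = [v + 1.5 for v in clean]

# Stand-in for the bandpass filter: subtract the segment mean.
mean = sum(offset) / len(offset)
centered = [v - mean for v in offset]

# 1-bit quantization keeps only the sign, which is all the ZC rate uses.
one_bit = [sgn(v) for v in centered]

print(zero_crossing_rate(clean))
print(zero_crossing_rate(offset))
print(zero_crossing_rate(centered))
print(zero_crossing_rate(one_bit))
```

The offset signal reports no crossings at all, while the mean-removed and 1-bit versions agree with the clean signal, which is the behavior the bullets describe.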


Short-Time Autocorrelation

- for a deterministic signal, the autocorrelation function is defined as:

$$\Phi[k] = \sum_{m=-\infty}^{\infty} x[m]\, x[m+k]$$

- for a random or periodic signal, the autocorrelation function is:

$$\Phi[k] = \lim_{L\to\infty} \frac{1}{2L+1} \sum_{m=-L}^{L} x[m]\, x[m+k]$$

- if $x[n] = x[n+P]$, then $\Phi[k] = \Phi[k+P]$ => the autocorrelation function preserves periodicity

- properties of $\Phi[k]$:

1. $\Phi[k]$ is even, $\Phi[k] = \Phi[-k]$

2. $\Phi[k]$ is maximum at $k = 0$, $\lvert \Phi[k] \rvert \le \Phi[0]$, $\forall k$

3. $\Phi[0]$ is the signal energy, or the power (for random signals)
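The three properties can be checked numerically on a short finite-energy sequence. The sketch below uses the deterministic definition with the signal taken as zero outside its support; the sample values are arbitrary and chosen only for illustration.

```python
def autocorr(x, k):
    """Deterministic autocorrelation: sum of x[m] * x[m + k], x zero elsewhere."""
    N = len(x)
    return sum(x[m] * x[m + k] for m in range(N) if 0 <= m + k < N)


x = [3.0, -1.0, 2.0, 0.5, -2.5]
phi = {k: autocorr(x, k) for k in range(-4, 5)}

# Property 1: even symmetry.
is_even = all(phi[k] == phi[-k] for k in range(1, 5))

# Property 2: the maximum magnitude occurs at lag 0.
peak_at_zero = all(abs(phi[k]) <= phi[0] for k in phi)

# Property 3: the lag-0 value equals the signal energy.
energy = sum(v * v for v in x)

print(is_even, peak_at_zero, phi[0], energy)
```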

Short-Time Autocorrelation

- a reasonable definition for the short-time autocorrelation is:

$$R_{\hat{n}}[k] = \sum_{m=-\infty}^{\infty} x[m]\, w[\hat{n}-m]\, x[m+k]\, w[\hat{n}-k-m]$$

1. select a segment of speech by windowing

2. compute deterministic autocorrelation of the windowed speech

- symmetry:

$$R_{\hat{n}}[k] = R_{\hat{n}}[-k] = \sum_{m=-\infty}^{\infty} x[m]\, x[m-k] \bigl[ w[\hat{n}-m]\, w[\hat{n}+k-m] \bigr]$$

- define filter of the form

$$h_k[\hat{n}] = w[\hat{n}]\, w[\hat{n}+k]$$

- this enables us to write the short-time autocorrelation in the form:

$$R_{\hat{n}}[k] = \sum_{m=-\infty}^{\infty} x[m]\, x[m-k]\, h_k[\hat{n}-m]$$

- the value of $R_{\hat{n}}[k]$ at time $\hat{n}$ for the $k$-th lag is obtained by filtering the sequence $x[\hat{n}]\, x[\hat{n}-k]$ with a filter with impulse response $h_k[\hat{n}]$
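To see that the windowed definition and the filter form agree, the following sketch evaluates both for the same signal, window, and analysis time. The Hamming window, the test signal, and the helper names (`stac_direct`, `stac_filter`, `getter`) are assumptions of this example; the window is taken to be nonzero only for indices 0 to L-1.

```python
import math


def hamming(L):
    """L-point Hamming window, nonzero for n = 0..L-1."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (L - 1)) for n in range(L)]


def getter(seq):
    """Index a sequence, returning 0 outside its support."""
    return lambda i: seq[i] if 0 <= i < len(seq) else 0.0


def stac_direct(x, w, n_hat, k):
    """Window the signal, then autocorrelate (the definition)."""
    X, W = getter(x), getter(w)
    # w[n_hat - m] is nonzero only for n_hat - L + 1 <= m <= n_hat.
    return sum(X(m) * W(n_hat - m) * X(m + k) * W(n_hat - k - m)
               for m in range(n_hat - len(w) + 1, n_hat + 1))


def stac_filter(x, w, n_hat, k):
    """Filter the product x[m] x[m - k] with h_k[n] = w[n] w[n + k]."""
    X, W = getter(x), getter(w)

    def h_k(n):
        return W(n) * W(n + k)

    return sum(X(m) * X(m - k) * h_k(n_hat - m)
               for m in range(n_hat - len(w) + 1, n_hat + 1))


x = [math.sin(2 * math.pi * n / 25) + 0.3 * math.cos(2 * math.pi * n / 7)
     for n in range(200)]
w = hamming(64)
n_hat = 150

for k in range(0, 4):
    print(k, stac_direct(x, w, n_hat, k), stac_filter(x, w, n_hat, k))
```

The two forms differ only by the change of summation index from m to m + k, so they should agree up to floating-point rounding. The filter form is the one that suggests computing each lag as a separately filtered product sequence.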
