27.12.2012 Views

Oscillations, Waves, and Interactions - GWDG


3.4.2 Analysis of running speech


The analysis was carried out only on connected intervals that were classified as voiced and had a minimum length of 70 ms. The intervals are weighted by length, since longer intervals are more expressive. In these intervals period markers are set (Fig. 3 (11)) and acoustic quantities are determined, for instance:

• period lengths by the waveform matching algorithm;
• jitter (3 definitions), shimmer (3 definitions), MWC, GNE.
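As a rough illustration of the perturbation measures listed above, the following sketch computes one commonly used relative measure: the mean absolute difference of consecutive values divided by their mean. Applied to period lengths it gives a jitter value, applied to peak amplitudes a shimmer value. The text does not state which three definitions are used, so this definition, the function name `relative_perturbation`, and the sample numbers are assumptions chosen for illustration only.

```python
def relative_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean.

    Assumed definition for illustration; the text mentions three definitions
    each for jitter and shimmer without spelling them out.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


# Hypothetical period lengths (ms) and peak amplitudes of one voiced interval.
periods_ms = [8.0, 8.2, 7.9, 8.1]
amplitudes = [1.0, 0.9, 1.1, 1.0]

jitter = relative_perturbation(periods_ms)
shimmer = relative_perturbation(amplitudes)
print(f"jitter = {jitter:.4f}, shimmer = {shimmer:.4f}")
```

A perfectly periodic signal with constant amplitude yields zero for both values under this definition.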

From these quantities a GHD can again be constructed. The position of voices in the GHD differs between running speech and stationary vowels, so that a new calibration is required to obtain comparable representations for both cases. The definition of the axes, which is based on a principal-component analysis in a high-dimensional space, has to be carried out anew. Here, the same underlying quantities were chosen for reasons of consistency, but their weighting was different. The new GHD is called “GHDT”, “T” meaning “text”. The coordinates in the GHDT are averaged over the analyzed intervals of the text utterance, weighted by their lengths. Because of sound dependence, the variances of the measurement points in the GHD are of course larger than for stationary vowels, but the mean values retain their expressiveness. The consistency of the GHDT was checked with various normal and pathological voices and different utterances.
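The length-weighted averaging of the diagram coordinates can be sketched as follows. The 70 ms minimum length is taken from the text; the function name `weighted_mean_coordinates`, the data layout (a list of `(length_ms, coordinates)` pairs), and the sample values are assumptions made for this illustration.

```python
MIN_LENGTH_MS = 70.0  # minimum voiced-interval length stated in the text


def weighted_mean_coordinates(intervals, min_length_ms=MIN_LENGTH_MS):
    """Average per-interval coordinates, weighting each interval by its length.

    intervals: list of (length_ms, coords) pairs, where coords is a tuple of
    diagram coordinates measured on that voiced interval (hypothetical layout).
    Intervals shorter than min_length_ms are ignored.
    """
    kept = [(length, coords) for length, coords in intervals
            if length >= min_length_ms]
    total = sum(length for length, _ in kept)
    if total <= 0:
        raise ValueError("no voiced interval of sufficient length")
    dims = len(kept[0][1])
    return tuple(
        sum(length * coords[d] for length, coords in kept) / total
        for d in range(dims)
    )


# Hypothetical measurements: the 50 ms interval is discarded as too short.
measurements = [
    (100.0, (1.0, 2.0)),
    (300.0, (3.0, 4.0)),
    (50.0, (100.0, 100.0)),
]
print(weighted_mean_coordinates(measurements))
```

With these sample values the long interval dominates the result, which reflects the stated idea that longer intervals should count more.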

Besides the GHD, the automatic voiced/unvoiced classification can also be applied to other diagnostically useful quantities in order to extend their usage to running speech. This concerns, for instance, the Pitch Amplitude (PA; first maximum of the autocorrelation function of the prediction error signal) and the Spectral Flatness Ratio (SFR; logarithm of the ratio of geometric and arithmetic means of the spectral energy density of the prediction error signal).
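The two definitions above can be sketched in a few lines, assuming the prediction error signal (or its spectral energy density) has already been computed elsewhere. Several details are assumptions not given in the text: the base-10 logarithm, the normalization of the autocorrelation by its value at lag 0, and the reading of "first maximum" as the first local maximum at a lag greater than zero. The function names are hypothetical.

```python
import math


def spectral_flatness_ratio(power_spectrum):
    """log10(geometric mean / arithmetic mean) of a strictly positive spectrum.

    The logarithm base is an assumption; the text only says "logarithm".
    """
    if not power_spectrum or any(p <= 0 for p in power_spectrum):
        raise ValueError("spectrum must be non-empty and strictly positive")
    n = len(power_spectrum)
    log_geometric = sum(math.log10(p) for p in power_spectrum) / n
    arithmetic = sum(power_spectrum) / n
    return log_geometric - math.log10(arithmetic)


def pitch_amplitude(residual):
    """Return (lag, value) of the first local maximum of the normalized
    autocorrelation at lag > 0, or None if there is none."""
    n = len(residual)
    energy = sum(v * v for v in residual)
    if energy == 0:
        raise ValueError("signal has zero energy")
    acf = [
        sum(residual[i] * residual[i + lag] for i in range(n - lag)) / energy
        for lag in range(n)
    ]
    for lag in range(1, n - 1):
        if acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]:
            return lag, acf[lag]
    return None


# Hypothetical residual: a pulse train with a period of 5 samples.
pulses = [1.0, 0.0, 0.0, 0.0, 0.0] * 4
print(pitch_amplitude(pulses))
print(spectral_flatness_ratio([2.0, 2.0, 2.0, 2.0]))
```

Under these assumptions a flat spectrum gives an SFR of zero and a peaked spectrum a negative value, while a strongly periodic residual gives a PA close to one.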

Based on the acoustic quantities, group analyses of various phonation mechanisms and cancer groups (significant group separation) can be conducted. For preliminary and recent presentations of methods and results see Refs. [25–28].

So far, no phonemes had to be recognized, only their linguistic (not actual) voicedness. Meanwhile, the perceptron method has been extended to recognition of the six stationary vowels, using 6 output cells. Training was done with 8192 vowels of at least 2 s duration from all kinds of voice quality. This can help to further automate the determination of voice quality.
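How a decision is read from six output cells can be sketched as below. Only the number of output cells comes from the text. The single-layer structure, the input features, the weights, and the function name `classify_vowel` are assumptions for illustration, and the vowels are referred to by index because the text does not name them here.

```python
NUM_OUTPUT_CELLS = 6  # one output cell per stationary vowel, as stated in the text


def classify_vowel(features, weights, biases):
    """Return the index of the output cell with the largest activation.

    weights: one row of input weights per output cell (hypothetical values);
    biases: one bias per output cell. A single layer is assumed here; the
    text does not specify the network structure.
    """
    if len(weights) != NUM_OUTPUT_CELLS or len(biases) != NUM_OUTPUT_CELLS:
        raise ValueError("expected six output cells")
    scores = [
        sum(w * f for w, f in zip(row, features)) + b
        for row, b in zip(weights, biases)
    ]
    return max(range(NUM_OUTPUT_CELLS), key=scores.__getitem__)


# Hypothetical toy network: each output cell responds to one input feature.
identity = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
zero_bias = [0.0] * 6
print(classify_vowel([0.1, 0.0, 0.2, 0.9, 0.0, 0.1], identity, zero_bias))
```

In a trained network the weights would of course result from the training procedure rather than being set by hand.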

3.5 Analysis of glottal oscillation<br />

The voice pathologies are related to the functioning of the vocal folds, which form a self-oscillating nonlinear mechanical and aerodynamic system driven by the glottal air flow. In order to relate the acoustic voice characteristics to properties of the glottal oscillation, the oscillation must be recorded (if possible, automatically) and characterized by a few quantities. Here, acoustic as well as optical methods are employed. These methods have not yet been extended to running speech, but the only essential difficulty in doing so appears to be the large amount of data that then arises.
