Oscillations, Waves, and Interactions - GWDG


Oscillations, Waves and Interactions, pp. 25–36
edited by T. Kurz, U. Parlitz, and U. Kaatze
Universitätsverlag Göttingen (2007) ISBN 978-3-938616-96-3
urn:nbn:de:gbv:7-verlag-1-02-4

Speech research with physical methods

Hans Werner Strube

Drittes Physikalisches Institut, Georg-August-Universität Göttingen
Friedrich-Hund-Platz 1, 37077 Göttingen, Germany

Abstract. An overview of some recent work in speech research at the Dritte Physikalische Institut is given, especially of investigations from a cooperation between physics and phoniatrics that concern the analysis of pathologic voices by acoustic and optical means. The main novel points are the extension of our own previously published acoustic methods to running speech and new high-speed video methods.

1 Overview

Recent work at the Dritte Physikalische Institut may be divided into two thematic fields:

• Work related to speech recognition.

• Acoustic analysis of pathologic voices, extended to running speech.

Only the second thematic field, which was carried out as a cooperative project between Prof. Eberhard Kruse (Department of Phoniatrics and Paedaudiology, University of Göttingen) and our group, will be described in more detail here.

2 Work related to speech recognition

This research concerned, on the one hand, methods appropriate for preprocessing in speech recognition, such as novel approaches to noise reduction, employing filtering in the modulation-frequency domain [1,2], and to speaker normalization, starting from acoustic estimation of speaker-specific measures of the vocal tract [3]. On the other hand, there were recent investigations concerning speech recognition itself: first, Hidden Markov Model (HMM) based recognition for “endless” signals with continuous forming of hypothesis graphs [4,5] and noise-robust speech/nonspeech distinction based on modulation filtering [6] (partially carried out at DaimlerChrysler); second, exploitation of prosodic features (measures of pitch and loudness) to improve semantic recognition in the context of a natural speech dialog platform [7,8] (partially done at Bosch GmbH).
