Tuesday afternoon, 11 November - The Acoustical Society of America
aries on a transcript. Intertranscriber agreement rates across subsets of
17–40 subjects are significantly above chance based on Fleiss’ statistic, indicating
that listeners’ perception of prosody is reliable, with higher agreement
rates for boundary perception than for prominence. Prosody perception
varies across listeners (both corpora) and across speakers (WMD), where
perceived prosody varies for the same utterance produced by different
speakers. Acoustic measures from stressed vowels (Buckeye: duration, intensity,
F1, F2) and articulatory kinematic measures (WMD) are correlated
with the perceived prosodic features of the word. [Work supported by NSF.]
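The agreement statistic mentioned above can be illustrated with a minimal sketch of Fleiss' kappa. The function name, the table layout, and the toy data are assumptions made for illustration; they are not taken from the study.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i (e.g., a word) to category j
    (e.g., "boundary" vs "no boundary"). Every item must have been
    rated by the same number of raters (at least two)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")
    n_cats = len(counts[0])

    # Overall proportion of all ratings that fall in each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_cats)]

    # Observed agreement per item: fraction of rater pairs that agree.
    p_item = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
              for row in counts]

    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    # Undefined (division by zero) if every rating is in one category.
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical toy table: 4 words, 4 listeners, 2 categories.
toy_counts = [[4, 0], [0, 4], [3, 1], [1, 3]]
kappa = fleiss_kappa(toy_counts)
```

Values near 1 indicate agreement well above chance, and values near 0 indicate agreement at about chance level.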
2pSC15. Perception of contrastive meaning through the L+H* L-H%
contour. Heeyeon Y. Dennison, Amy J. Schafer, and Victoria B. Anderson
(Dept. of Linguist., Univ. of Hawaii, 1890 East-West Rd., Honolulu, HI
96822, linguist@hawaii.edu)
This study establishes empirical evidence regarding listeners’ perceptions
of the contrastive tune L+H* L-H% (e.g., Lee et al., 2007). Eighteen
native English speakers heard three types of test sentences: (1) contrastive,
“The mailbox was[L+H*] full[L-H%],” (2) positive neutral, “The
mailbox[H*] was full[H* L-L%];” and (3) negated neutral, “The
mailbox[H*] was not[H*] full[H* L-L%].” The participants first scored
them by naturalness, and then typed continuation sentences based on the
perceived meaning. Three other native English speakers independently
coded the continuations to evaluate participants’ interpretations of the test
sentences. The results clearly demonstrated that the L+H* L-H% tune generated
contrastive meanings (e.g., “…but the mailman took the mail and
now it is empty”) significantly more often than both the positive and negative
neutral counterparts. Moreover, sentences presented in the contrastive tune
were perceived as natural utterances. High coder agreement indicated a reliable
function of the contrastive tune, conforming to the existing literature
based on intuitive examples (e.g., Lee, 1999). Interestingly, however, the
contrastive tune produced the expected contrastive meaning in only about
60% of trials (versus less than 10% contrastive continuations for the other
contours). This finding shows that the interpretation of the L+H* L-H%
contour is more complex than previously suggested.
2pSC16. Order of presentation asymmetry in intonational contour
discrimination in English. Hyekyung Hwang (Dept. of Linguist., McGill
Univ., 1085 Dr. Penfield Ave., Montreal PQ H3A 1A7, Canada,
hye.hwang@mail.mcgill.ca), Amy J. Schafer, and Victoria B. Anderson
(Univ. of Hawaii, Honolulu, HI 96822)
In the work of Hwang et al. (2007), native English speakers showed
overall poor accuracy in distinguishing initially rising versus level (e.g.,
L* L* H- H* L-L% vs L* L* L- H* L-L%) or initially falling versus level (e.g.,
H* H* L- H* L-L% vs H* H* H- H* L-L%) contour contrasts on English
phrases in an AX discrimination task. Results not reported in that paper
showed that discrimination was easier when a more complex F0 contour
occurred second than when it occurred first. Several order-of-presentation
effects in the perception of intonation have been reported (e.g., L. Morton
1997; S. Lintfert 2003; Cummins et al. 2006), but no satisfying account
has been provided. This study investigated these asymmetries more
systematically. The order effect was significant for falling-level contrast
pairs: pairs with a more complex F0 contour last were discriminated more
easily than the reverse order. Rising versus level contrasts showed a similar
tendency. The results thus extend intonational discrimination asymmetries to
these additional contours. They suggest that the cause of the asymmetries
may depend more on F0 complexity than on F0 peak.
2pSC17. Alternatives to f0 turning points in American English
intonation. Jonathan Barnes (Dept. of Romance Studies, Boston Univ., 621
Commonwealth Ave., Boston, MA 02215, jabarnes@bu.edu), Nanette
Veilleux (Dept. of Comput. Sci., Simmons College, Boston, MA 02115,
veilleux@simmons.edu), Alejna Brugos (Boston Univ., Boston, MA 02215,
abrugos@bu.edu), and Stefanie Shattuck-Hufnagel (Res. Lab of Electrons,
MIT, Cambridge, MA 02139, stef@speech.mit.edu)
Since the inception of the autosegmental-metrical approach to intonation
(Bruce 1977, Pierrehumbert 1980, Ladd 1996), the location and scaling of f0
turning points have been used to characterize phonologically distinct f0 contours
in various languages, including American English. This approach is
undermined, however, by the difficulty listeners experience in perceiving
differences in turning point location. Numerous studies have demonstrated
either listener insensitivity to changes in turning point location or the capacity
for other aspects of contour “shape” to override turning-point alignment
for contour identification (Chen 2003, D’Imperio 2000, Niebuhr 2008).
Even labelers with access to visual representations of the f0 encounter similar
challenges. By contrast, a family of related measurements using area under
the f0 curve to quantify differences in contour shape appears more robust.
For example, a measure of the synchronization of the center of gravity of the
accentual rise with the boundaries of the accented vowel yields 93.9% correct
classification in a logistic regression model on a data set of 115 labeled
utterances differing in pitch accent type (L*+H vs L+H* in ToBI
terminology). This classification proceeds entirely without explicit reference
to the turning points (i.e., beginning of rise, peak) traditionally used to characterize
this distinction.
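As a rough illustration of the kind of area-based measure described above, the following sketch computes a center of gravity of an f0 track and expresses it relative to the accented vowel. The weighting by f0 above the window minimum, the function names, and the sample values are all assumptions; the abstract does not specify the authors' exact formulation.

```python
def tonal_center_of_gravity(times, f0, floor=None):
    """Time point (same units as `times`) at which the f0 'mass' of the
    window is centered. Each sample is weighted by its f0 height above
    `floor` (assumed here to default to the window minimum)."""
    if not times or len(times) != len(f0):
        raise ValueError("times and f0 must be non-empty and equal length")
    if floor is None:
        floor = min(f0)
    weights = [max(f - floor, 0.0) for f in f0]
    total = sum(weights)
    if total == 0:
        # Completely flat contour: fall back to the window midpoint.
        return (times[0] + times[-1]) / 2
    return sum(t * w for t, w in zip(times, weights)) / total


def relative_alignment(tcog, vowel_start, vowel_end):
    """Position of the center of gravity within the accented vowel:
    0 = vowel onset, 1 = vowel offset, > 1 = after the vowel."""
    return (tcog - vowel_start) / (vowel_end - vowel_start)


# Hypothetical f0 samples (seconds, Hz) for a late-rising accent.
times = [0.0, 0.1, 0.2, 0.3]
f0 = [100.0, 100.0, 120.0, 180.0]
tcog = tonal_center_of_gravity(times, f0)
alignment = relative_alignment(tcog, vowel_start=0.1, vowel_end=0.3)
```

A single alignment value of this kind could then serve as the predictor in a logistic regression, with no peak or rise-onset labels required.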
2pSC18. Comparison of a child’s fundamental frequencies during
structured and unstructured activities: A case study. Eric Hunter (Natl.
Ctr. for Voice and Speech, 1101 13th St., Denver, CO 80126,
eric.hunter@ncvs2.org)
This case study investigates the difference between children’s fundamental
frequency (F0) during structured and unstructured activities, building on
the concept that task type influences F0 values. A healthy male child (67
months) was evaluated (31 h, 4 days). During all activities, a National Center
for Voice and Speech voice dosimeter was worn to measure long-term
unstructured vocal usage. Four structured tasks from previous F0 studies
were also completed: (1) sustaining the vowel /ɑ/, (2) sustaining the vowel
/ɑ/ embedded in a phrase-end word, (3) repeating a sentence, and (4) counting
from one to ten. Mean F0 during vocal tasks (257 Hz), as measured by
the dosimeter and acoustic analysis of microphone data, matched the literature’s
average results for the child’s age. However, the child’s mean F0 during
unstructured activities was significantly higher (376 Hz). The mode and
median of the vocal tasks were respectively 260 and 259 Hz, while the dosimeter’s
mode and median were 290 and 355 Hz. Results suggest that children
produce significantly different voice patterns during clinical observations
than in routine activities. Further, the long-term F0 distribution is not
normal, making the statistical mean an invalid measure for such distributions. F0 mode and
median are suggested as two replacement parameters to convey basic information
about F0 usage.
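The point about non-normal distributions can be illustrated with a small sketch that compares the mean, median, and mode of a right-skewed set of F0 values. The sample values are invented for illustration and are not the study's data; the 10 Hz bin width used for the mode is likewise an assumption.

```python
import statistics


def summarize_f0(f0_hz, bin_width=10):
    """Return mean, median, and binned mode of F0 samples in Hz.
    The mode is taken over bins of `bin_width` Hz (lower bin edge),
    since raw continuous values rarely repeat exactly."""
    binned = [bin_width * int(f // bin_width) for f in f0_hz]
    return {
        "mean": statistics.mean(f0_hz),
        "median": statistics.median(f0_hz),
        "mode": statistics.mode(binned),
    }


# Hypothetical right-skewed sample: mostly mid-range values plus a few
# high-pitched vocalizations that pull the mean upward.
sample_f0 = [250, 255, 258, 262, 270, 300, 420, 480, 510]
summary = summarize_f0(sample_f0)
```

In a sample like this the mean sits well above both the median and the mode, which is why the latter two describe typical usage better.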
2pSC19. Effects of acoustic cue manipulations on emotional prosody
recognition. Chinar Dara and Marc Pell (School of Commun. Sci. and
Disord., McGill Univ., 1266 Pine West, Montreal, QC H3G 1A8, Canada,
chinar.dara@mail.mcgill.ca)
Studies on emotion recognition from prosody have largely focused on
the role and effectiveness of isolated acoustic parameters, and less is known
about how information from these cues is perceived and combined to infer
emotional meaning. To better understand how acoustic cues influence recognition
of discrete emotions from voice, this study investigated how listeners
perceptually combine information from two critical acoustic cues, pitch
and speech rate, to identify emotions. For all the utterances, pitch and
speech rate measures of the whole utterance were independently manipulated
by factors of 1.25 (+25%) and 0.75 (−25%). To examine the influence
of one cue with reference to the other cue, the three manipulations of pitch
(+25%, 0%, and −25%) were crossed with the three manipulations of
speech rate (+25%, 0%, and −25%). Pseudoutterances spoken in five emotional
tones (happy, sad, angry, fear, and disgust) and neutral that had undergone
acoustic cue manipulations were presented to 15 male and 15 female
participants for an emotion identification task. Results indicated that
both pitch and speech rate are important acoustic parameters for identifying
emotions and, more critically, that it is the relative weight of each cue which
seems to contribute significantly to categorizing happy, sad, fear, and
neutral.
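The crossed design described above yields nine conditions per utterance. A minimal sketch of how such a condition grid could be enumerated follows; the function name, the baseline values, and the use of syllables per second as the rate unit are assumptions for illustration only.

```python
from itertools import product

# Scaling factors stated in the abstract: -25%, unmodified, +25%.
FACTORS = {"-25%": 0.75, "0%": 1.0, "+25%": 1.25}


def crossed_conditions(base_f0_hz, base_rate):
    """Cross the three pitch manipulations with the three speech rate
    manipulations and return one dict per resulting condition."""
    conditions = []
    for (pitch_label, pitch_factor), (rate_label, rate_factor) in product(
        FACTORS.items(), repeat=2
    ):
        conditions.append({
            "pitch": pitch_label,
            "rate": rate_label,
            "f0_hz": base_f0_hz * pitch_factor,
            "speech_rate": base_rate * rate_factor,
        })
    return conditions


# Hypothetical baseline utterance: mean f0 of 200 Hz, 4 syllables/s.
grid = crossed_conditions(base_f0_hz=200.0, base_rate=4.0)
```

The grid includes the fully unmodified condition (0% pitch, 0% rate), which serves as the reference point against which the other eight are compared.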
2pSC20. Perception of emphasis in urban Jordanian Arabic. Allard
Jongman, Sonja Combest, Wendy Herd, and Mohammad Al-Masri
(Linguist. Dept., Univ. of Kansas, 1541 Lilac Ln., Lawrence, KS 66044,
jongman@ku.edu)
Previous acoustic analyses of minimal pairs of emphatic versus plain
CVC stimuli showed that (1) emphatic consonants have a lower spectral
mean than their plain counterparts and (2) vowels surrounding emphatic
consonants are characterized by a higher F1, lower F2, and higher F3 than
2497 J. Acoust. Soc. Am., Vol. 124, No. 4, Pt. 2, October 2008 156th Meeting: Acoustical Society of America