Tuesday afternoon, 11 November - The Acoustical Society of America

…aries on a transcript. Intertranscriber agreement rates across subsets of 17–40 subjects are significantly above chance based on Fleiss' statistic, indicating that listeners' perception of prosody is reliable, with higher agreement rates for boundary perception than for prominence. Prosody perception varies across listeners in both corpora and across speakers (WMD), where perceived prosody varies for the same utterance produced by different speakers. Acoustic measures from stressed vowels (Buckeye: duration, intensity, F1, F2) and articulatory kinematic measures (WMD) are correlated with the perceived prosodic features of the word. [Work supported by NSF.]
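
For readers who want the agreement computation made concrete, here is a minimal sketch of Fleiss' statistic (kappa) for categorical prosody labels, assuming a fixed number of transcribers per item; the counts are invented for illustration and are not the study's data.

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa: chance-corrected agreement among m raters
    assigning n items to k categories.
    counts: (n_items, n_categories) array; row i holds how many
    raters put item i in each category (rows sum to m)."""
    n, k = counts.shape
    m = counts.sum(axis=1)[0]                 # raters per item
    p_j = counts.sum(axis=0) / (n * m)        # overall category proportions
    P_i = (np.square(counts).sum(axis=1) - m) / (m * (m - 1))
    P_bar = P_i.mean()                        # observed agreement
    P_e = np.square(p_j).sum()                # agreement expected by chance
    return (P_bar - P_e) / (1 - P_e)

# Hypothetical counts: 6 words, 20 transcribers, boundary vs. no boundary
counts = np.array([[18, 2], [15, 5], [3, 17], [20, 0], [11, 9], [19, 1]])
print(round(fleiss_kappa(counts), 3))
```

A kappa near 0 means agreement at chance level; values meaningfully above 0, as reported here, indicate that the labels capture something listeners share.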

2pSC15. Perception of contrastive meaning through the L+H* L-H% contour. Heeyeon Y. Dennison, Amy J. Schafer, and Victoria B. Anderson (Dept. of Linguist., Univ. of Hawaii, 1890 East-West Rd., Honolulu, HI 96822, linguist@hawaii.edu)

This study establishes empirical evidence regarding listeners' perception of the contrastive tune L+H* L-H% (e.g., Lee et al. 2007). Eighteen native English speakers heard three types of test sentences: (1) contrastive, "The mailbox was(L+H*) full(L-H%)"; (2) positive neutral, "The mailbox(H*) was full(H* L-L%)"; and (3) negated neutral, "The mailbox(H*) was not(H*) full(H* L-L%)." The participants first scored them for naturalness, and then typed continuation sentences based on the perceived meaning. Three other native English speakers independently coded the continuations to evaluate participants' interpretations of the test sentences. The results clearly demonstrated that the L+H* L-H% tune generated contrastive meanings (e.g., "…but the mailman took the mail and now it is empty") significantly more often than both the positive and negated neutral counterparts. Moreover, sentences presented with the contrastive tune were perceived as natural utterances. High coder agreement indicated a reliable function of the contrastive tune, conforming to the existing literature based on intuitive examples (e.g., Lee 1999). Interestingly, however, the contrastive tune produced the expected contrastive meaning in only about 60% of trials, versus less than 10% contrastive continuations for the other contours. This finding shows that the interpretation of the L+H* L-H% contour is more complex than previously suggested.
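
As a worked illustration of the size of that gap, one could test whether the contrastive-continuation rate for the L+H* L-H% tune exceeds the under-10% rate of the other contours. The trial counts below are hypothetical stand-ins, since the abstract reports only percentages.

```python
from scipy.stats import binomtest

# Hypothetical counts consistent with "about 60%" contrastive
# continuations, tested against the <10% rate of the other contours.
result = binomtest(k=65, n=108, p=0.10, alternative="greater")
print(f"observed rate = {65/108:.2f}, p = {result.pvalue:.2e}")
```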

2pSC16. Order of presentation asymmetry in intonational contour discrimination in English. Hyekyung Hwang (Dept. of Linguist., McGill Univ., 1085 Dr. Penfield Ave., Montreal, PQ H3A 1A7, Canada, hye.hwang@mail.mcgill.ca), Amy J. Schafer, and Victoria B. Anderson (Univ. of Hawaii, Honolulu, HI 96822)

In the work of Hwang et al. (2007), native English speakers showed overall poor accuracy in distinguishing initially rising versus level (e.g., L* L* H- H* L-L% vs L* L* L- H* L-L%) or initially falling versus level (e.g., H* H* L- H* L-L% vs H* H* H- H* L-L%) contour contrasts on English phrases in an AX discrimination task. Results not reported in that paper showed that discrimination was easier when the more complex F0 contour occurred second than when it occurred first. Several order-of-presentation effects in the perception of intonation have been reported (e.g., L. Morton 1997; S. Lintfert 2003; Cummins et al. 2006), but no satisfying account has been provided. This study investigated these asymmetries more systematically. The order effect was significant for falling-versus-level contrast pairs: pairs with the more complex F0 contour last were discriminated more easily than the reverse order. Rising versus level contrasts showed a similar tendency. The results thus extend intonational discrimination asymmetries to these additional contours. They suggest that the cause of the asymmetries may depend more on F0 complexity than on F0 peak.
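
Sensitivity in AX discrimination tasks of this kind is commonly summarized with signal-detection d' computed separately for each order of presentation. A minimal sketch follows, with invented response counts, since the abstract reports no raw data.

```python
from scipy.stats import norm

def d_prime(hits, misses, fas, crs):
    """d' for a same-different task, with a small correction
    so perfect hit or false-alarm rates stay finite."""
    h = (hits + 0.5) / (hits + misses + 1)   # smoothed hit rate
    f = (fas + 0.5) / (fas + crs + 1)        # smoothed false-alarm rate
    return norm.ppf(h) - norm.ppf(f)

# Hypothetical counts per presentation order
print(d_prime(42, 8, 12, 38))   # complex contour second
print(d_prime(30, 20, 12, 38))  # complex contour first
```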

2pSC17. Alternatives to f0 turning points in American English intonation. Jonathan Barnes (Dept. of Romance Studies, Boston Univ., 621 Commonwealth Ave., Boston, MA 02215, jabarnes@bu.edu), Nanette Veilleux (Dept. of Comput. Sci., Simmons College, Boston, MA 02115, veilleux@simmons.edu), Alejna Brugos (Boston Univ., Boston, MA 02215, abrugos@bu.edu), and Stefanie Shattuck-Hufnagel (Res. Lab. of Electronics, MIT, Cambridge, MA 02139, stef@speech.mit.edu)

Since the inception of the autosegmental-metrical approach to intonation (Bruce 1977, Pierrehumbert 1980, Ladd 1996), the location and scaling of f0 turning points have been used to characterize phonologically distinct f0 contours in various languages, including American English. This approach is undermined, however, by the difficulty listeners experience in perceiving differences in turning point location. Numerous studies have demonstrated either listener insensitivity to changes in turning point location or the capacity for other aspects of contour "shape" to override turning-point alignment in contour identification (Chen 2003, D'Imperio 2000, Niebuhr 2008). Even labelers with access to visual representations of the f0 encounter similar challenges. By contrast, a family of related measurements using area under the f0 curve to quantify differences in contour shape appears more robust. For example, a measure of the synchronization of the center of gravity of the accentual rise with the boundaries of the accented vowel yields 93.9% correct classification in a logistic regression model on a data set of 115 labeled utterances differing in pitch accent type (L*+H vs L+H* in ToBI terminology). This classification proceeds entirely without explicit reference to the turning points (i.e., beginning of rise, peak) traditionally used to characterize this distinction.
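
The area-based measure lends itself to a compact formulation. The sketch below computes a time-weighted center of gravity (CoG) for an f0 rise and expresses it relative to the accented vowel's boundaries; this is one plausible reading of the measure, not the authors' exact implementation.

```python
import numpy as np

def f0_center_of_gravity(t, f0):
    """Time point at which the 'mass' of the f0 curve balances:
    each sample time weighted by its f0 value."""
    return np.sum(t * f0) / np.sum(f0)

def cog_in_vowel(t, f0, vowel_on, vowel_off):
    """CoG normalized to the accented vowel: 0 = vowel onset,
    1 = vowel offset; values outside [0, 1] fall outside the vowel."""
    cog = f0_center_of_gravity(t, f0)
    return (cog - vowel_on) / (vowel_off - vowel_on)

# Toy rise sampled at 10-ms steps; vowel spans 0.05-0.20 s
t = np.arange(0.0, 0.30, 0.01)
f0 = np.linspace(180, 260, t.size)   # steadily rising contour
print(cog_in_vowel(t, f0, 0.05, 0.20))
```

A scalar of this kind can then serve as the predictor in the logistic regression classifier the abstract describes.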

2pSC18. Comparison of a child's fundamental frequencies during structured and unstructured activities: A case study. Eric Hunter (Natl. Ctr. for Voice and Speech, 1101 13th St., Denver, CO 80126, eric.hunter@ncvs2.org)

This case study investigates the difference between children's fundamental frequency (F0) during structured and unstructured activities, building on the concept that task type influences F0 values. A healthy male child (67 months) was evaluated (31 h, 4 days). During all activities, a National Center for Voice and Speech voice dosimeter was worn to measure long-term unstructured vocal usage. Four structured tasks from previous F0 studies were also completed: (1) sustaining the vowel /ɑ/, (2) sustaining the vowel /ɑ/ embedded in a phrase-end word, (3) repeating a sentence, and (4) counting from one to ten. Mean F0 during vocal tasks (257 Hz), as measured by the dosimeter and acoustic analysis of microphone data, matched the literature's average results for the child's age. However, the child's mean F0 during unstructured activities was significantly higher (376 Hz). The mode and median of the vocal tasks were respectively 260 and 259 Hz, while the dosimeter's mode and median were 290 and 355 Hz. Results suggest that children produce significantly different voice patterns during clinical observations than in routine activities. Further, the long-term F0 distribution is not normal, making the statistical mean an invalid measure for such data. F0 mode and median are suggested as two replacement parameters to convey basic information about F0 usage.
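
The point about non-normal long-term F0 can be illustrated with simulated data: on a right-skewed distribution the mean sits well away from the mode and median the abstract recommends. The numbers below are synthetic, not the child's data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic right-skewed "long-term F0" sample (Hz)
f0 = rng.lognormal(mean=np.log(290), sigma=0.3, size=10_000)

hist, edges = np.histogram(f0, bins=60)
peak = np.argmax(hist)
mode = 0.5 * (edges[peak] + edges[peak + 1])  # histogram-based mode

print(f"mean   {f0.mean():6.1f} Hz")   # pulled up by the long tail
print(f"median {np.median(f0):6.1f} Hz")
print(f"mode   {mode:6.1f} Hz")
```

On such a distribution the mean exceeds the median, which in turn exceeds the mode, so reporting only the mean overstates typical F0.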

2pSC19. Effects of acoustic cue manipulations on emotional prosody recognition. Chinar Dara and Marc Pell (School of Commun. Sci. and Disord., McGill Univ., 1266 Pine West, Montreal, QC H3G 1A8, Canada, chinar.dara@mail.mcgill.ca)

Studies on emotion recognition from prosody have largely focused on the role and effectiveness of isolated acoustic parameters, and less is known about how information from these cues is perceived and combined to infer emotional meaning. To better understand how acoustic cues influence recognition of discrete emotions from voice, this study investigated how listeners perceptually combine information from two critical acoustic cues, pitch and speech rate, to identify emotions. For all the utterances, pitch and speech rate measures of the whole utterance were independently manipulated by factors of 1.25 (+25%) and 0.75 (-25%). To examine the influence of one cue with reference to the other, the three manipulations of pitch (+25%, 0%, and -25%) were crossed with the three manipulations of speech rate (+25%, 0%, and -25%). Pseudoutterances spoken in five emotional tones (happy, sad, angry, fear, and disgust) and neutral that had undergone acoustic cue manipulations were presented to 15 male and 15 female participants for an emotion identification task. Results indicated that both pitch and speech rate are important acoustic parameters for identifying emotions and, more critically, that it is the relative weight of each cue which seems to contribute significantly to categorizing happy, sad, fear, and neutral.
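
A rough sketch of how such a 3 x 3 stimulus set could be generated with off-the-shelf tools follows. The abstract does not name its resynthesis method, so librosa's phase-vocoder effects and the file name here are assumptions, not the study's procedure.

```python
import numpy as np
import librosa
import soundfile as sf

# Hypothetical input file; the study's actual resynthesis tool is unknown.
y, sr = librosa.load("pseudoutterance.wav", sr=None)

factors = [0.75, 1.00, 1.25]          # -25%, 0%, +25%
for pf in factors:                    # pitch factor
    for rf in factors:                # speech-rate factor (>1 = faster)
        n_steps = 12 * np.log2(pf)    # F0 ratio -> semitones
        out = librosa.effects.pitch_shift(y, sr=sr, n_steps=n_steps)
        out = librosa.effects.time_stretch(out, rate=rf)
        sf.write(f"stim_p{pf:.2f}_r{rf:.2f}.wav", out, sr)
```

Crossing the two factor lists reproduces the nine conditions of the design, with the (1.00, 1.00) cell serving as the unmodified baseline.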

2pSC20. Perception of emphasis in urban Jordanian Arabic. Allard Jongman, Sonja Combest, Wendy Herd, and Mohammad Al-Masri (Linguist. Dept., Univ. of Kansas, 1541 Lilac Ln., Lawrence, KS 66044, jongman@ku.edu)

Previous acoustic analyses of minimal pairs of emphatic versus plain CVC stimuli showed that (1) emphatic consonants have a lower spectral mean than their plain counterparts and (2) vowels surrounding emphatic consonants are characterized by a higher F1, lower F2, and higher F3 than …
