Teza doctorat (pdf) - Universitatea Tehnică
Teza doctorat (pdf) - Universitatea Tehnică Teza doctorat (pdf) - Universitatea Tehnică
O. Buza, G. Toderean, J. Domokos, A. Zs. Bodo, Construction of a Syllable-Based Text-To-Speech System for Romanian threshold U: A segment is assumed unvoiced if distance between two adjacent zeros is smaller than a Di U , i = s,… , s+n (8) Transient segments are also defined and they consist of regions for which conditions (6), (7) and (8) are not accomplished. In these rel ations, V and U thresholds have been chosen according to statistical median frequency for vowels and fricative consonants. 3.2. COMPACTING REGIONS After first appliance of above algorithm, a large set of regions will be created. Since voiced regions are well determined, the unvoiced are broken by intercalated silence regions. This situation appears because unvoiced consonants have low amplitude so they can break in many silence/unvoiced subregions. Transient segments can also appear inside the unvoiced segment because of signal bouncing above zero line. Figure 4 shows such an example, in which numbered regions are unvoiced, simple-line and unnumbered are silence regions, and double-line are transient regions. All these regions will be packed together in the second pass of the algorithm, so the result will be a single unvoiced region – as one can see in figure no. 5. After segmentation, voiced and unvoiced segments are coupled according to the syllable chain that is used in vocal database construction process. Appropriate acoustic units will be detected, labeled and stored in vocal database. Figure 4. Determining regions for an unvoiced segment of speech Figure 5. Compacting regions of above segment 306
O. Buza, G. Toderean, J. Domokos, A. Zs. Bodo, Construction of a Syllable-Based Text-To-Speech System for Romanian 3.3 REGIONS CLASSIFICATION The SUV segmentation process presented above divides the signal in four basic categories: Silence, Voiced, Unvoiced and Transition. Each category will be further classified in distinct types of regions, totally 10 classes: silence, unvoiced-silence, voiced, voiced glide, voiced plosive, voiced jump, transition, irregular (rugged), high density and unvoiced consonant (figure 6). The aim of this classification is to associate Romanian phonemes with signal regions having some particular traits. CLASS Figure 6. The four categories and ten classes of regions These ten classes of regions are shortly presented as follows: 1. Silence Region (S) Represents a region without speech, where signal amplitude is very low. 2. Unvoiced Silence (US) This region is a combination of silence S and unvoiced consonant C region. Detecting of this region as separate class was necessary because of fricative consonants that can be uttered at low amplitude, and so they could be found in these US regions. 3. Voiced Region (V) The voiced region contains all vowels from Romanian: /A/, /E/, /I/, /O/, /U/, /Ǎ/, /Î/, glide /L/, nasals /M/, /N/, and some voiced plosive consonants as /P/, /B/, /D/. 4. Voiced Glide (VG) This is a region corresponding to a voiced discontinuity and is associated with a minimum of energy. This situation may occur when a glide consonant like /R/ splits a sequence of vowels. CATEGORY 5. Voiced Plosive (VP) Region S V T U S US V VG VP VJ T IR HD C 307
- Page 274 and 275: Baza de date vocală Cap. 7. Proiec
- Page 276 and 277: 1 Procesare Separator Procesare Cuv
- Page 278 and 279: Cap. 7. Proiectarea sistemului de s
- Page 280 and 281: Cap. 7. Proiectarea sistemului de s
- Page 282 and 283: Cap. 7. Proiectarea sistemului de s
- Page 284 and 285: 8. Concluzii finale Cercetările ef
- Page 286 and 287: 268 Cap. 8. Concluzii finale 11. A
- Page 288 and 289: 270 Cap. 8. Concluzii finale percep
- Page 290 and 291: 272 Cap. 8. Concluzii finale b) pen
- Page 292 and 293: 274 Cap. 8. Concluzii finale - nive
- Page 294 and 295: Bibliografie [And88] André-Obrecht
- Page 296 and 297: 278 Bibliografie Quality and Testin
- Page 298 and 299: 280 Bibliografie [Giu06] Giurgiu M.
- Page 300 and 301: 282 Bibliografie [Nag05] Nageshwara
- Page 302 and 303: 284 Bibliografie Research Institute
- Page 304 and 305: Anexa 2. Silabele din setul S2 dup
- Page 306 and 307: Anexa 2. Silabele din setul S2 dup
- Page 308 and 309: Anexa 3. Silabe din setul S3 după
- Page 310 and 311: Anexa 4. Silabe din setul S4 după
- Page 312 and 313: Anexa 4. Silabe din setul S4 după
- Page 314 and 315: 296 Anexa 5. Activitatea ştiinţif
- Page 316 and 317: 298 Anexa 5. Activitatea ştiinţif
- Page 318 and 319: Anexa 6. Lucrări ştiinţifice ale
- Page 320 and 321: O. Buza, G. Toderean, J. Domokos, A
- Page 322 and 323: O. Buza, G. Toderean, J. Domokos, A
- Page 326 and 327: O. Buza, G. Toderean, J. Domokos, A
- Page 328 and 329: O. Buza, G. Toderean, J. Domokos, A
- Page 330 and 331: Level 3 O. Buza, G. Toderean, J. Do
- Page 332 and 333: O. Buza, G. Toderean, J. Domokos, A
- Page 334 and 335: O. Buza, G. Toderean, J. Domokos, A
- Page 336 and 337: O. Buza, G. Toderean, J. Domokos, A
- Page 338 and 339: About Construction of a Syllable-Ba
- Page 340 and 341: WSEAS TRANSACTIONS ON COMMUNICATION
- Page 342 and 343: WSEAS TRANSACTIONS ON COMMUNICATION
- Page 344 and 345: WSEAS TRANSACTIONS ON COMMUNICATION
O. Buza, G. Toderean, J. Domokos, A. Zs. Bodo, Construction of a Syllable-Based Text-To-Speech System for Romanian<br />
3.3 REGIONS CLASSIFICATION<br />
The SUV segmentation process presented above divides the signal in four basic categories:<br />
Silence, Voiced, Unvoiced and Transition. Each category will be further classified in distinct types<br />
of regions, totally 10 classes: silence, unvoiced-silence, voiced, voiced glide, voiced plosive,<br />
voiced jump, transition, irregular (rugged), high density and unvoiced consonant (figure 6). The<br />
aim of this classification is to associate Romanian phonemes with signal regions having some<br />
particular traits.<br />
CLASS<br />
Figure 6. The four categories and ten classes of regions<br />
These ten classes of regions are shortly presented as follows:<br />
1. Silence Region (S)<br />
Represents a region without speech, where signal amplitude is very low.<br />
2. Unvoiced Silence (US)<br />
This region is a combination of silence S and unvoiced consonant C region. Detecting of<br />
this region as separate class was necessary because of fricative consonants that can be<br />
uttered at low amplitude, and so they could be found in these US regions.<br />
3. Voiced Region (V)<br />
The voiced region contains all vowels from Romanian: /A/, /E/, /I/, /O/, /U/, /Ǎ/, /Î/, glide<br />
/L/, nasals /M/, /N/, and some voiced plosive consonants as /P/, /B/, /D/.<br />
4. Voiced Glide (VG)<br />
This is a region corresponding to a voiced discontinuity and is associated with a minimum<br />
of energy. This situation may occur when a glide consonant like /R/ splits a sequence of<br />
vowels.<br />
CATEGORY<br />
5. Voiced Plosive (VP)<br />
Region<br />
S V T U<br />
S US V VG VP VJ T IR HD C<br />
307