
Teza doctorat (pdf) - Universitatea Tehnică


WSEAS TRANSACTIONS ON COMMUNICATIONS Ovidiu Buza, Gavril Toderean

A segment is assumed unvoiced if the distance $D_i$ between two adjacent zeros is smaller than a threshold $U$:

$D_i < U, \quad i = s, \ldots, s+n$ (4)
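As a minimal sketch of how condition (4) might be checked, the Python fragment below locates the zeros of a sampled signal and collects the runs in which every distance between adjacent zeros stays below the threshold. It reads (4) as the strict inequality D_i < U, following the sentence that introduces it. The function names `zero_crossings` and `unvoiced_runs`, and the choice to measure distances in samples, are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple


def zero_crossings(signal: List[float]) -> List[int]:
    """Return indices k where the sign changes between sample k-1 and k."""
    return [k for k in range(1, len(signal))
            if (signal[k - 1] < 0) != (signal[k] < 0)]


def unvoiced_runs(zeros: List[int], threshold: int) -> List[Tuple[int, int]]:
    """Return (start, end) sample spans in which every distance D_i
    between adjacent zeros is smaller than the threshold U."""
    runs: List[Tuple[int, int]] = []
    start = None  # index into `zeros` where the current run began
    for i in range(len(zeros) - 1):
        distance = zeros[i + 1] - zeros[i]  # D_i, in samples
        if distance < threshold:
            if start is None:
                start = i
        elif start is not None:
            # The run ends at the last zero that still satisfied D_i < U.
            runs.append((zeros[start], zeros[i]))
            start = None
    if start is not None:
        runs.append((zeros[start], zeros[-1]))
    return runs


zeros = [0, 2, 4, 6, 30, 60, 62, 64]
candidate_regions = unvoiced_runs(zeros, threshold=5)
```

With these hypothetical zero positions, the closely spaced zeros at the start and at the end form two candidate unvoiced spans, while the widely spaced zeros in between are left for the other conditions to classify.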

Transient segments are also defined; they consist of regions for which conditions (2), (3) and (4) are not satisfied.

After the first application of the above algorithm, a large set of regions is created. Whereas voiced regions are well determined, the unvoiced ones are broken up by intercalated silence regions. This happens because unvoiced consonants have low amplitude, so they can break into many silence/unvoiced subregions.

Transient segments can also appear inside an unvoiced segment because the signal bounces above the zero line.

Figure 7 shows such an example, in which the numbered regions are unvoiced, the unnumbered single-line regions are silence, and the double-line regions are transient.

All these regions are packed together in the second pass of the algorithm, so the result is a single unvoiced region, as can be seen in Figure 8.
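The second pass can be pictured with the following minimal sketch. The paper does not give the merging rule, so the rule used here is an assumption: starting from an unvoiced region, any following silence or transient regions are absorbed as long as another unvoiced region follows before the next voiced one. The `Region` tuple layout, the label strings and the function name `compact_regions` are likewise hypothetical.

```python
from typing import List, Tuple

# (start_sample, end_sample, label); layout assumed for illustration.
Region = Tuple[int, int, str]

MERGEABLE = {"unvoiced", "silence", "transient"}


def compact_regions(regions: List[Region]) -> List[Region]:
    """Merge each run that starts and ends with an unvoiced region and
    contains only unvoiced, silence or transient regions in between."""
    result: List[Region] = []
    i = 0
    while i < len(regions):
        start, _, label = regions[i]
        if label != "unvoiced":
            result.append(regions[i])
            i += 1
            continue
        last = i  # index of the last unvoiced region in the run
        j = i + 1
        while j < len(regions) and regions[j][2] in MERGEABLE:
            if regions[j][2] == "unvoiced":
                last = j
            j += 1
        result.append((start, regions[last][1], "unvoiced"))
        i = last + 1
    return result


first_pass = [
    (0, 100, "voiced"),
    (100, 110, "unvoiced"),
    (110, 115, "silence"),
    (115, 130, "unvoiced"),
    (130, 133, "transient"),
    (133, 150, "unvoiced"),
    (150, 200, "silence"),
    (200, 300, "voiced"),
]
second_pass = compact_regions(first_pass)
```

In this sketch the trailing silence after the last unvoiced fragment is kept as a separate region. A practical version would probably also cap the length of the silence it is willing to absorb, so that a genuine pause between two unvoiced consonants is not swallowed.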

Fig.7. Determining regions for an unvoiced segment of speech<br />

After segmentation, voiced and unvoiced segments are coupled according to the syllable chain used in the vocal database construction process. The acoustic units are labelled and stored in the database. Each region boundary can be viewed with a special application and, if necessary, adjusted.

7 Vocal Database Construction

The vocal database includes a subset of Romanian syllables. The acoustic units were extracted from male speech and normalized in pitch and amplitude.

The vocal database of recorded syllables has a tree data structure. Each node in the tree corresponds to a syllable characteristic, and each leaf represents the corresponding syllable.

Fig.8. Compacting regions of the above segment

Units were inserted into the database according to the following classification:
- by syllable length: two-, three- or four-character syllables (denoted S2, S3 and S4), as well as single phonemes;
- by position inside the word: initial or median syllables (Med) and final syllables (Fin);
- by accentuation: stressed, i.e. accentuated (A), or normal (N) syllables.

This classification has the advantage of reducing the time needed to match phonetic units with acoustic units.

The organization of the vocal database is shown in Figure 9. Level-one nodes indicate syllable length, level-two nodes indicate median or final position, and level-three nodes indicate accentuated or normal syllables.
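The three-level organization described above could be modelled, as a rough sketch, with nested dictionaries keyed by length, position and accentuation. The labels S2, S3, S4, Med, Fin, A and N come from the text. The class name `VocalDatabase`, the `PH` label for single phonemes, the use of the character count as syllable length, and the file names standing in for stored acoustic units are all assumptions made for illustration.

```python
from typing import Dict, Optional, Tuple


class VocalDatabase:
    """Tree of acoustic units: length -> position -> accent -> syllable."""

    def __init__(self) -> None:
        self.root: Dict[str, Dict[str, Dict[str, Dict[str, str]]]] = {}

    @staticmethod
    def path(syllable: str, final: bool, stressed: bool) -> Tuple[str, str, str]:
        """Compute the three node labels leading to a syllable's leaf."""
        n = len(syllable)
        if not 1 <= n <= 4:
            raise ValueError("only units of 1 to 4 characters are handled")
        length = "PH" if n == 1 else f"S{n}"  # "PH" is a hypothetical label
        position = "Fin" if final else "Med"
        accent = "A" if stressed else "N"
        return length, position, accent

    def insert(self, syllable: str, final: bool, stressed: bool, unit: str) -> None:
        length, position, accent = self.path(syllable, final, stressed)
        leaves = (self.root.setdefault(length, {})
                           .setdefault(position, {})
                           .setdefault(accent, {}))
        leaves[syllable] = unit

    def lookup(self, syllable: str, final: bool, stressed: bool) -> Optional[str]:
        length, position, accent = self.path(syllable, final, stressed)
        return (self.root.get(length, {})
                         .get(position, {})
                         .get(accent, {})
                         .get(syllable))


db = VocalDatabase()
db.insert("ma", final=False, stressed=True, unit="ma_med_a.wav")
db.insert("tre", final=True, stressed=False, unit="tre_fin_n.wav")
found = db.lookup("ma", final=False, stressed=True)
missing = db.lookup("ma", final=True, stressed=True)
```

Because the three labels are computed directly from the phonetic unit, a lookup descends straight to one small group of leaves instead of scanning every stored syllable, which is consistent with the reduced matching time the classification is said to offer.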
