The following information was submitted:
Transactions: WSEAS TRANSACTIONS ON SIGNAL PROCESSING
Transactions ID Number: 27-724
Full Name: Andrej Zgank
Position: Doctor (Researcher)
Age: ON
Sex: Male
Address: Smetanova ul. 17, 2000 Maribor
Country: SLOVENIA
Tel: +386 2 220 7206
Tel prefix: /
Fax: +386 2 220 7272
E-mail address: andrej.zgank@uni-mb.si
Other E-mails:
Title of the Paper: Slovenian Spontaneous Speech Recognition and Acoustic Modeling of Filled Pauses and Onomatopoeas
Authors as they appear in the Paper: Andrej Zgank, Tomaz Rotovnik, Mirjam Sepesy Maucec
Email addresses of all the authors: andrej.zgank@uni-mb.si, tomaz.rotovnik@uni-mb.si, mirjam.sepesy@uni-mb.si
Number of paper pages: 10
Abstract: This paper focuses on acoustic modeling for spontaneous speech recognition, which remains a very challenging task for the speech technology research community. The attributes of spontaneous speech can heavily degrade a speech recognizer's accuracy and performance. Filled pauses and onomatopoeias are one such important attribute of spontaneous speech and can considerably reduce accuracy. Although filled pauses do not carry any semantic information, they are still very important from the modeling perspective. A novel acoustic modeling approach is proposed in this paper, in which the filled pauses are modeled using phonetic broad classes that correspond to their acoustic-phonetic properties. The phonetic broad classes are language dependent and can be defined by an expert or in a data-driven way. The new filled pause modeling approach is compared with three other implicit filled pause modeling methods. All experiments were carried out using a speech recognition system based on context-dependent Hidden Markov Models. For training and evaluation, the Slovenian BNSI Broadcast News speech and text database was used. The database contains manually transcribed recordings of TV news shows. The proposed acoustic modeling approach was evaluated on a set of spontaneous speech. The overall best filled pause acoustic modeling approach improved the speech recognizer's word accuracy by 5.70% relative to the baseline system, without influencing the recognition time.
Keywords: Speech recognition, Acoustic modeling, Filled pauses, Onomatopoeias, Slovenian spontaneous speech, Broadcast news, HMM
EXTENSION of the file: .pdf
Special (Invited) Session: Modeling Filled Pauses for Spontaneous Speech Recognition Applications
Organizer of the Session: 590-092
How Did you learn about congress:
IP ADDRESS: 164.8.22.63