Automatic sentence boundary detection in conversational speech: a cross-lingual evaluation on English and Czech

Kolář, Jáchym; Liu, Yang

Full metadata record

DC field	Value	Language
dc.contributor.author	Kolář, Jáchym
dc.contributor.author	Liu, Yang
dc.date.accessioned	2016-01-08T06:54:22Z
dc.date.available	2016-01-08T06:54:22Z
dc.date.issued	2010
dc.identifier.citation	KOLÁŘ, Jáchym; LIU, Yang. Automatic sentence boundary detection in conversational speech: a cross-lingual evaluation on English and Czech. In: Acoustics, Speech and Signal Processing, 2010. ICASSP '10, 14-19 March 2010, Dallas, Texas, USA. Beijing: IEEE Press, 2010, p. 5258-5261. ISBN 978-1-4244-4296-6.	en
dc.identifier.isbn	978-1-4244-4296-6
dc.identifier.uri	http://www.kky.zcu.cz/cs/publications/JachymKolar_2010_AutomaticSentence
dc.identifier.uri	http://hdl.handle.net/11025/17174
dc.format	4 pp.	en
dc.format.mimetype	application/pdf
dc.language.iso	en	en
dc.publisher	IEEE Press	en
dc.rights	© Jáchym Kolář - Yang Liu	cs
dc.subject	porozumění mluvené řeči	cs
dc.subject	detekce hranice věty	cs
dc.subject	prozodie	cs
dc.subject	mechanické učení	cs
dc.title	Automatic sentence boundary detection in conversational speech: a cross-lingual evaluation on English and Czech	en
dc.type	článek	cs
dc.type	article	en
dc.rights.access	openAccess	en
dc.type.version	publishedVersion	en
dc.description.abstract-translated	Automatic sentence segmentation of speech is important for enriching speech recognition output and aiding downstream language processing. This paper focuses on automatic sentence segmentation of speech in two different languages -- English and Czech. For this task, we compare and combine three statistical models -- HMM, maximum entropy, and a boosting-based model, BoosTexter. All these approaches rely on both textual and prosodic information. We evaluate these methods on a corpus of multiparty meetings in English and on a corpus of broadcast conversations in Czech, using both manual and speech recognition transcripts. The experiments show that superior results are achieved when all three models are combined via posterior probability interpolation. We observe differences in model performance between English and Czech, as well as differences in feature usage in the prosodic models between the two languages. Overall, the analysis is important for porting sentence segmentation approaches from one language to another.	en
dc.subject.translated	spoken language understanding	en
dc.subject.translated	sentence boundary detection	en
dc.subject.translated	prosody	en
dc.subject.translated	machine learning	en
dc.type.status	Peer-reviewed	en
Appears in collections:	Články / Articles (KKY)

Files attached to this record:

File	Description	Size	Format
JachymKolar_2010_AutomaticSentence.pdf	Full text	59.03 kB	Adobe PDF

Use this identifier to cite or link to this record: http://hdl.handle.net/11025/17174

All records in DSpace are protected by copyright, all rights reserved.
