Modeling broad context for tone recognition with Conditional Random Fields

  • University of Washington

Research output: Contribution to journal › Conference article › peer-review

1 Scopus citations

Abstract

We propose a tone recognition approach that employs linear-chain Conditional Random Fields (CRFs) to model tone variation due to intonation effects. We implement three linear-chain CRFs that aim to model intonation effects at phrase-, sentence-, and story-level boundaries, where we show that standard recognition techniques degrade and common normalization approaches do not improve recognition. We show that all linear-chain CRFs outperform the baseline unigram model, and the biggest improvement is found in recognizing 3rd tones (4%) in overall accuracy. In particular, Phrase Bigram CRFs show a drastic 39% improvement in recognizing 3rd tones located at initial boundaries. This improvement shows that the position-specific modeling of initial tones in bigram CRFs captures the intonation effects better than the baseline unigram model.

Original language: English
Pages (from-to): 2289-2292
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - 2011
Event: 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011 - Florence, Italy
Duration: Aug 27 2011 → Aug 31 2011

Keywords

  • Broad context
  • Conditional random fields
  • Prosody
  • Tone recognition
