Predicting the location of phrase breaks within an utterance is an important task in text-to-speech synthesis, and can be done with reasonable accuracy using part-of-speech (POS) tags as features. However, it seems unlikely that the 40 or more different tags used by most taggers all contribute to this task, and in fact many may contribute noise. In this paper, we present an algorithm for reducing the standard Penn Treebank POS tag set for use in predicting phrase breaks. Using a best-first search approach, the algorithm considers possible groupings of tags, searching for the groupings that yield the highest overall performance. The reduced tag sets were evaluated using an n-gram model trained on POS sequences along with their associated junctures (break/non-break). The reduced tag set raised the model's performance on junctures correct from 90.38% to 92.43%, and reduced insertions from 2.89% to 1.83%.
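The search over tag groupings described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the merge-two-groups successor rule, the expansion budget, and the toy scoring function are all assumptions made for illustration. In the setting the abstract describes, the score would presumably come from training and evaluating the n-gram juncture model on each candidate grouping.

```python
import heapq
from itertools import combinations, count


def best_first_tag_grouping(tags, score, max_expansions=100):
    """Best-first search over partitions of a tag set.

    `score` is a caller-supplied (hypothetical) function mapping a
    partition, given as a frozenset of frozensets of tags, to a number
    where higher is better.
    """
    # Start with every tag in its own group.
    start = frozenset(frozenset([t]) for t in tags)
    best_state, best_score = start, score(start)
    tie = count()  # tie-breaker so the heap never compares partitions
    frontier = [(-best_score, next(tie), start)]
    seen = {start}
    for _step in range(max_expansions):
        if not frontier:
            break
        # Expand the highest-scoring unexpanded partition first.
        _neg, _order, state = heapq.heappop(frontier)
        for a, b in combinations(state, 2):
            # Successor: merge two groups into one.
            merged = (state - {a, b}) | {a | b}
            if merged in seen:
                continue
            seen.add(merged)
            s = score(merged)
            if s > best_score:
                best_state, best_score = merged, s
            heapq.heappush(frontier, (-s, next(tie), merged))
    return best_state, best_score


# Toy stand-in for model accuracy (an assumption, not from the paper):
# fewer groups is better, but mixing coarse word classes is penalised.
COARSE = {"NN": "N", "NNS": "N", "VB": "V", "VBD": "V", "DT": "D"}


def toy_score(partition):
    impure = sum(1 for g in partition if len({COARSE[t] for t in g}) > 1)
    return -len(partition) - 10 * impure


grouping, value = best_first_tag_grouping(list(COARSE), toy_score)
```

With the toy score, the search merges the two noun tags and the two verb tags and leaves the determiner alone. Replacing `toy_score` with a real evaluation of a phrase-break model would make each call far more expensive, which is why the sketch caps the number of expansions.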
|Number of pages||4|
|Publication status||Published - Oct 2004|
|Event||8th International Conference on Spoken Language Processing (Interspeech 2004) - Jeju Island, South Korea|
|Duration||4 Oct 2004 → 8 Oct 2004|