Stochastic and syntactic techniques for predicting phrase breaks

Ian Read, Stephen J. Cox

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)

Abstract

Determining the position of breaks in a sentence is a key task for a text-to-speech system. A synthesized sentence containing incorrect breaks at best requires increased listening effort, and at worst, may have lower intelligibility and different semantics from a correctly phrased sentence. In addition, the position of breaks must be known before other components of the sentence’s prosodic structure can be determined. We consider here some methods for phrase break prediction in which the whole sentence is analysed, in contrast to most previous work which has focused on analysing an area around an individual juncture. One of the main features we use is part-of-speech tags. First, we report an algorithm that reduces the number of tags in the tagset whilst improving break prediction accuracy. We then describe three approaches to break prediction: by analogy, in which we find the best-matching sentence in our training data to the unseen sentence; by phrase modelling, in which we build stochastic models of phrases and use these, together with a “phrase grammar”, to segment the unseen sentence; and finally, using features derived from a syntactic parse of the sentence. All techniques achieve well above our baseline performance, which used punctuation symbols to determine break positions, and performance increased with each successive technique. Our best result, obtained on the MARSEC corpus and using a combination of parse tree derived features and a local feature, gave an F score of 81.6%, which we believe to be the highest published on this dataset.
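The punctuation baseline and the F score mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify which punctuation symbols the baseline used or exactly how the F score was computed, so the punctuation set `BREAK_PUNCT`, the function names `punctuation_baseline` and `f_score`, the example sentence, and its reference breaks are all assumptions made for illustration. The score shown is the standard balanced F measure (harmonic mean of precision and recall) over junctures, which may differ in detail from the authors' evaluation.

```python
# Assumed set of break-marking punctuation; the paper's actual set is not
# given in the abstract.
BREAK_PUNCT = {",", ".", ";", ":", "?", "!"}


def punctuation_baseline(tokens):
    """Predict a break at the juncture after each token that ends in punctuation.

    Returns one boolean per token (True = break predicted after that token).
    """
    return [tok[-1] in BREAK_PUNCT for tok in tokens]


def f_score(predicted, reference):
    """Balanced F measure over junctures, treating 'break' as the positive class."""
    tp = sum(1 for p, r in zip(predicted, reference) if p and r)
    fp = sum(1 for p, r in zip(predicted, reference) if p and not r)
    fn = sum(1 for p, r in zip(predicted, reference) if r and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: the reference has a break after "ended" that no
# punctuation mark signals, so the baseline misses it.
sentence = ("After the long meeting ended the committee, "
            "tired and hungry, went home.")
tokens = sentence.split()
predicted = punctuation_baseline(tokens)
reference_break_words = {"ended", "committee,", "hungry,", "home."}
reference = [tok in reference_break_words for tok in tokens]

print(f"F score: {f_score(predicted, reference):.3f}")
```

In this made-up example the baseline finds every punctuated break but misses the unpunctuated one, which is the kind of error that whole-sentence methods such as those described in the abstract aim to reduce.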
Original language: English
Pages (from-to): 519-542
Number of pages: 24
Journal: Computer Speech and Language
Volume: 21
Issue number: 3
Publication status: Published - 2007
