Doklady BGUIR

ARCHITECTURE OF THE MULTIVOICE TEXT-TO-SPEECH SYSTEM

Abstract

An architecture for a multivoice text-to-speech (TTS) synthesis system based on a voice conversion framework is proposed. Such a system can be tuned to a specific speaker without additional cost at the training phase, relying only on the single-speaker database already available in the TTS system. A structural scheme for this type of speech synthesizer is presented, together with a description of the functionality of its main blocks. Its distinguishing characteristics are a synergetic approach to the architecture and a text-independent mode in the training phase.

About the Authors

V. A. Zakharyeu
Belarusian State University of Informatics and Radioelectronics
Belarus


A. A. Petrovsky
Belarusian State University of Informatics and Radioelectronics
Belarus


For citations:


Zakharyeu V.A., Petrovsky A.A. ARCHITECTURE OF THE MULTIVOICE TEXT-TO-SPEECH SYSTEM. Doklady BGUIR. 2013;(7):57-63. (In Russ.)

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1729-7648 (Print)
ISSN 2708-0382 (Online)