CREATION OF A SYSTEM (PARADIS) FOR THE SYNTHESIS OF ARABIC SPEECH FROM TEXT
Abstract
This paper presents the design and implementation of a text-to-speech synthesis system for the Arabic language.
We describe the general architecture of our PARADIS system, which concatenates di-syllables and uses TD-PSOLA as its synthesis method; grapheme-to-phoneme transcription is performed with compiled rules.
After examining the problems of phonetic transcription, we explain why the di-syllable was chosen as the concatenation unit of the synthesizer and how it improves the quality of the synthesized voice: di-syllables reduce temporal discontinuities at concatenation points. A fundamental component of our TTS system is the TD-PSOLA synthesizer. It is known for acting directly on the speech signal and for separating the coding algorithm from the synthesis technique. TD-PSOLA has significantly improved synthetic speech quality because it makes it very simple to vary the fundamental frequency of the synthesized speech signal.
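To illustrate how TD-PSOLA varies the fundamental frequency by acting directly on the waveform, here is a minimal Python sketch. It is not the PARADIS implementation: the function name `td_psola_pitch_shift`, the assumption that pitch marks are already known, the Hann windows two periods long, and the normalization by the accumulated window weight are all simplifying assumptions made for this example.

```python
import numpy as np

def td_psola_pitch_shift(signal, marks, factor):
    """Scale F0 by `factor` at constant duration (simplified TD-PSOLA sketch).

    `marks` are analysis pitch marks (sample indices, one per period);
    they are assumed to be given, e.g. by a separate pitch-marking step.
    """
    signal = np.asarray(signal, dtype=float)
    marks = np.asarray(marks, dtype=int)
    periods = np.diff(marks)              # local pitch periods, in samples
    out = np.zeros(len(signal))
    wsum = np.zeros(len(signal))          # accumulated window weight
    t = float(marks[0])                   # current synthesis pitch mark
    while t < marks[-1]:
        k = int(np.argmin(np.abs(marks - t)))       # nearest analysis mark
        p = int(periods[min(k, len(periods) - 1)])  # its local period
        window = np.hanning(2 * p)                  # two-period Hann window
        src_lo, src_hi = marks[k] - p, marks[k] + p
        c = int(round(t))
        dst_lo, dst_hi = c - p, c + p
        inside = (src_lo >= 0 and src_hi <= len(signal)
                  and dst_lo >= 0 and dst_hi <= len(out))
        if inside:
            # Overlap-add the windowed segment at the synthesis mark.
            out[dst_lo:dst_hi] += signal[src_lo:src_hi] * window
            wsum[dst_lo:dst_hi] += window
        t += p / factor                   # closer marks give a higher F0
    # Compensate for the varying overlap between windows (assumed scheme).
    return out / np.maximum(wsum, 1e-8)

# Hypothetical test signal: a 100 Hz tone sampled at 8 kHz (80-sample period).
fs = 8000
n = np.arange(fs)
voiced = np.sin(2 * np.pi * 100 * n / fs)
marks = np.arange(80, fs - 80 + 1, 80)
raised = td_psola_pitch_shift(voiced, marks, 1.25)
```

In this example the two-period segments are re-spaced from 80 to 64 samples, so away from the signal edges the output repeats every 64 samples, which corresponds to a fundamental of 8000 / 64 = 125 Hz, while the total duration is unchanged. No spectral model of the signal is needed, which is the simplicity the abstract refers to.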