speechd-discuss

Idea: language plugins


From: Jonathan Duddington
Subject: Idea: language plugins
Date: Mon, 03 Mar 2008 11:36:47 +0000 (GMT)

On 03 Mar, Tomas Cerha <cerha at brailcom.org> wrote:
> I understand it is a tempting idea to manage those conversions
> centrally and be able to do a good job even with a very dumb
> synthesizer, but I also believe Speech Dispatcher is not a good place
> for this.  One example of a technical problem of such a design is the
> support for callbacks.  If the text is changed on its way from
> application to the synthesizer, the indexes reported by the
> synthesizer are no longer valid within the original text.

I like the idea of language-specific plugins in a centralised place 
(i.e. in Speech Dispatcher) which can be used with different
synthesizers.

eSpeak does some basic language-specific interpretation of numbers,
and provides several options which can be set for each language.  For
example, whether "123" is spoken as:
  One hundred and twenty three.
  Hundred twenty three.
  Hundred three and twenty.
  etc.
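
As a rough illustration of such per-language options, here is a minimal
sketch in Python.  The option names (say_one_before_hundred,
and_after_hundred, units_before_tens) are invented for this example and
are not eSpeak's actual settings; English words are used throughout just
to show the word order, and only 1 to 999 is handled.

```python
from dataclasses import dataclass

UNITS = ["", "one", "two", "three", "four", "five",
         "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


@dataclass
class NumberOptions:
    """Hypothetical per-language switches (not eSpeak's real option names)."""
    say_one_before_hundred: bool = True   # "one hundred" vs. "hundred"
    and_after_hundred: bool = True        # "hundred and ..." vs. "hundred ..."
    units_before_tens: bool = False       # "three and twenty" vs. "twenty three"


def speak_number(n: int, opts: NumberOptions) -> str:
    """Return the words for 1..999 according to the given options."""
    if not 0 < n < 1000:
        raise ValueError("this sketch only handles 1..999")
    hundreds, rest = divmod(n, 100)
    words = []
    if hundreds:
        if hundreds > 1 or opts.say_one_before_hundred:
            words.append(UNITS[hundreds])
        words.append("hundred")
    if rest:
        if hundreds and opts.and_after_hundred:
            words.append("and")
        tens, units = divmod(rest, 10)
        if rest < 10:
            words.append(UNITS[rest])
        elif rest < 20:
            words.append(TEENS[rest - 10])
        elif units == 0:
            words.append(TENS[tens])
        elif opts.units_before_tens:
            words += [UNITS[units], "and", TENS[tens]]
        else:
            words += [TENS[tens], UNITS[units]]
    return " ".join(words)


print(speak_number(123, NumberOptions()))
print(speak_number(123, NumberOptions(False, False, False)))
print(speak_number(123, NumberOptions(False, False, True)))
```

The three calls correspond to the three readings of "123" listed above.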

But applying different grammatical inflections to numbers, depending on
context, is beyond the ability of a general-purpose multi-language
synthesizer such as eSpeak.  This may also apply to questions about how
to interpret text such as "2/3/04" or numbers which might be telephone
numbers.

It would be useful if someone who has the knowledge and interest to do
this could write a language-specific plugin.  Such a plugin would
operate at the text level (e.g. converting numbers to text) and would
be independent of which synthesizer is used.

There is certainly a problem with indices into the text, but it can be
solved.  eSpeak has a similar problem internally when it processes SSML
tags.  That processing produces text for speaking which is different
from the original text which it received.  The problem is solved by a
table which translates the indices in the output text to the equivalent
indices in the input text.
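
The idea of such a table can be sketched as follows.  This is only an
illustration in Python and not eSpeak's actual implementation: the
function names and the digit-by-digit expansion are assumptions made for
the example.  Each character of the rewritten text records the index in
the original text that it came from, so an index reported by the
synthesizer can be translated back.

```python
import re

DIGIT_WORDS = {"0": "zero", "1": "one", "2": "two", "3": "three",
               "4": "four", "5": "five", "6": "six", "7": "seven",
               "8": "eight", "9": "nine"}


def spell_digits(digits: str) -> str:
    """Hypothetical stand-in for a language plugin: '42' -> 'four two'."""
    return " ".join(DIGIT_WORDS[d] for d in digits)


def expand_with_index_map(text: str):
    """Rewrite digit runs as words and build an output-to-input index table.

    index_map[i] is the index in `text` that output character i
    corresponds to.  Every character of a replacement maps to the start
    of the digit run it replaced.
    """
    out = []
    index_map = []
    pos = 0
    for m in re.finditer(r"[0-9]+", text):
        # Unchanged text before the number maps one-to-one.
        for i in range(pos, m.start()):
            out.append(text[i])
            index_map.append(i)
        replacement = spell_digits(m.group())
        out.append(replacement)
        index_map.extend([m.start()] * len(replacement))
        pos = m.end()
    # Unchanged text after the last number.
    for i in range(pos, len(text)):
        out.append(text[i])
        index_map.append(i)
    return "".join(out), index_map


original = "Call 42 now"
spoken, index_map = expand_with_index_map(original)
# Suppose the synthesizer reports an index at the word "now" in `spoken`.
reported = spoken.index("now")
print(spoken)
print(original[index_map[reported]:])
```

The same table could be kept by a plugin in Speech Dispatcher, so that
indices reported by the synthesizer are translated before being passed
back to the application.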

An alternative would be to provide a plugin interface to eSpeak for
these functions, but Speech Dispatcher seems a more suitable place.
