speechd-discuss

Everything about unified interface


From: Michael Pozhidaev
Subject: Everything about unified interface
Date: Fri, 02 Jul 2010 08:54:24 +0700

Good morning, Tomas!

I am writing here now, so let's continue:

> Gnome Speech (deprecated), there is SD, OpenTTS and VoiceMan.  I actually
> still hope SD and OpenTTS can join again and it would be great to find a
> common way forward with VoiceMan developers too.  Standardization may be a
> good way.

Yes, you're completely right. I would like to share some of my
conclusions about standards. Please feel free to correct me if
something is wrong here. As I see it, there are two approaches to a
unified interface (both for D-Bus and for programming language
APIs). They must be distinguished.

1. An interface designed to give the client maximum information about
the available engines, such as languages, dialects, capabilities,
features, etc. This interface allows the client to select the exact
parameters it wants for speech synthesis. This approach is suitable for
applications which want to control everything themselves and want to
know everything;

2. Another approach: an interface for applications which "just want to
speak". These applications can provide speech information, pitch, rate,
mark emphasized blocks, etc., but know nothing about which provider is
used or what capabilities it has. For example, I may want to write my
own application with a speech interface. Having to do this if the
provider supports feature Xfoo (SSML, for example) and do that if it
doesn't is not convenient for me. Text processing behavior must be
defined inside the provider and stored with it. It is not a client
concern.
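
To make the distinction concrete, here is a minimal Python sketch of the
two approaches. All names in it (EngineInfo, DetailedInterface,
SimpleInterface, DemoProvider, "engine-a", "engine-b") are hypothetical
and invented for illustration; they are not taken from TTS API, Speech
Dispatcher, OpenTTS or VoiceMan.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class EngineInfo:
    """Hypothetical description of one engine, as approach 1 exposes it."""
    name: str
    languages: list[str]
    features: set[str] = field(default_factory=set)  # e.g. {"ssml"}


class DetailedInterface(ABC):
    """Approach 1: the client inspects engines and picks exact parameters."""

    @abstractmethod
    def list_engines(self) -> list[EngineInfo]: ...

    @abstractmethod
    def speak_with(self, engine: str, language: str, text: str) -> None: ...


class SimpleInterface(ABC):
    """Approach 2: the client "just wants to speak"."""

    @abstractmethod
    def say(self, text: str, pitch: float = 1.0, rate: float = 1.0) -> None: ...


class DemoProvider(DetailedInterface, SimpleInterface):
    """Toy provider that records requests instead of producing audio."""

    def __init__(self) -> None:
        self.engines = [
            EngineInfo("engine-a", ["en"], {"ssml"}),
            EngineInfo("engine-b", ["ru"]),
        ]
        self.log: list[tuple[str, str, str]] = []

    def list_engines(self) -> list[EngineInfo]:
        return list(self.engines)

    def speak_with(self, engine: str, language: str, text: str) -> None:
        self.log.append((engine, language, text))

    def say(self, text: str, pitch: float = 1.0, rate: float = 1.0) -> None:
        # The engine choice is provider policy; the client never sees it.
        # This toy policy simply takes the first configured engine.
        engine = self.engines[0]
        self.speak_with(engine.name, engine.languages[0], text)
```

A client of the second kind only ever calls say(); a client of the first
kind calls list_engines(), checks features such as "ssml" itself, and
then calls speak_with() with the exact engine and language it selected.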

So, it seems to me we must explicitly decide which approach we choose
for the unified standard. Both of them are needed, but they are used for
different tasks. VoiceMan basically follows the second one. As far as I
can tell, TTS API was designed mostly for the first.

One more important thing is national languages. Russian users always
use two TTS engines: one for English text and one for Russian. There is
no problem selecting the right one, because the character sets are
completely independent. The situation is more complex in the Ukrainian
community. Ukrainian users use three languages: English, Ukrainian and
Russian. Almost all of them speak Russian fluently and regularly read
Russian sites and so on. But the Russian and Ukrainian character sets
share a lot of letters, and each has a few of its own. There are no
explicit rules for determining the language, and there is no good
heuristic. So Ukrainian users rely on automatic switching for English
text and use manual switching between Russian and Ukrainian. This causes
no problems in VoiceMan. The user can list the required TTS engines in
the configuration file and use hot configuration reloading to select
which language must be assumed for Cyrillic letters.
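
The switching scheme described above can be sketched in Python as
follows. This is only an illustration under stated assumptions, not
VoiceMan's actual implementation: the function names and the
cyrillic_language parameter are hypothetical, Latin text is simply
assumed to be English, and the script of a letter is taken from its
Unicode character name.

```python
import unicodedata


def script_of(ch):
    """Return 'latin', 'cyrillic', or None for characters of neither script."""
    if not ch.isalpha():
        return None
    name = unicodedata.name(ch, "")
    if name.startswith("LATIN"):
        return "latin"
    if name.startswith("CYRILLIC"):
        return "cyrillic"
    return None


def split_by_language(text, cyrillic_language="ru"):
    """Split text into (language, chunk) pairs by script.

    Latin letters are switched automatically (assumed English here).
    Cyrillic letters go to whichever language the user has currently
    selected, which plays the role of the manual switch.
    """
    mapping = {"latin": "en", "cyrillic": cyrillic_language}
    chunks = []  # list of [language, text] pairs
    for ch in text:
        lang = mapping.get(script_of(ch))
        same_chunk = chunks and (
            lang is None or chunks[-1][0] is None or lang == chunks[-1][0]
        )
        if same_chunk:
            # Neutral characters (spaces, digits, punctuation) stay with the
            # current chunk; a chunk that started neutral adopts the language
            # of its first letter.
            if chunks[-1][0] is None:
                chunks[-1][0] = lang
            chunks[-1][1] += ch
        else:
            chunks.append([lang, ch])
    return [(lang, part) for lang, part in chunks]
```

Calling split_by_language(text, cyrillic_language="uk") after the user
switches to Ukrainian changes only where the Cyrillic chunks are sent;
the automatic handling of English text stays the same.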

This long story was meant to illustrate the requirement to have not a
single active language but a set of active languages. As I read it, in
TTS API multiple languages can be handled with explicit marks in SSML,
but in the general case we do not have such marks, so automatic language
switching must be the provider's concern where it is possible. Sorry if
I missed something; the document is large and covers a lot of things.

One more question: Java applications. The TTS API document considers
SSML, but Java applications may want to generate speech commands in
JSML. A clear strategy for converting JSML commands to the interface,
whatever it turns out to be, is strongly required now. I would be very
interested in participating in the development of the TTS API document
and making it better.

What do you think about these things? It seems to me we must consider
a unified D-Bus interface (respecting the TTS API document), but first
we must decide what tasks it has to cover.

>> unified interfaces as it was done for at-spi-2?
>
> I never said I don't!  Actually, we pushed for it for many years.  We were,
> however, long ignored by Gnome Speech developers.  Now, when the situation
> starts to change, I hope this can be done.

Sorry, let's decide what task this interface must solve and then propose
something.

> We will see.  The problem in FOSS is that you can't force people to
> cooperate.  But that is also one of its beauties, so lets live with that
> and try to be constructive.

OK!

>> We can go to spd mailing list and continue there.
>
> Good idea.
>
> Best regards, Tomas

-- 
Michael Pozhidaev. Tomsk, Russia. E-mail: msp at altlinux.ru
Russian info page: http://www.marigostra.ru/



