speechd-discuss

ibmtts output module and utf8


From: Lukas Loehrer
Subject: ibmtts output module and utf8
Date: Tue, 31 Jul 2007 22:30:40 +0200

Olivier BERT writes ("Re: ibmtts output module and utf8"):
> However, I think that we haven't resolved the UTF-8 issue for all cases. 

The current version in CVS does not perform the right conversion when
SSML is stripped. I am working on a patch that I hope will do the right
thing in both cases.

> The problem is that we haven't any documentation for the charset that must
> be use as IBM Viavoice input. 

I believe we know in principle which charsets are expected in which
cases and for which languages. The tricky part is that the SSML filter,
if available, gets activated automatically and permanently as soon as
the input looks similar enough to SSML, and with this comes a change in
the expected encoding. It would be useful to be able to query at run
time whether the SSML filter is active, or at least what the expected
charset is.

As it stands right now, I propose the following steps for module_speak
in the ibmtts module:

1. Check whether the input is valid UTF-8; fail if not.

2. If SSML stripping is enabled, strip the SSML and then convert the
result from UTF-8 to the encoding expected for the currently selected
language (cp1252 or UTF-16). Otherwise, perform no conversion.

3. Pass the result to ibmtts.
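
To make the steps above concrete, here is a rough sketch in Python rather
than the module's own C. Everything named in it is hypothetical and is not
taken from the ibmtts module: the function names, the regex-based tag
stripper, and above all the language-to-encoding table, whose entries
(including the "zh" key and the UTF-16 byte order) are placeholders that
would have to be filled in from what is actually known about the engine.

```python
import re

# ASSUMPTION: illustrative table only. Which language needs which
# encoding, and which UTF-16 byte order, is not specified in this thread.
EXPECTED_ENCODING = {
    "en": "cp1252",
    "zh": "utf-16",  # Python's "utf-16" codec prepends a BOM when encoding
}

_TAG_RE = re.compile(r"<[^>]*>")


def strip_ssml(text):
    """Crude stand-in for SSML stripping: drops anything that looks like a tag.

    Hypothetical helper; it does not resolve entities such as &amp;.
    """
    return _TAG_RE.sub("", text)


def prepare_for_engine(data, language, strip_enabled):
    """Return the bytes that would be handed to the engine (step 3)."""
    # Step 1: reject input that is not valid UTF-8.
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("input is not valid UTF-8") from exc

    # Step 2: convert only when SSML is stripped on our side.
    if not strip_enabled:
        return data  # left as UTF-8, no conversion

    stripped = strip_ssml(text)
    # Characters the target charset cannot represent become "?".
    return stripped.encode(EXPECTED_ENCODING[language], errors="replace")
```

For example, with stripping enabled and the hypothetical "en" entry,
`prepare_for_engine("<speak>café</speak>".encode("utf-8"), "en", True)`
yields the cp1252 bytes `b"caf\xe9"`, while with stripping disabled the
UTF-8 input is returned unchanged.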

Best regards, Lukas


