bug-gnu-utils

Re: Small problems with gettext 0.11 on Solaris


From: Bruno Haible
Subject: Re: Small problems with gettext 0.11 on Solaris
Date: Tue, 19 Feb 2002 15:30:28 +0100 (CET)

Drazen Kacar writes:

> The first class of problematic checks looks like this:
> 
> msgcat: mcat-test2.in2: warning: Charset "UTF-8" is not supported. msgcat relies
> on iconv(),
> 
> The first thing to note is that the warning message might stand some
> improvement.

The actual error message was longer than that, but the msgcat-* tests
have partially filtered it away...

> However, the meaningful message should include both charset names
> which were passed to iconv_open() and not just one.
> 
> Upon investigation in the debugger, I've found that the attempted
> conversion was from UTF-8 to UTF-8.

Why not?

> Solaris iconv_open() will indeed return an error if one attempts
> this.

I'd send a bug report to the Sun people.

> but I'm wondering if you would be interested in checking for this case in
> the gettext code (function po_lex_charset_set in po-charset.c printed the
> warning). There isn't much point in the overhead caused by unnecessarily
> calling iconv() in this case.

The point is verifying that the input is indeed well-formed UTF-8.
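
As an illustration of why a UTF-8 to UTF-8 conversion is still useful, here is a minimal sketch of using iconv() purely as a validator. The helper name is_well_formed_utf8 is hypothetical and is not taken from the gettext sources; the sketch assumes an iconv implementation that accepts "UTF-8" for both arguments of iconv_open(), which is exactly what the Solaris version discussed here does not.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical helper, not part of gettext.
   Returns 1 if buf holds well-formed UTF-8, 0 if it does not, and -1 if
   iconv_open() refuses a UTF-8 to UTF-8 conversion altogether.  */
static int
is_well_formed_utf8 (const char *buf, size_t len)
{
  iconv_t cd = iconv_open ("UTF-8", "UTF-8");
  if (cd == (iconv_t) -1)
    return -1;

  char out[256];
  char *inptr = (char *) buf;   /* iconv() takes a non-const pointer */
  size_t inleft = len;
  int ok = 1;

  while (inleft > 0)
    {
      /* The converted text is discarded; only the status matters.  */
      char *outptr = out;
      size_t outleft = sizeof out;
      size_t res = iconv (cd, &inptr, &inleft, &outptr, &outleft);

      if (res == (size_t) -1 && errno != E2BIG)
        {
          /* EILSEQ: invalid byte sequence.
             EINVAL: incomplete sequence at the end of the input.  */
          ok = 0;
          break;
        }
      /* E2BIG only means the scratch buffer filled up; loop again.  */
    }

  iconv_close (cd);
  return ok;
}
```

A caller would treat a return value of -1 as "cannot check" rather than as "invalid input", which keeps the two failure modes apart.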

> The second problem is that msgconv-1 invoked abort() in the msgconv utility,
> but that looks like a Solaris bug.

Yes, I saw this as well. IIRC, Solaris iconv() returns success, and
increments the input pointers, but not the output pointers.
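
The symptom described above (success reported, input consumed, nothing written) can be detected by comparing the pointers after the call. The following is only a sketch: checked_convert and its return codes are hypothetical, and it is not the check that msgconv actually performs. It also assumes a stateless conversion; with stateful encodings, consuming input without producing output can be legitimate.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical wrapper around a single iconv() call.
   Returns  0 on success, storing the number of output bytes in *outlen,
           -1 if iconv() itself reports an error (see errno),
           -2 if iconv() reports success and consumed input but wrote
              no output, which is the inconsistent behaviour described
              above.  */
static int
checked_convert (iconv_t cd, const char *in, size_t inlen,
                 char *out, size_t outsize, size_t *outlen)
{
  char *inptr = (char *) in;    /* iconv() takes a non-const pointer */
  size_t inleft = inlen;
  char *outptr = out;
  size_t outleft = outsize;

  size_t res = iconv (cd, &inptr, &inleft, &outptr, &outleft);
  if (res == (size_t) -1)
    return -1;

  *outlen = outsize - outleft;

  /* Input was consumed, yet the output position did not move.  */
  if (inleft < inlen && *outlen == 0)
    return -2;

  return 0;
}
```

With a check of this kind, a caller could print a diagnostic naming both charsets instead of reaching abort().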

> The next problem is that msgcomm-4, msgcomm-5, msgcomm-6 and msgcomm-7
> fail because iconv() doesn't handle conversion from ASCII to ISO-8859-1.
> It does not, because ASCII is not a valid input character set name. This
> looks serious enough, so I'll do something about it. But could you tell me
> how msgcomm came to the conclusion that it needs to perform the conversion
> from ASCII to ISO-8859-1 (or anything else)? The catalogs in the test
> files specify iso-8859-1 for charset, so I'm not quite sure where ASCII
> comes into the picture.

The mcomm-test4.in2 file has no header entry with charset
specification and is therefore assumed to be in ASCII.
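
The fallback rule can be sketched as follows. Both function names are hypothetical and do not come from the gettext sources; the sketch only illustrates the idea that a catalog without a declared charset is treated as ASCII, and that such a catalog can be checked for bytes outside the ASCII range.

```c
#include <stddef.h>

/* Hypothetical: pick the charset to use for a catalog.  A missing or
   empty declaration falls back to ASCII, as described above.  */
static const char *
effective_charset (const char *declared)
{
  return (declared != NULL && declared[0] != '\0') ? declared : "ASCII";
}

/* Hypothetical: returns 1 if every byte of buf is in the ASCII range
   (0x00..0x7F), 0 otherwise.  */
static int
is_pure_ascii (const char *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    if ((unsigned char) buf[i] > 0x7F)
      return 0;
  return 1;
}
```

Under this rule, combining a catalog with no header entry and one declared as iso-8859-1 leads to the ASCII to ISO-8859-1 conversion seen in the msgcomm tests.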

Bruno


