
Re: gettext 0.10.37 msgfmt charset naming compatibility on Solaris 8


From: Paul Eggert
Subject: Re: gettext 0.10.37 msgfmt charset naming compatibility on Solaris 8
Date: Wed, 16 May 2001 11:38:12 -0700 (PDT)

> From: Bruno Haible <address@hidden>
> Date: Wed, 16 May 2001 18:00:59 +0200 (CEST)
> 
>   1) Why doesn't Solaris iconv() accept "ISO-8859-1" as a
>      conversion name? It's a name registered by ISO and IANA for many
>      years now.

Beats me.  I'll file an enhancement request, and will cite the
conversion names listed in charset.alias.

But even if they patch it, many current systems will remain in the
field for some years.  (Sun folks tend to move a bit slower than
GNU/Linux folks.  :-) So it's worth catering to the old names, if it's
not too much trouble.

>      Vendors whose iconv does not accept standard names are likely
>      to also not put enough manpower in the converters themselves.

The Solaris converters are quite good, at least in the areas where we
use them (mostly in Japanese environments).  These cases often involve
proprietary encodings and/or private extensions that GNU libiconv is
unlikely to know about, much less be precisely compatible with.  It's
possible that some of these encodings will never be supported by GNU
libiconv, as they're rather old-fashioned and specialized.

Also, opinions may differ on what a "correct" mapping is, and it's
better to have the option of using the vendor's native mapping, as
that will help promote interoperability in some cases.

> So if you want to use Solaris iconv with GNU gettext and without
> warnings, a patch that I would accept would consist of the following:
> 
> 1) An autoconf test which checks for each of the charsets listed in
>    po.c
>      a. under which name this charset is available,
>      b. whether it converts according to the standard tables used
>         by glibc and libiconv.
>    Such a check shouldn't make the "configure" file three megabytes
>    large, of course. I have more ideas on this step.

But this would require that the build machine have all the character
set support that the runtime machine has.  It's quite common to build
on English-only hosts, and then run on Japanese hosts.

How about instead modifying config.charset so that it will output the
tables in question?  It's already outputting the forward mapping in a
hard-coded way; it shouldn't be hard to modify it to output the reverse
mapping, for hosts where this works.  You already have to maintain
config.charset anyway, so it wouldn't be much more work to maintain
it after the change.
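
To illustrate the reverse-mapping idea, here is a minimal sketch in C.
It assumes the table is available as pairs of "vendor name, canonical
name"; the four pairs shown, the struct, and both function names are
illustrative assumptions, not taken from config.charset or from
gettext.  The point is only that one table can serve both directions.

```c
#include <stddef.h>
#include <string.h>

/* One entry per table line: vendor name first, GNU canonical name second.
   These four pairs are illustrative, not a complete or verified list. */
struct charset_pair {
    const char *vendor;
    const char *canonical;
};

static const struct charset_pair charset_pairs[] = {
    { "646",       "ASCII"      },
    { "ISO8859-1", "ISO-8859-1" },
    { "eucJP",     "EUC-JP"     },
    { "PCK",       "SHIFT_JIS"  },
};

#define N_PAIRS (sizeof charset_pairs / sizeof charset_pairs[0])

/* Forward mapping: vendor name -> canonical name, or NULL if unknown. */
const char *canonical_from_vendor(const char *vendor)
{
    size_t i;
    for (i = 0; i < N_PAIRS; i++)
        if (strcmp(charset_pairs[i].vendor, vendor) == 0)
            return charset_pairs[i].canonical;
    return NULL;
}

/* Reverse mapping: canonical name -> vendor name, or NULL if unknown.
   It walks the same table, so no second list has to be maintained. */
const char *vendor_from_canonical(const char *canonical)
{
    size_t i;
    for (i = 0; i < N_PAIRS; i++)
        if (strcmp(charset_pairs[i].canonical, canonical) == 0)
            return charset_pairs[i].vendor;
    return NULL;
}
```

Because the table is fixed at build time rather than probed, it does
not depend on which character sets the build host has installed.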

> 2) A wrapper function, 'iconv_open_wrapper' similar to your iconvariant
>    function, which uses the autoconf test's results. In particular
>    it should convert GNU canonical charset names to vendor names,
>    and reject encodings for which the autoconf test has determined
>    that the vendor's iconv is broken.
>
> Note that this iconv wrapper would have to be used by intl/ as well,
> not only by the lib/ and src/ part of GNU gettext.

Yes, that sounds better than the hack that I proposed.  I can
volunteer to write this, if it would help.
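
For concreteness, a rough sketch of what such a wrapper might look
like follows.  It is not code from gettext: the table layout, its
entries (including the vendor spelling "8859-1" and the made-up
"EXAMPLE-BROKEN" charset), and the helper vendor_iconv_name are all
assumptions for illustration.  Only iconv_open itself and the name
iconv_open_wrapper come from the discussion above.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-platform table, however it ends up being generated.
   vendor == NULL means the vendor accepts the canonical name as-is;
   broken != 0 means the vendor's converter must not be used. */
struct iconv_name_entry {
    const char *canonical;
    const char *vendor;
    int broken;
};

static const struct iconv_name_entry iconv_name_table[] = {
    { "ISO-8859-1",     "8859-1", 0 },  /* assumed vendor spelling */
    { "EUC-JP",         "eucJP",  0 },  /* assumed vendor spelling */
    { "EXAMPLE-BROKEN", NULL,     1 },  /* made-up entry for the reject path */
};

/* Translate a GNU canonical name to the name to hand to iconv_open.
   Returns NULL for rejected encodings; names missing from the table
   pass through unchanged. */
const char *vendor_iconv_name(const char *canonical)
{
    size_t i;
    size_t n = sizeof iconv_name_table / sizeof iconv_name_table[0];

    for (i = 0; i < n; i++) {
        const struct iconv_name_entry *e = &iconv_name_table[i];
        if (strcmp(e->canonical, canonical) == 0) {
            if (e->broken)
                return NULL;
            return e->vendor != NULL ? e->vendor : e->canonical;
        }
    }
    return canonical;
}

/* Same calling convention as iconv_open, but takes canonical names. */
iconv_t iconv_open_wrapper(const char *tocode, const char *fromcode)
{
    const char *to = vendor_iconv_name(tocode);
    const char *from = vendor_iconv_name(fromcode);

    if (to == NULL || from == NULL) {
        errno = EINVAL;         /* the error iconv_open uses for unsupported conversions */
        return (iconv_t) -1;
    }
    return iconv_open(to, from);
}
```

Callers in intl/, lib/ and src/ would then call iconv_open_wrapper
wherever they call iconv_open today, and handle a (iconv_t) -1 result
the same way as before.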


