bug-gnu-libiconv


From: KO Myung-Hun
Subject: Re: [bug-gnu-libiconv] [PATCH] Support nl_langinfo (CODESET) correctly on OS/2
Date: Tue, 06 Aug 2019 14:11:41 +0900
User-agent: Mozilla/5.0 (OS/2; Warp 4.5; rv:10.0.6esrpre) Gecko/20120715 Firefox/10.0.6esrpre SeaMonkey/2.7.2

Hi/2.

Bruno Haible wrote:
> Hi KO,
> 
>> diff --git a/libcharset/lib/localcharset.c b/libcharset/lib/localcharset.c
>> index da3ac45..40923fc 100644
>> --- a/libcharset/lib/localcharset.c
>> +++ b/libcharset/lib/localcharset.c
>> @@ -378,26 +378,41 @@ static const struct table_entry alias_table[] =
>>         by Alex Taylor:
>>         <http://altsan.org/os2/toolkits/uls/index.html#codepages>.
>>         See also "IBM Globalization - Code page identifiers":
>> -       <https://www-01.ibm.com/software/globalization/cp/cp_cpgid.html>.  */
>> -    { "CP1089", "ISO-8859-6" },
>> -    { "CP1208", "UTF-8" },
>> -    { "CP1381", "GB2312" },
>> -    { "CP1386", "GBK" },
>> -    { "CP3372", "EUC-JP" },
>> -    { "CP813",  "ISO-8859-7" },
>> -    { "CP819",  "ISO-8859-1" },
>> -    { "CP878",  "KOI8-R" },
>> -    { "CP912",  "ISO-8859-2" },
>> -    { "CP913",  "ISO-8859-3" },
>> -    { "CP914",  "ISO-8859-4" },
>> -    { "CP915",  "ISO-8859-5" },
>> -    { "CP916",  "ISO-8859-8" },
>> -    { "CP920",  "ISO-8859-9" },
>> -    { "CP921",  "ISO-8859-13" },
>> -    { "CP923",  "ISO-8859-15" },
>> -    { "CP954",  "EUC-JP" },
>> -    { "CP964",  "EUC-TW" },
>> -    { "CP970",  "EUC-KR" }
>> +       <https://www-01.ibm.com/software/globalization/cp/cp_cpgid.html>.
> 
> This URL is dead. You can remove it, or replace it with another suitable one.
> 

Ok.

>> +       See also "__convcp() of kLIBC":
>> +       <http://trac.netlabs.org/libc/browser/branches/libc-0.6/src/emx/src/lib/locale/__convcp.c>,
>> +       or:
>> +       <https://github.com/bitwiseworks/libc/blob/master/src/emx/src/lib/locale/__convcp.c>.
>>   */
> 
> The first of these two URLs is broken. You can therefore remove it.
> 

Ok.

>> +    { "CP1089",         "ISO-8859-6" },
> 
> OK
> 
>> +    { "CP1200",         "UCS-2" },
> 
> UCS-2 cannot be used as a locale encoding, since it is not ASCII compatible.
> This cannot work.
> 

setlocale() on kLIBC accepts "IBM-1200", "UCS-2", and so on.
Should UCS-2 be removed nevertheless?
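
For reference, a minimal sketch of why a UCS-2 locale encoding is a problem
for byte-oriented C code, whatever setlocale() accepts. The helper names
ascii_to_ucs2be() and survives_strlen() are hypothetical and exist only for
this illustration; they are not part of libiconv or kLIBC.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: encode an ASCII string as UCS-2 big-endian.
   OUT must have room for 2 * strlen (ascii) + 2 bytes.
   Returns the number of bytes written, not counting the terminator.  */
size_t
ascii_to_ucs2be (const char *ascii, unsigned char *out)
{
  size_t n = 0;

  for (; *ascii != '\0'; ascii++)
    {
      out[n++] = 0x00;                    /* high byte is zero for ASCII */
      out[n++] = (unsigned char) *ascii;  /* low byte is the ASCII code */
    }
  out[n] = 0x00;                          /* two-byte terminator */
  out[n + 1] = 0x00;
  return n;
}

/* Hypothetical helper: returns 1 if strlen() sees all LEN bytes of BYTES,
   0 if it stops early at an embedded zero byte.  */
int
survives_strlen (const unsigned char *bytes, size_t len)
{
  return strlen ((const char *) bytes) == len;
}
```

Every ASCII character gets a zero byte in UCS-2, so functions such as
strlen() and strcmp() stop at the first character. That is the sense in
which the encoding is not ASCII compatible.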

>> +    { "CP1208",         "UTF-8" },
>> +    { "CP1381",         "GB2312" },
>> +    { "CP1383",         "EUC-CN" },
>> +    { "CP1386",         "GBK" },
>> +    { "CP3372",         "EUC-JP" },
>> +    { "CP813",          "ISO-8859-7" },
>> +    { "CP819",          "ISO-8859-1" },
>> +    { "CP878",          "KOI8-R" },
>> +    { "CP912",          "ISO-8859-2" },
>> +    { "CP913",          "ISO-8859-3" },
>> +    { "CP914",          "ISO-8859-4" },
>> +    { "CP915",          "ISO-8859-5" },
>> +    { "CP916",          "ISO-8859-8" },
>> +    { "CP920",          "ISO-8859-9" },
>> +    { "CP921",          "ISO-8859-13" },
>> +    { "CP923",          "ISO-8859-15" },
>> +    { "CP954",          "EUC-JP" },
>> +    { "CP964",          "EUC-TW" },
>> +    { "CP970",          "EUC-KR" },
>> +    { "ISO8859-1",      "ISO-8859-1" },
>> +    { "ISO8859-2",      "ISO-8859-2" },
>> +    { "ISO8859-3",      "ISO-8859-3" },
>> +    { "ISO8859-4",      "ISO-8859-4" },
>> +    { "ISO8859-5",      "ISO-8859-5" },
>> +    { "ISO8859-6",      "ISO-8859-6" },
>> +    { "ISO8859-7",      "ISO-8859-7" },
>> +    { "ISO8859-8",      "ISO-8859-8" },
>> +    { "ISO8859-9",      "ISO-8859-9" }
> 
> OK
> 
>> @@ -751,6 +766,24 @@ locale_charset (void)
>>      }
>>  #  endif
>>  
>> +#  ifdef OS2
>> +  /* On OS/2, nl_langinfo (CODESET) returns IBM-XXX style normally. Convert it
>> +     to CPXXX style for mapping later except UCS-2LE and UCS-2BE.  */
>> +  if (strcmp (codeset, "IBM-1200@endian=little") == 0)
>> +    return "UCS-2LE";
>> +  else if (strcmp (codeset, "IBM-1200@endian=big") == 0)
>> +    return "UCS-2BE";
>> +
>> +  if (strncmp (codeset, "IBM-", 4) == 0 && isdigit (codeset[4]))
>> +    {
>> +      static char buf[2 + 10 + 1];
>> +
>> +      snprintf (buf, sizeof (buf), "CP%s", codeset + 4);
>> +
>> +      codeset = buf;
>> +    }
>> +#  endif
>> +
> 
> I would prefer if you could get rid of this extra code, and instead just add
> entries to the table above.
> 

Ok, I'll try.
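
A rough sketch of the table-only approach, assuming the IBM-XXX names
returned by nl_langinfo (CODESET) are listed directly as aliases. The names
sketch_table and lookup_codeset() are hypothetical, the struct is simplified,
and only a few entries are shown. The canonical names are the ones the patch
already uses for the corresponding CPXXX entries. The sketch uses a binary
search, so the entries must be sorted in strcmp() order.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the alias table; not the real definition.  */
struct sketch_entry
{
  const char *alias;
  const char *canonical;
};

/* Must stay sorted by ALIAS in strcmp() order for the binary search.  */
static const struct sketch_entry sketch_table[] =
  {
    { "IBM-1089", "ISO-8859-6" },
    { "IBM-1208", "UTF-8" },
    { "IBM-813",  "ISO-8859-7" },
    { "IBM-819",  "ISO-8859-1" },
    { "IBM-970",  "EUC-KR" }
  };

/* Return the canonical name for CODESET, or CODESET itself if it has
   no entry in the table.  */
const char *
lookup_codeset (const char *codeset)
{
  size_t lo = 0;
  size_t hi = sizeof (sketch_table) / sizeof (sketch_table[0]);

  while (lo < hi)
    {
      size_t mid = lo + (hi - lo) / 2;
      int cmp = strcmp (sketch_table[mid].alias, codeset);

      if (cmp < 0)
        lo = mid + 1;
      else if (cmp > 0)
        hi = mid;
      else
        return sketch_table[mid].canonical;
    }
  return codeset;
}
```

With entries like these, no IBM-XXX to CPXXX rewriting step and no static
buffer are needed in locale_charset().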

>> +    { "CP1381",         "GB2312" },
>> +    { "CP1383",         "EUC-CN" },
> EUC-CN is not among the canonical encoding names, defined in localcharset.h.
> Please use GB2312 instead.
> 

Ok.

I'll try soon. Thanks!

-- 
KO Myung-Hun

Using Mozilla SeaMonkey 2.7.2
Under OS/2 Warp 4 for Korean with FixPak #15
In VirtualBox v6.0.8 on Intel Core i7-3615QM 2.30GHz with 8GB RAM

Korean OS/2 User Community : http://www.os2.kr/



