Re: [bug-gnu-libiconv] Issue when using iconv 2.12 on RHEL 6.7

From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] Issue when using iconv 2.12 on RHEL 6.7
Date: Fri, 07 Apr 2017 19:21:17 +0200
User-agent: KMail/5.1.3 (Linux/4.4.0-71-generic; KDE/5.18.0; x86_64; ; )


Lim, Yongkeong wrote:
> I have a data file which we managed to convert using macbook running on
> iconv (GNU libiconv 1.11), no characters got deleted after conversion.
> But when we upload the same file to the RHEL server running on iconv
> (GNU libiconv 2.12), some characters got deleted by the iconv function.
> Below is the command we used:
> iconv -c -f iso-8859-11 -t utf-8 <source file> > <output file>

The second machine is using iconv from GNU libc (glibc), not GNU libiconv;
"2.12" is a glibc version number, not a libiconv one.
So these are two different implementations of the iconv facility,
although both have very similar conversion tables.
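
To check which implementation a given machine has, the first line of
"iconv --version" is usually enough. The following is only a sketch:
classify_iconv is a made-up helper name, and the patterns it matches are
assumptions based on the version banners commonly printed by the two
implementations (distributions may word the glibc banner differently).

```shell
# Hypothetical helper: guess the iconv implementation from a version banner.
# The banner patterns are assumptions, not a documented interface.
classify_iconv() {
    case "$1" in
        *libiconv*)            echo "libiconv" ;;
        *"GNU libc"*|*GLIBC*)  echo "glibc" ;;
        *)                     echo "unknown" ;;
    esac
}
```

Typical use would be: classify_iconv "$(iconv --version | head -n 1)"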

For Thai, your file could be in encoding TIS-620, ISO-8859-11, or
Mac-Thai. [1] The conversion tables used by GNU libiconv and GNU libc
for ISO-8859-11 are identical [2], and likewise for TIS-620 [3].

I'd suggest that you
  1) Don't use the option "-c" of iconv - this option silently discards
     characters that cannot be converted, so it produces lossy output
     by design.
  2) Instead, try harder to find the right encoding. That is, try
     iconv -f iso-8859-11 -t utf-8 source > output1
     iconv -f tis-620 -t utf-8 source > output2
     iconv -f macthai -t utf-8 source > output3
     and compare the resulting three output files.
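
The three trial conversions in step 2 could be scripted roughly as follows.
This is a sketch only: try_encodings and the output file naming are invented
for illustration, and whether each encoding name is accepted depends on the
iconv implementation installed. Since "-c" is not used, a conversion that
hits invalid input fails visibly instead of dropping characters.

```shell
# Hypothetical helper: convert one file from each candidate Thai encoding
# to UTF-8, without -c, and report which conversions succeed.
try_encodings() {
    src=$1
    for enc in iso-8859-11 tis-620 macthai; do
        out="$src.$enc.utf8"
        # Errors go to a side file so that a failure can be inspected later.
        if iconv -f "$enc" -t utf-8 "$src" > "$out" 2> "$out.err"; then
            echo "$enc: ok ($(wc -c < "$out") bytes)"
        else
            echo "$enc: failed (see $out.err)"
        fi
    done
}
```

After running try_encodings on the source file, the successful outputs can
be compared pairwise with cmp or diff.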

Also, in general, ISO-8859-11 should not be used, since it is *not*
standardized - unlike TIS-620, which is a (national) standard. See [4],[5].


[1] https://haible.de/bruno/charsets/conversion-tables/Thai.html
[2] https://haible.de/bruno/charsets/conversion-tables/ISO-8859-11.html
[3] https://haible.de/bruno/charsets/conversion-tables/TIS-620.html
[4] https://en.wikipedia.org/wiki/ISO/IEC_8859-11
[5] https://en.wikipedia.org/wiki/Thai_Industrial_Standard_620-2533
