bug-gnu-libiconv

Re: [bug-gnu-libiconv] iconv incorrectly converts escape characters 0x1b


From: Bruno Haible
Subject: Re: [bug-gnu-libiconv] iconv incorrectly converts escape characters 0x1b from UTF-8 to ISO-2022-JP
Date: Tue, 24 Mar 2015 04:22:58 +0100
User-agent: KMail/4.8.5 (Linux/3.2.0-64-generic; KDE/4.8.5; x86_64; ; )

Hello,

> ISO-2022-JP is one of the popular character encoding schemes for email
> texts in Japan.

I don't think it has been popular for the last 20 or 30 years, as it
cannot encode half-width Katakana characters (it can only encode Katakana as
full-width characters, which is extremely unusual).

Try ISO-2022-JP-2 or ISO-2022-JP-3 instead. That's why these encodings
were created.

See https://en.wikipedia.org/wiki/ISO/IEC_2022#ISO.2FIEC_2022_character_sets

> I report incorrect conversion by iconv w.r.t. ISO-2022-JP.

> The byte value 0x1b in UTF-8 text is converted to the same byte value
> in ISO-2022-JP by iconv.

Since the byte value 0x1b is used as the escape character in the ISO-2022-*
family of encodings, and these encodings provide no way to encode an ESC
character as such, "byte value 0x1b in UTF-8 text" is invalid input for
such a conversion. In other words, use ASCII without ESC characters,
or UTF-8 without ESC characters, as input.
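
For illustration only, here is a minimal sketch of checking the input for ESC
before converting. It uses Python's built-in "iso2022_jp" codec rather than
iconv, and the function name to_iso2022jp is hypothetical, chosen just for
this example.

```python
ESC = "\x1b"  # U+001B, the escape character of the ISO-2022-* encodings


def to_iso2022jp(text: str) -> bytes:
    """Encode text as ISO-2022-JP, rejecting input that contains ESC.

    Hypothetical helper: a literal ESC cannot be represented as such in
    ISO-2022-JP, so it is treated here as invalid input.
    """
    pos = text.find(ESC)
    if pos != -1:
        raise ValueError(f"ESC (0x1b) at index {pos} is invalid input for ISO-2022-JP")
    return text.encode("iso2022_jp")
```

Whether to reject such input, as here, or to strip the ESC characters first is
a choice for the caller; the sketch only shows the check.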

Bruno



