
[bug-gnu-libiconv] Re: 3 char from UTF-8 to MacRoman iconv

From: jake
Subject: [bug-gnu-libiconv] Re: 3 char from UTF-8 to MacRoman iconv
Date: Wed, 2 Jul 2008 01:31:19 -0400 (EDT)

address@hidden wrote:
While playing with libiconv 1.12 I noticed that it is unable
to convert the 3 particular characters from UTF8 to MACROMAN
(going from MACROMAN to UTF8 works fine). They are

UTF-8       MacRoman (dec)  Unicode (dec)  MacRoman (hex)  Unicode  Character
CE A9       189             937            0xBD            U+03A9   Ω
E2 82 AC    219             8364           0xDB            U+20AC   €
EF A3 BF    240             63743          0xF0            U+F8FF   Apple logo

Is this a known issue?
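
The rows of the table above can be cross-checked with a short sketch.
It uses Python's built-in "mac_roman" codec, which is a different
converter from libiconv and is used here only as an assumption-laden
stand-in: it happens to follow the Omega / Euro / Apple logo
assignments listed in the table. ROWS and check_row are illustrative
names, not part of any library.

```python
# Each row: (UTF-8 bytes in hex, MacRoman byte, Unicode code point),
# copied from the table in the original question.
ROWS = [
    ("CE A9", 0xBD, 0x03A9),     # Greek capital letter omega
    ("E2 82 AC", 0xDB, 0x20AC),  # Euro sign
    ("EF A3 BF", 0xF0, 0xF8FF),  # private-use code point (Apple logo on Apple systems)
]


def check_row(utf8_hex, mac_byte, code_point):
    """Return True if the columns of one row agree under Python's codecs."""
    char = bytes.fromhex(utf8_hex).decode("utf-8")
    return (
        ord(char) == code_point
        and char.encode("mac_roman") == bytes([mac_byte])
        and bytes([mac_byte]).decode("mac_roman") == char
    )


results = [check_row(*row) for row in ROWS]
print(results)
```

A converter that makes different choices for these bytes would fail
the encode or decode comparison for the affected rows.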

Yes. When you follow the links from
 -> Macintosh encodings
 -> Mac-Roman
you see that various converters implement Mac-Roman differently, especially
around the byte values that you mention.

0xBD = U+2126 OHM SIGN  or  U+03A9 GREEK CAPITAL LETTER OMEGA - how do you
want to decide which is right?

0xDB = U+00A4 CURRENCY SIGN  or  U+20AC EURO SIGN - this is an incompatible
change. Even if Apple did this change.

0xF0 =? U+F8FF is a "private-use" character. It may be APPLE LOGO on Apple
systems, but on Linux systems it's more likely to be used as a Chinese

Ahhh, those comparison pages are indeed useful, thank you, Bruno.

The multiple Unicode code points for 0xBD and 0xDB mean that two
different Unicode strings get translated into the same MACROMAN
string, making the "return trip" ambiguous. I am curious, though:
since libiconv already makes a decisive choice when going from
MACROMAN to UTF8 (instead of rejecting those characters),
wouldn't it make sense for it to make the same choice when going
from UTF8 to MACROMAN?
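
To make the round-trip question concrete, here is a minimal sketch
with hypothetical tables. DECODE, ENCODE, ALIASES and the two
functions are made up for illustration and do not reflect libiconv's
internals; treating U+03A9 and U+20AC as the decode targets is an
assumption taken from the table earlier in the thread.

```python
# Hypothetical decode table: each MacRoman byte maps to exactly one code point.
DECODE = {0xBD: "\u03a9", 0xDB: "\u20ac"}

# Encode table derived from DECODE, so encoding is the exact inverse of decoding.
ENCODE = {char: byte for byte, char in DECODE.items()}

# Other code points that some converters assign to the same bytes.
ALIASES = {"\u2126": 0xBD, "\u00a4": 0xDB}


def encode(text, lenient=False):
    """Encode text; with lenient=True the alias code points are accepted too."""
    table = {**ALIASES, **ENCODE} if lenient else ENCODE
    return bytes(table[ch] for ch in text)  # KeyError for unmapped characters


def decode(data):
    """Decode bytes using the single choice recorded in DECODE."""
    return "".join(DECODE[b] for b in data)


# Strict mode round-trips exactly.
print(decode(encode("\u03a9\u20ac")) == "\u03a9\u20ac")

# Lenient mode accepts OHM SIGN, but the return trip yields OMEGA instead.
print(decode(encode("\u2126", lenient=True)) == "\u2126")
```

In strict mode the conversion is consistent in both directions, at
the cost of rejecting the alias code points. Lenient mode accepts
them but loses the distinction on the way back.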

I am still unclear about the reasoning behind the Apple logo case,
because even on a Linux system (which I am on), the private-use
U+F8FF should still get translated into byte 240 (0xF0),
shouldn't it?
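
For reference, whether a code point is private-use can be checked
against the Unicode character database. This small sketch uses
Python's standard unicodedata module; is_private_use is an
illustrative helper name. It only shows that U+F8FF carries no
standard meaning, not how any particular converter treats it.

```python
import unicodedata

APPLE_LOGO_CODE_POINT = 0xF8FF


def is_private_use(code_point):
    """Return True if the code point has general category Co (private use)."""
    return unicodedata.category(chr(code_point)) == "Co"


print(is_private_use(APPLE_LOGO_CODE_POINT))  # private-use area
print(is_private_use(0x20AC))                 # Euro sign, a currency symbol
```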

I appreciate your further clarification.

