bug-gnu-utils

Re: Emacs, XEmacs, X11(?), "man"(?) i18n/utf-8 brokenness


From: Eli Zaretskii
Subject: Re: Emacs, XEmacs, X11(?), "man"(?) i18n/utf-8 brokenness
Date: Tue, 24 May 2005 22:14:55 +0300

> From: Olaf Klischat <address@hidden>
> Date: Tue, 24 May 2005 13:42:41 +0200
> Cc: 
> 
> http://user.cs.tu-berlin.de/~klischat/emacs-i18n-broken-by-design.png
> 
> I.e. several instances of the German umlaut "ü" in a buffer, some of
> which are found by isearch, while others aren't.

I think these problems are solved in the current CVS, and will go away
completely once a Unicode-based Emacs is released (don't ask me when,
but there's a CVS branch where people actively work on this).

> Looks like a design error to me -- it should
> store buffer contents internally as a sequence of Unicode codepoints,
> not as sequences of bytes + encoding (which is what I presume it
> does atm).

Historically, the multilingual Emacs was based on an encoding other
than Unicode, where Latin-n character sets don't intersect.
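
To illustrate why non-intersecting character sets can make a search miss
a visually identical letter, here is a deliberately simplified sketch in
Python. It is not how Emacs stores text internally; the (charset, byte)
pair model and the helper names decode_as_pairs and decode_as_unicode
are assumptions made up for this example.

```python
# Simplified model (an assumption, not Emacs's real representation):
# a non-ASCII character is tagged with the charset it was decoded from.
def decode_as_pairs(data: bytes, charset: str) -> list[tuple[str, int]]:
    return [("ascii", b) if b < 0x80 else (charset, b) for b in data]


# Alternative model: every character becomes a Unicode code point.
def decode_as_unicode(data: bytes, charset: str) -> list[int]:
    return [ord(ch) for ch in data.decode(charset)]


# The same word, once from a Latin-1 file and once from a Latin-9 file.
latin1_text = "f\u00fcr".encode("iso-8859-1")
latin9_text = "f\u00fcr".encode("iso-8859-15")

# With charset-tagged characters the two u-umlauts differ, so a search
# for one would not match the other.
pairs_equal = (decode_as_pairs(latin1_text, "iso-8859-1")
               == decode_as_pairs(latin9_text, "iso-8859-15"))

# With Unicode code points both decode to U+00FC and compare equal.
unicode_equal = (decode_as_unicode(latin1_text, "iso-8859-1")
                 == decode_as_unicode(latin9_text, "iso-8859-15"))

print(pairs_equal, unicode_equal)
```

In this toy model the byte 0xFC is identical in both files, yet the
tagged representation treats the two letters as different characters,
which matches the symptom described in the report.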

> When running under that locale, the "man" program (or is it nroff, or
> troff, or groff?), for reasons that are beyond me, decides to turn the
> perfectly valid ASCII character 0x27 ("'", U+0027 APOSTROPHE) into the
> UTF-8 sequence 0xe2 0x80 0x99 [1], which, according to
> http://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder,
> is the character U+2019 RIGHT SINGLE QUOTATION MARK (similar things
> happen with the "-" character, and probably others).

I'm guessing that Groff automatically uses the UTF-8 encoding and
passes the -Tutf8 option to the TTY driver.
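
The byte sequence quoted above can be checked without a web decoder. The
following Python sketch decodes it and, as a possible workaround for
text captured from "man", maps typographic punctuation back to ASCII.
The ASCII_FALLBACK table and to_ascii_punctuation are hypothetical names
for this example, and the U+2010 HYPHEN entry is an assumption about
what "-" becomes, since the report does not name that code point.

```python
import unicodedata

# The three bytes reported in the man page output.
utf8_bytes = bytes([0xE2, 0x80, 0x99])
char = utf8_bytes.decode("utf-8")
name = unicodedata.name(char)
print(f"U+{ord(char):04X} {name}")

# Hypothetical cleanup table: code point -> ASCII replacement.
# U+2010 is an assumed mapping for "-"; adjust to what you actually see.
ASCII_FALLBACK = {0x2019: "'", 0x2010: "-"}


def to_ascii_punctuation(text: str) -> str:
    """Replace the typographic characters listed above with ASCII."""
    return text.translate(ASCII_FALLBACK)


print(to_ascii_punctuation("don\u2019t use \u2010Tutf8"))
```

Post-processing like this only treats the symptom; whether the
conversion happens at all depends on the output device the formatter is
asked to use.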

> I don't know who is to blame for all this. Are those automatic
> character conversions mandated by some standard?

Some of them.  You should read the manuals and complain to the
respective maintainers.

> All things considered, it seems that it is still quite impossible (or
> should one say "adventurous"?) to use GNU and Emacs for programming
> tasks under multibyte encodings.

Some of the problems you mention have nothing to do with Emacs.

Anyway, thanks for the reports.



