emacs-devel
Re: character sets as they relate to “Raw” string literals for elisp


From: Eli Zaretskii
Subject: Re: character sets as they relate to “Raw” string literals for elisp
Date: Tue, 05 Oct 2021 15:04:15 +0300

> From: Daniel Brooks <db48x@db48x.net>
> Cc: emacs-devel@gnu.org
> Date: Mon, 04 Oct 2021 13:49:53 -0700
> 
> I see that prolog-mode only gets a few commits per year (9 last year and
> 5 so far this year; the high water mark is 10 in a single year). It
> imposes a pretty minimal support burden and if it has bugs you can
> simply ignore them until a Prolog user brings you a patch, because those
> bugs can only affect Prolog users. There is a lot of code in Emacs which
> fits this description.
> 
> Suppose this hypothetical contribution were a language mode for a
> Japanese programming language, and thus had the same support profile?
> Suppose also that all messages to the user have already been localized
> into English, and that there is an English alias for the mode name (that
> is, `日本-mode' toggles the mode, but there’s an alias like `ja-mode' or
> something), while the rest of the identifiers are in Japanese.
> 
> Would there be any reason to turn away that contribution, or to make the
> contributor rewrite it?

I'm sorry, this is too abstract and theoretical an issue, with many
important details missing.  So I don't think it would be useful to
seriously consider such a hypothetical example.

> >>     (defvar variable-containing-html #r「<a href="foo.html">click here</a>」)
> >
> > If we avoid non-ASCII characters, we avoid some problems, so all else
> > being equal, it's better.
> 
> Hmm. If we (speaking as broadly as possible!) avoid a problem forever,
> how will the problem ever get fixed?

I don't think it needs fixing.

> Personally, I think that the problems are now mostly fixed. Emacs has
> very complete support for character sets, better than virtually all
> other applications. Outside of Emacs, support for Unicode is practically
> omnipresent as well. There are still notable gaps, like the Linux
> console, but they are the exception rather than the rule. I don’t think
> that there is much of a problem left to avoid!

It turns out there are more exceptions than we imagine.  We just now
had another bug report, this time about the Kitty terminal emulator,
which has yet another set of issues with displaying non-ASCII
characters from Emacs.  So much so that I was prompted to add an entry
in etc/PROBLEMS with some workarounds for users of Kitty.  Granted,
their problem is not that they don't support recently added Unicode
characters, it's that they support them "too well".  But still, it
doesn't help when the result is a messed-up display.

> I prefer to say “Linux console” in reference to the one terminal
> emulator that we know has severe problems with Unicode. There are many
> terminal emulators out there, and I’m sure a few of them have problems,
> but for the most part I think all of them can handle Unicode pretty well
> primarily because they all rely on OS libraries to do the heavy
> lifting.

Unicode is not a static target, it's a moving one.  They issue a new
version of the standard twice a year, and each new version adds new
codepoints with new attributes.  If a new version of Unicode adds
double-width characters, and some terminal emulator doesn't keep up,
you will have problems displaying those new codepoints.  (AFAIK,
that's in essence the problem with the Linux console: they last
updated when Unicode 5.0 was released.)
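
To illustrate the point about character widths, here is a minimal
sketch in Python (not Emacs code, and not how any particular terminal
computes widths).  The helper name naive_display_width is hypothetical,
and the rule it applies (two columns for East Asian Wide and Fullwidth
characters, none for combining marks, one for everything else) is a
simplifying assumption.  The answer comes from whichever Unicode tables
the running program was built with, so two programs built against
different versions of the standard can disagree about the same text.

```python
import unicodedata


def naive_display_width(text: str) -> int:
    """Estimate terminal columns from East Asian Width classes alone.

    Illustrative assumption: 'W' (wide) and 'F' (fullwidth) characters
    take two columns, combining marks take none, everything else one.
    """
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the preceding character.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2
        else:
            width += 1
    return width


if __name__ == "__main__":
    # The Unicode version this interpreter's tables were generated from.
    print("Unicode data version:", unicodedata.unidata_version)
    # Two wide characters plus five ASCII characters.
    print(naive_display_width("日本-mode"))
```

If Emacs and the terminal it runs in reach different answers for the
same character, the cursor and the text after that character end up in
different columns, which is one way a display can get messed up.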

So it might be possible to say that many terminals support substantial
portions of Unicode, but it definitely is NOT right to say that we can
freely use any character we want and expect it to work everywhere.


