Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF
From: stephen
Subject: Re: [Emacs-diffs] master db828f6: Don't rely on defaults in decoding UTF-8 encoded Lisp files
Date: Sun, 27 Sep 2015 09:12:51 +0900
>>>>> Paul Eggert writes:
> Eli Zaretskii wrote:
>> So you are, in effect, saying that it is incorrect to derive the
>> default encodings from the locale's codeset?
> Yes, for Emacs developers.
I think this makes sense. IIUC Emacs already uses characters outside
of the Unicode repertoire, so it shouldn't be too hard to replicate
any Emacs capabilities that require non-Unicode characters or charsets
*inside* Emacs by using such characters. Assuming there are any; I
suspect even HELLO doesn't actually need them. There's no "gaiji"
problem of how to tell Emacs what to do with those characters; the
developer who introduces them into Emacs is responsible for adding
them to Emacs's non-Unicode repertoire.
> And come to think of it, for most Emacs users.
I hope not, because that would imply that Emacs users in China, Japan,
probably Korea, and Taiwan are becoming a decreasing rather than
increasing fraction of Emacs users.
> Nowadays in my experience most non-ASCII text files use UTF-8,
> regardless of locale.
Toto, I don't think we're in Kansas any more.
> The old days of having to guess encoding from the locale are
> passing away. This is partly due to UTF-8 being the encoding of
> choice for HTML and XML, where UTF-8 overtook the older 8-bit
> encodings in 2008 and now is by far the dominant encoding.
On the commercial internet, yes, but not for government and academic
sites in Japan and China.
> One way to accommodate the new reality would be to
Recognize that it's probably due to insufficient experience?
> change Emacs so that by default the system locale does not affect
> Emacs's guess of a file's encoding if the file's initial sample is
> valid UTF-8.
"Not affect" is probably a bad idea. Giving UTF-8 too strong a
preference on Windows is risky, because a lot of Windows coding
systems use bytes in the UTF-8 trailing-byte range to represent
characters; it's occasionally possible to run into UTF-8-conforming
files that are intended to be something else. This isn't true for
ISO-8859 coding systems.
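To make the ambiguity concrete, here is a minimal Python sketch (not
Emacs code; the helper name `successful_decodings` is made up for
illustration). It decodes one two-byte sample with several codecs and
shows that the same bytes are at once valid UTF-8, valid CP1252, and
valid CP932 (Windows Shift_JIS), so byte-level validity alone can't
say what the author meant.

```python
def successful_decodings(data, encodings):
    """Return {encoding: text} for each encoding that decodes data cleanly."""
    results = {}
    for name in encodings:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            # This codec rejects the bytes, so it is not a candidate.
            pass
    return results


# 0xC3 0xA9 is the UTF-8 encoding of U+00E9, but both bytes are also
# ordinary single-byte characters in CP1252 and in CP932 (where they
# fall in the half-width katakana range).
sample = b"\xc3\xa9"
candidates = successful_decodings(sample, ["utf-8", "cp1252", "cp932"])

for name, text in candidates.items():
    print(f"{name}: {text!r}")
```

All three codecs accept the sample. A two-byte file is contrived, of
course, and longer files are much less likely to be valid UTF-8 by
accident, but "less likely" is not "never".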
> Users could set a variable to re-enable the old behavior. If we
> did this, we wouldn't have the error-prone process of sprinkling
> 'coding: utf-8' cookies all over the place.
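
For reference, the proposal quoted above could be sketched roughly as
follows. This is Python, not Emacs Lisp, and it is only an
illustration under assumptions: the names `sample_is_valid_utf8`,
`guess_encoding`, and `prefer_utf8` are hypothetical and do not
correspond to anything in Emacs. The flag stands in for the "variable
to re-enable the old behavior".

```python
import codecs


def sample_is_valid_utf8(sample):
    """Return True if sample is valid UTF-8, tolerating a cut-off tail.

    An incremental decoder with final=False buffers a multibyte
    sequence truncated at the end of the sample instead of rejecting it.
    """
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def guess_encoding(sample, locale_encoding, prefer_utf8=True):
    """Guess a file's encoding from its initial sample (hypothetical).

    With prefer_utf8 set, a sample that is valid UTF-8 wins regardless
    of the locale; otherwise the locale's codeset is used, as before.
    """
    if prefer_utf8 and sample_is_valid_utf8(sample):
        return "utf-8"
    return locale_encoding


print(guess_encoding("caf\u00e9".encode("utf-8"), "cp932"))
print(guess_encoding(b"\x82\xa0\x82\xa2", "cp932"))
print(guess_encoding("caf\u00e9".encode("utf-8"), "cp932", prefer_utf8=False))
```

Note that this sketch implements exactly the "does not affect" rule
objected to above: any sample that happens to be valid UTF-8, pure
ASCII included, is reported as UTF-8 no matter what the locale says.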