emacs-devel

Re: Encoding of etc/HELLO


From: Eli Zaretskii
Subject: Re: Encoding of etc/HELLO
Date: Sat, 19 May 2018 21:03:24 +0300

> Cc: address@hidden
> From: Paul Eggert <address@hidden>
> Date: Sat, 19 May 2018 10:17:33 -0700
> 
> In looking at the new etc/HELLO, I see many uses of <x-charset><param>
> that seem to be unnecessary when Emacs is viewing the file. For example,
> the first few uses are:
> 
> <x-charset><param>latin-iso8859-1</param>¡Hola!, Grüß Gott, Hyvää päivää, Tere õhtust, Bon</x-charset><x-charset><param>latin-iso8859-3</param>ġu
>            Cze</x-charset><x-charset><param>latin-iso8859-2</param>ść!, Dobrý den, </x-charset>

Which parts seem unnecessary in this snippet?  And why?

> Can't the abovementioned formatting commands be removed without affecting
> what any Emacs user sees, because the corresponding character sets are not
> unified in Unicode?

What do you mean by "unified" here?  In modern Emacs, we don't need to
unify the charsets, because they no longer determine the codepoints.
The 'charset' property just tells Emacs to which "culture", so to
speak, or, if you prefer, to which language the greeting belongs.  It
has only one purpose: selecting an appropriate font to display that
greeting.  (In the future we might use it for other
language-dependent features.)

> Would it be OK to simplify /etc/HELLO to remove unnecessary formatting
> commands, and to keep only the formatting commands that are plausibly
> needed in a Unicode text file? And if so, what heuristic should be used to
> remove the unnecessary formatting commands?
> 
> I assume that the formatting commands were done automatically, so perhaps
> I'm talking about potential changes to lisp/textmodes/enriched.el.

Yes, the annotations were produced automatically by enriched.el, but
they simply follow what was already there in the original HELLO.  You
can see that by visiting HELLO on the emacs-26 branch, and then
invoking "M-x describe-text-properties" at various places in the file.
You will see that the annotations start and end where the 'charset'
properties started and ended in the ISO-2022 encoded file.
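
To inspect the same run boundaries outside Emacs, something like the
following rough Python sketch could list the runs in the annotated
file.  It is not part of Emacs or enriched.el: the charset_runs helper
is hypothetical, the sample string is abbreviated from the snippet
quoted above, and the regexp assumes x-charset annotations are never
nested.

```python
import re

# One non-nested <x-charset><param>NAME</param>TEXT</x-charset> run.
# Assumption: x-charset annotations do not nest inside each other.
X_CHARSET = re.compile(
    r"<x-charset><param>(?P<charset>[^<]*)</param>(?P<text>.*?)</x-charset>",
    re.DOTALL,
)


def charset_runs(enriched):
    """Return (charset, text) pairs for each x-charset annotation, in order."""
    return [(m.group("charset"), m.group("text"))
            for m in X_CHARSET.finditer(enriched)]


# Abbreviated from the snippet quoted earlier in this message.
sample = (
    "<x-charset><param>latin-iso8859-1</param>¡Hola!, Bon</x-charset>"
    "<x-charset><param>latin-iso8859-3</param>ġu</x-charset>"
    "<x-charset><param>latin-iso8859-2</param>ść!, Dobrý den, </x-charset>"
)

for charset, text in charset_runs(sample):
    print(f"{charset}: {text!r}")
```

Each pair it reports corresponds to one stretch of text that carried a
single 'charset' property value.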

We could, of course, place the 'charset' properties only on the
greetings and the language names, leaving the rest of the text without
any 'charset' properties.  If that's what you mean, then I'm okay with
doing that; one could use the new facemenu command I added for that
purpose.


