bug-standards

Re: [gnu.org #1363250] ASCII maintain.txt is no longer ASCII


From: Alfred M. Szmidt
Subject: Re: [gnu.org #1363250] ASCII maintain.txt is no longer ASCII
Date: Tue, 26 Feb 2019 12:06:56 -0500

   > I have noticed that maintain.txt and maintain.info[1] are no longer in 
   > ASCII, but in UTF-8. In particular they contain lots of easily avoidable 
   > UTF-8 quoting characters (single and double quotes) that break 
   > displaying them in non-UTF-8 terminals. This is a pity because the main 
   > use of such simple formats is to be displayed in simple terminals.

I'm not sure which definition of "ASCII" you have in mind here.  Are
you talking about "printable" characters?  In that case, the Info
format has always contained non-printable characters, most notably
#o37 for section splitting, the "#o0 #o10 [" sequence for images, etc.
So these files have never been very readable on "simple text
terminals" (what do you mean by that, more exactly?  A VT100?  A dumb
terminal?).

For the text files, I think it still makes more sense to use UTF-8.
The default locale on GNU/Linux these days is a UTF-8 one, and many of
the command-line tools will output UTF-8 quotation marks when that is
the case.

Could you run the files through iconv to convert them from UTF-8 to
ASCII?  Maybe something like:

        iconv -f UTF-8 -t ASCII file...
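
One caveat with a plain "-t ASCII" target: iconv normally stops with an
error at the first character that has no ASCII equivalent, such as a
curly quote.  The sketch below assumes an iconv that supports the
"//TRANSLIT" suffix (an extension found in GNU libc and GNU libiconv,
not something POSIX requires), which asks for an approximation instead.
The function name to_ascii is made up for illustration, and the exact
replacement chosen for each character depends on the iconv
implementation and the current locale.

```shell
# to_ascii FILE
# Write an ASCII approximation of a UTF-8 encoded FILE to stdout.
# Assumes an iconv that understands the //TRANSLIT suffix; characters
# that cannot be approximated are typically replaced with "?".
to_ascii() {
    iconv -f UTF-8 -t ASCII//TRANSLIT "$1"
}

# Example use (file names are placeholders):
#   to_ascii maintain.txt > maintain.ascii.txt
```

With GNU libc, curly quotes usually come out as plain ' and ", while an
accented letter may come out as the bare letter or as "?", depending on
the locale.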

   > Given that there is just one letter out of the ASCII range in 
   > maintain.{txt,info} (the 'é' in 'risqué'), could it be possible to keep 
   > these files as pure ASCII? Thanks.

990 matches in 490 lines for "[^[:ascii:]]" in buffer: maintain.txt
988 matches in 489 lines for "[^[:ascii:]]" in buffer: maintain.info

These are mostly quotation marks, but there are also bullets,
copyright signs, and em-dashes.
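
The counts above look like the output of an Emacs search.  For anyone
who wants a comparable number without Emacs, here is a minimal sketch
that assumes the file is valid UTF-8.  Every non-ASCII character in
UTF-8 starts with exactly one lead byte in the range 0xC0-0xFF, so
deleting all ASCII bytes and all continuation bytes (0x80-0xBF) and
counting what remains gives the number of non-ASCII characters.  The
function name count_non_ascii is made up for illustration.

```shell
# count_non_ascii FILE
# Print the number of non-ASCII characters in a UTF-8 encoded FILE.
# LC_ALL=C makes tr work on raw bytes.  Octal 000-277 covers ASCII
# (000-177) and UTF-8 continuation bytes (200-277); only the lead
# bytes of multibyte characters survive the deletion.
count_non_ascii() {
    LC_ALL=C tr -d '\000-\277' < "$1" | wc -c | tr -d ' '
}

# Example use (file name is a placeholder):
#   count_non_ascii maintain.txt
```

This counts characters, not lines, so it corresponds to the "matches"
figure rather than the "lines" figure.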


