bug-standards

Re: [gnu.org #1363250] ASCII maintain.txt is no longer ASCII


From: Alfred M. Szmidt
Subject: Re: [gnu.org #1363250] ASCII maintain.txt is no longer ASCII
Date: Sat, 02 Mar 2019 03:44:51 -0500

   >     ASCII is the common subset of both UTF-8 and most single-byte charsets,
   >     specially ISO-8859-X.
   >
   > Can you define what you mean with "pure ASCII"?

   As you can see, I already defined ASCII in the message you are 
   answering. ASCII (the common subset of both UTF-8 and ISO-8859-X) are 
   the characters 0x00 to 0x7F. These are the only single-byte characters 
   in UTF-8. Characters >= 0x80 are multibyte in UTF-8 and are shown as two 
   or more random characters in 8-bit terminals.

I asked what you defined as "pure" and "unpure ASCII".  What is
"unpure ASCII"?  I gather that by "pure ASCII" you mean exactly 7-bit
ASCII, is that correct?
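
The byte-level claim in the quoted text can be checked with a short
Python sketch. It is only an illustration: the helper name
utf8_on_8bit and the sample characters are assumptions made here, not
something taken from maintain.txt.

```python
def utf8_on_8bit(ch, charset="iso-8859-15"):
    """Return the UTF-8 bytes of ch and how an 8-bit charset would read them."""
    raw = ch.encode("utf-8")
    # A single-byte terminal interprets each byte as one character.
    return raw, raw.decode(charset)


# Hypothetical samples: an ASCII letter, a curly quote, an accented letter.
for ch in ("A", "\u2019", "\u00e9"):
    raw, shown = utf8_on_8bit(ch)
    print(f"U+{ord(ch):04X}: {len(raw)} byte(s) {raw!r} -> {shown!r}")
```

Characters below 0x80 stay one byte and survive unchanged; anything
above that becomes two or more bytes, each of which an 8-bit terminal
treats as a separate character.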

   > Initially you mentioned issues viewing things on terminals.  That has
   > little to do with not using the eighth bit in a byte, since about one
   > third of the ASCII table is not viewable on dumb terminals.

   I never mentioned dumb terminals, but single-byte (8-bit) terminals, 
   like those showing ISO-8859-X. 

I took simple to mean dumb. What is a simple terminal? Is it something
like a VT100? Those cannot display ISO-8859-X or UTF-8 properly, but
they handle those just fine.

   Let's see four examples. (TL,DR, multibyte UTF-8 characters look like 
   shit on 8-bit terminals).

   This is how maintain.txt should look:

Mike Gerwitz mentions that it is intentional that the GCS etc. should
be using UTF-8, not 7-bit ASCII.  So it definitely should not use 7-bit
ASCII quotes in that setting.

   This is how maintain.txt looks on my ISO-8859-15 terminal viewed with ed:

What is an ISO-8859-15 terminal? Is it compatible with VT100?  But if
you are using the wrong locale, of course the results will be strange.

Did you try running iconv? Did it work?
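
As a rough idea of what such a conversion involves, here is a minimal
Python sketch. It is not how iconv works internally; the mapping table
QUOTE_MAP and the function to_ascii are hypothetical names chosen for
this example, and only the four common curly quotes are handled.

```python
# Map the common curly quotes to their 7-bit ASCII counterparts.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",   # left single quotation mark
    "\u2019": "'",   # right single quotation mark
    "\u201c": '"',   # left double quotation mark
    "\u201d": '"',   # right double quotation mark
})


def to_ascii(text):
    """Replace curly quotes, then encode; anything else non-ASCII becomes '?'."""
    return text.translate(QUOTE_MAP).encode("ascii", errors="replace")


print(to_ascii("See \u2018maintain.info\u2019 for details."))
```

Anything outside the table is replaced with a question mark, so a
conversion like this loses information for text that uses more than
quotes.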

   >     This is specially important for 'maintain.info' because it can't be
   >     converted (the tag table becomes incorrect). Any user of a single-byte
   >     terminal will need to rebuild 'maintain.info' from source (as I need to
   >     do to see it in one of my machines).
   >
   > Why should it be converted? Info files are meant for an Info viewer,
   > it would be the task of the Info viewer to adjust its locale.
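
The quoted point about the tag table can be illustrated with a small
Python sketch, assuming the tag table records positions as byte
offsets into the file. The sample text and the helper byte_offset are
made up for the example and are not taken from maintain.info.

```python
def byte_offset(text, marker, encoding):
    """Byte position of marker once text is stored in the given encoding."""
    return text.encode(encoding).index(marker.encode(encoding))


# Hypothetical file content: curly quotes appear before a node header.
utf8_text = "See \u2018maintain\u2019.\nNode: Top"
ascii_text = utf8_text.replace("\u2018", "'").replace("\u2019", "'")

before = byte_offset(utf8_text, "Node: Top", "utf-8")
after = byte_offset(ascii_text, "Node: Top", "ascii")
print(before, after, before - after)
```

Each three-byte quote that shrinks to one byte moves everything after
it by two bytes, so offsets recorded before the conversion no longer
point at the right place.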

   I know of no info viewer able to convert every UTF-8 character to a 
   printable character in ASCII or ISO-8859-X. The info viewer in this 
   machine (info (GNU texinfo) 4.13+) is not even able to convert the 
   three-byte UTF-8 quotes present in maintain.info to ASCII.

That is not what I wrote, though.  I did not say that Info should
convert anything, only adjust its locale.  There is enough information
in the Info file to deduce the encoding (i.e. coding: utf-8 at the
bottom).
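
A minimal Python sketch of that deduction follows, assuming the
declaration sits near the end of the file in the form "coding: NAME".
The function name declared_coding, the tail size, and the sample bytes
are assumptions for illustration, not the Info reader's actual code.

```python
import re
from typing import Optional


def declared_coding(data: bytes, tail_size: int = 1024) -> Optional[str]:
    """Look for a 'coding: NAME' declaration in the last tail_size bytes."""
    tail = data[-tail_size:]
    match = re.search(rb"coding:\s*([A-Za-z0-9_.-]+)", tail)
    return match.group(1).decode("ascii") if match else None


# Hypothetical file ending with a local-variables style declaration.
sample = (
    "Quoted \u2018text\u2019.\n\x1f\nLocal Variables:\ncoding: utf-8\nEnd:\n"
).encode("utf-8")

coding = declared_coding(sample)
# Fall back to ASCII only when nothing is declared.
text = sample.decode(coding or "ascii")
print(coding)
```

With the encoding known, a viewer can decode the file correctly and
then decide how to present it in the user's locale.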

   So, please, could maintain.txt and maintain.info be coded again in ASCII 
   (as advertised) for maximum compatibility with UTF-8 and 8-bit 
   terminals? Thanks.

Let's not make hasty random changes, especially when this one was very
explicit.  Let's take the time to understand the problem first, and
then find a solution.



