bug-standards

Re: [gnu.org #1363250] ASCII maintain.txt is no longer ASCII


From: Alfred M. Szmidt
Subject: Re: [gnu.org #1363250] ASCII maintain.txt is no longer ASCII
Date: Fri, 01 Mar 2019 16:56:53 -0500

   > * ASCII is a well defined standard, and all ASCII is UTF-8 (but the
   >   converse is not true).
   >
   > *  The command iconv -f UTF-8 -t ASCII file will fail unless all the
   >    characters in file are already ASCII.  Hence it isn't a very useful
   >    command.
   >
   > * The coding standards say that we should prefer ASCII wherever
   >   possible.  If it is not possible, then we should use UTF-8.
   >
   > I think that Therese is saying that there are some files which are using
   > UTF-8 when ASCII would have sufficed.

   Thanks. This is exactly what I meant. (Thérèse forwarded my question 
   from webmasters to this list).

   ASCII is the common subset of both UTF-8 and most single-byte charsets,
   especially ISO-8859-X. Encoding 'maintain.txt' and 'maintain.info' in
   pure ASCII (as was done until some time ago) makes them backwards
   compatible with almost all machines and OSs at no cost.

Can you define what you mean by "pure ASCII"?  What is "impure
ASCII"?  Are you referring to ASCII (which is 7 bits) represented in
an 8-bit byte, with the high bit set being impure and the high bit
clear being pure?  I.e., bytes >= #o200 and < #o400 being "impure"
chars, and bytes < #o200 being "pure"?
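
To make that reading of "pure" concrete, here is a minimal sketch in
Python, assuming "impure" simply means any byte with the high bit set.
The function name non_ascii_bytes is made up for illustration and is
not part of any GNU tool.

```python
def non_ascii_bytes(data: bytes) -> list[tuple[int, int]]:
    """Return (offset, byte value) for every byte with the high bit set.

    Assumption: a file is "pure ASCII" exactly when this list is empty,
    i.e. every byte is below 0o200.
    """
    return [(i, b) for i, b in enumerate(data) if b >= 0o200]


# Plain ASCII text has no high-bit bytes.
pure = non_ascii_bytes(b"maintain.txt")

# "Th\u00e9r\u00e8se" encoded as UTF-8 contains two 2-byte sequences.
impure = non_ascii_bytes("Th\u00e9r\u00e8se".encode("utf-8"))

print(pure)
print(impure)
```

Running the same check over the bytes of maintain.txt would list
exactly which offsets are not ASCII.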


Initially you mentioned issues viewing things on terminals.  That has
little to do with not using the eighth bit in a byte, since about one
third of the ASCII table is not viewable on dumb terminals.  And many
such combinations are in the Info format, and have been since before
Info was used by the GNU project.  Can you clarify, with examples,
what exactly you are having an issue with?

Since ASCII is a subset of UTF-8, and UTF-8 text is by definition a
sequence of 8-bit bytes, it is compatible with anything that
represents a character as an 8-bit byte.  I cannot see what system it
wouldn't work on, so a better description of where it doesn't work
would be helpful to understand the issue.  Can you clarify what issues
you are seeing, on what systems, and what architectures?

   This is especially important for 'maintain.info' because it can't be
   converted (the tag table becomes incorrect). Any user of a single-byte
   terminal will need to rebuild 'maintain.info' from source (as I need to
   do to see it on one of my machines).
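
The tag-table remark can be illustrated with a small sketch in Python,
assuming the tag table records byte offsets of nodes.  The sample text
below is invented and is not real Info file content.  Re-encoding from
UTF-8 to a single-byte charset shrinks each two-byte character to one
byte, so every offset after it shifts.

```python
# Invented sample text; the escapes spell "Th\u00e9r\u00e8se" with two
# accented letters, each taking 2 bytes in UTF-8 and 1 in Latin-1.
text = "Node: Top\nMaintained by Th\u00e9r\u00e8se.\nNode: Next\n"

utf8 = text.encode("utf-8")
latin1 = text.encode("latin-1")

marker = b"Node: Next"
utf8_offset = utf8.find(marker)      # byte offset in the UTF-8 file
latin1_offset = latin1.find(marker)  # byte offset after conversion

# The offsets differ by the number of bytes saved by the conversion,
# so an offset recorded for the UTF-8 file no longer points at the node.
print(utf8_offset, latin1_offset, utf8_offset - latin1_offset)
```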

Why should it be converted?  Info files are meant for an Info viewer;
it would be the task of the Info viewer to adjust to its locale.  I
don't know what a single-byte terminal is, but anything that
represents a character as an 8-bit byte will handle UTF-8 just fine.
This includes ancient VT100s, so hearing where you have issues with
UTF-8 would be helpful.

/Alfred




