Re: [groff] [patch] modernize -T ascii rendering of opening single quote


From: Ingo Schwarze
Subject: Re: [groff] [patch] modernize -T ascii rendering of opening single quote
Date: Sat, 2 Feb 2019 14:57:59 +0100
User-agent: Mutt/1.8.0 (2017-02-23)

Hi,

Jeff Conrad wrote on Sat, Feb 02, 2019 at 05:46:59AM +0000:
> On Friday, February 1, 2019 3:31 PM, Ingo Schwarze wrote:

>> And the correct way to mark up a single-quoted string in low-level
>> roff(7) is \(oq...\(cq, with the rendering decided by the output
>> device.

> I think this gets to the essence of the matter.  The character table
> for -Tascii should recognize that the ASCII character set doesn't have
> opening or closing single quotes, and accordingly maps both to \(aq.
> In a sense, this is a glyph diddle, but it's one that, at least in my
> experience (and I go back to the actual typewriter), has been
> universally established practice.  The same cannot be said for mapping
> \(oq to \(ga, which strikes me as akin to treating O and 0 and
> l and 1 as interchangeable.
> 
> I think "modernise" is a misnomer here, because I suggest that the
> existing mapping isn't archaic; rather, it's always been wrong.

While this is an interesting argument that adds a new aspect to the
discussion and opens up a new way to look at the conflict (see below),
i fear the statement above, as it stands, is incorrect.

A short and intriguing overview of the early history of ASCII 0x60
is given in: http://jkorpela.fi/latin1/ascii-hist.html#60

The first version of US-ASCII, ASA X3.4-1963,
had character position 0x60 "unassigned":
http://worldpowersystems.com/ARCHIVE/codes/X3.4-1963/page5.JPG

In the second version of US-ASCII, ASA X3.4-1965,
character position 0x60 was "@"
https://web.archive.org/web/20100116001012/http://homepages.cwi.nl/~dik/english/codes/stand.html#ascii

The third version of US-ASCII, USAS X3.4-1967,
seems to be the first having ` at 0x60 (same source as for -1965).

The latest US-ASCII standard, ANSI INCITS 4-1986 (R2007),

  http://sliderule.mraiow.com/w/images/7/73/ASCII.pdf

says on page 16:

  0x60 LEFT SINGLE QUOTATION MARK, GRAVE ACCENT

with this footnote:

  These characters should not be used in international interchange
  without determining that there is agreement between sender and
  recipient (see Section B5 in Appendix B).

which appears to go back to at least RFC 20 (yes, *twenty*), 1969:

  http://art.tools.ietf.org/html/rfc20  (page 5)


That said, nowadays, US-ASCII arguably remains most relevant because
it was chosen as a basis for Unicode.

While there are cases where Unicode defines characters as ambiguous,
consider for example U+002D HYPHEN-MINUS, U+0060 is not defined as
ambiguous:

  https://www.unicode.org/Public/11.0.0/charts/CodeCharts.pdf

is very clear that U+0060 is a grave accent and *not* an opening
single quote.
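
For anyone who wants to check the character names without opening the
code charts, here is a minimal sketch using Python's standard
unicodedata module.  The variable name "names" is purely illustrative,
and the output reflects whatever Unicode version the local Python
ships with, not necessarily 11.0.0.

```python
import unicodedata

# Code points discussed in this thread, plus the dedicated
# typographic quotes for comparison.
code_points = ["\u002d", "\u0027", "\u0060", "\u2018", "\u2019"]

# Map each character to its official Unicode character name.
names = {ch: unicodedata.name(ch) for ch in code_points}

for ch, name in names.items():
    print(f"U+{ord(ch):04X}  {name}")
```

U+0060 is reported as GRAVE ACCENT, whereas the name LEFT SINGLE
QUOTATION MARK belongs to U+2018.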

While that of course cannot retroactively change what ASCII used
to define in the 1960s to 1980s, i do think an argument can be
made that there is value in discontinuing usage of ASCII that
conflicts with Unicode before we enter the third decade of the new
millennium.

[...]
> Is some history lost with the proposed change?  Sure.  But is history
> the overarching consideration?  I suggest that it preferably should be
> getting the best typography possible with the ASCII character set.

That goal doesn't appear to bring us much closer to a decision:
While many fonts today clearly represent U+0060 as a grave accent,
some traditional fonts continue to support the usage of ASCII 0x60
as an opening single quote, and some members of this list clearly
stated that they like such fonts.  So for them, rendering \(oq
as ASCII 0x60 actually results in *better* typography than rendering
it as ASCII 0x27 APOSTROPHE-QUOTE.
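
To make the two candidate renderings concrete, here is a minimal
sketch in Python.  The tables HISTORICAL and PROPOSED and the function
render() are hypothetical illustrations of the mappings under
discussion; they are not groff's device description and not how groff
resolves special characters internally.

```python
# Hypothetical mapping tables for illustration only; these are not
# groff's actual -T ascii device files.
HISTORICAL = {r"\(oq": "`", r"\(cq": "'", r"\(aq": "'", r"\(ga": "`"}
PROPOSED = {r"\(oq": "'", r"\(cq": "'", r"\(aq": "'", r"\(ga": "`"}


def render(text, table):
    """Replace each roff special-character escape with its ASCII glyph."""
    for escape, glyph in table.items():
        text = text.replace(escape, glyph)
    return text


sample = r"\(oqquoted\(cq"
print(render(sample, HISTORICAL))  # `quoted'
print(render(sample, PROPOSED))    # 'quoted'
```

Both tables leave \(ga and \(aq alone; the only difference is whether
\(oq comes out as ASCII 0x60 or as ASCII 0x27.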

However, i think that even Doug, Werner, and Ralph will have to admit
that from a typographical standpoint, such use of the ASCII 0x60 output
glyph is highly non-portable nowadays, and according to RFC 20, it
already used to be non-portable (and discouraged for international
use) in 1969.

So i think it is fair to make my wording

  "modernise"

more precise as follows:

  "Stamp out US-specific, internationally non-portable usage of ASCII 
   that is incompatible with Unicode, because nowadays, using ASCII
   in a way that is compatible with Unicode is more important than
   preserving historical -T ascii rendering practice and more
   important than rendering historical documents unchanged that
   incorrectly encode ASCII 0x60 (for example for use in m4)
   as \(oq rather than as \(ga."

Is that something we can agree on?

Yours,
  Ingo


