
Re: [Groff] mom : unicode in .INCLUDE'd files


From: Ralph Corderoy
Subject: Re: [Groff] mom : unicode in .INCLUDE'd files
Date: Sun, 23 Jul 2017 12:38:16 +0100

Hi Ted,

> Mike Bianchi wrote:
> > The thing I _like_ about the *nix OSs is they don't demand I
> > upconvert just because a "better way" comes along.

Thompson and Pike knew that when designing UTF-8;  it's a superset of
ASCII and ensures a zero byte only means NUL so C strings continue as
before.
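
Both properties are easy to check for oneself.  What follows is only a
rough sketch in Python, not anything from groff;  the function names are
made up for illustration.

```python
def utf8_is_ascii_superset() -> bool:
    """True if every 7-bit ASCII code point encodes to the same single byte."""
    return all(chr(cp).encode("utf-8") == bytes([cp]) for cp in range(128))


def has_stray_zero_byte(text: str) -> bool:
    """True if any character other than NUL yields a zero byte in UTF-8."""
    return any(0 in ch.encode("utf-8") for ch in text if ch != "\0")


# Hypothetical sample text mixing ASCII with multi-byte characters.
sample = "na\u00efve \u20ac troff"

ascii_ok = utf8_is_ascii_superset()
zero_free = not has_stray_zero_byte(sample)

# For contrast, UTF-16 puts zero bytes inside ordinary ASCII letters,
# which is what breaks NUL-terminated C strings.
utf16_a = "A".encode("utf-16-le")
```

Here `ascii_ok` and `zero_free` should both be true, while `utf16_a`
contains a zero byte after the `A`.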

> I completely agree with Mike! Of course it would be a good thing to
> *extend* groff's capabilities so that it can cope (optionally) with
> recent developments, but in my view it *must* keep its original
> capabilities, and those that have evolved since (say) the 1980s (which
> is where many of my own troff source files date back to).

Isn't it groff's evolution that's the problem here?  Bell Labs troff
took ASCII, i.e. 7-bit.  groff added ISO-8859-1 support, another ASCII
superset, that was still one byte per rune but used 96 of the
top-bit-set bytes for more runes.  UTF-8 comes along and groff can't
adopt it because it's already taken an incompatible fork.  IIRC Bell
Labs plough on with Plan 9's troff taking UTF-8.
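
To make the incompatibility concrete, here is a small Python sketch, using
e-acute as an arbitrary example character.  It shows only the byte-level
clash between the two encodings;  it says nothing about how groff itself
reads its input.

```python
ch = "\u00e9"  # e-acute, chosen arbitrarily as an example

latin1_bytes = ch.encode("iso-8859-1")  # one top-bit-set byte
utf8_bytes = ch.encode("utf-8")         # two top-bit-set bytes

# ISO-8859-1 places its extra graphic characters at 0xA0..0xFF.
latin1_extra = len(range(0xA0, 0x100))

# A lone ISO-8859-1 byte is not valid UTF-8.
try:
    latin1_bytes.decode("utf-8")
    latin1_is_valid_utf8 = True
except UnicodeDecodeError:
    latin1_is_valid_utf8 = False

# UTF-8 bytes read as ISO-8859-1 decode without error, but to the
# wrong characters: two runes where one was meant.
misread = utf8_bytes.decode("iso-8859-1")
```

So the same top-bit-set bytes mean different things under each encoding,
and a reader has to be told, or guess, which one it has been given.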

How many of our old documents are ISO-8859-1 instead of ASCII?  Could we
wind back the clock and make UTF-8, and thus ASCII, groff's input, with
ISO-8859-1 being the runt input character set that needs options and
hoops to jump through?
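
One rough way to answer the "how many" question for a pile of old sources
is to sort files by what their bytes allow.  The sketch below is an
assumption-laden heuristic in Python, with a made-up function name:  bytes
that fail to decode as UTF-8 are only *possibly* ISO-8859-1, since any
8-bit encoding would look the same.

```python
def classify(data: bytes) -> str:
    """Guess the input encoding of a document from its raw bytes."""
    if all(b < 0x80 for b in data):
        # Pure 7-bit input is valid ASCII, ISO-8859-1 and UTF-8 alike.
        return "ascii"
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        # Top-bit-set bytes that are not well-formed UTF-8.
        return "8-bit, possibly iso-8859-1"
    return "utf-8"


# Hypothetical sample inputs standing in for old troff sources.
plain = b".PP\nHello, world.\n"
as_utf8 = ".PP\ncaf\u00e9\n".encode("utf-8")
as_latin1 = ".PP\ncaf\u00e9\n".encode("iso-8859-1")

results = [classify(plain), classify(as_utf8), classify(as_latin1)]
```

Files in the first group would be unaffected by a change of default;  only
the last group would need the options and hoops.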

-- 
Cheers, Ralph.
https://plus.google.com/+RalphCorderoy


