Re: the Courier font family and nroff history


From: G. Branden Robinson
Subject: Re: the Courier font family and nroff history
Date: Fri, 22 Mar 2024 22:12:05 -0500

At 2024-03-22T17:06:40-0700, Russ Allbery wrote:
> "G. Branden Robinson" <g.branden.robinson@gmail.com> writes:
> 
> > That's a good argument against grotty(1) emitting overstriking
> > sequences, at least by default, and yet the people swiftest to
> > anger on this subject argue _for_ it.
> 
> I'm not fully following this argument, but (assuming I've not
> completely lost the train of conversation), it may be relevant here
> that some years ago (it was in 2000, which surely was only five or six
> years ago) a contributor went to the trouble of writing
> Pod::Text::Overstrike to format POD output with backspacing with
> overstrike or underscores.  At the time, a version that used termcap
> already existed (and still does).

You prompted me to take a look around at podlators Git.  I didn't have
any idea this existed.  Neat!

> The stated reason was that the output was device-independent, unlike
> output that embeds formatting codes derived from device-specific
> termcap entries,

Okay...by this time groff had for about 10 years been producing
device-independent _terminal_ output from troff(1).  On the other hand,
that is its own peculiar little language.  Maybe the author just didn't
want to deal with *roff, or didn't want to count on GNU troff being
available.  (Kernighan didn't completely unify terminals under the
device-independent troff scheme presented in CSTR #54--nevertheless its
"driving tables" for terminal devices bore a startling resemblance to
"DESC" files for typesetting devices.)

> and they really liked the bold and underlining rather than the plain
> text or *ad hoc* markup produced by Pod::Text.

Part of me wants to yell "then why not just use nroff, for crying out
loud", but part of me understands the fun of finding one's own way.

> I know that to a first approximation all the world is now some
> variation of an imaginary VT100 terminal emulator, and thus one can
> usually blindly use SGR escape sequences and expect them to work in
> much the way that one can assume all programs only run on VMS.

I think that's a little unfair.  We can trace the history of these
escape sequences back to ANSI X3.64, which was later succeeded by
ECMA-48 and (equivalently, as far as I know) ISO 6429 and JIS X 0211.
These standards have been around approximately as long as Unix has been
something you were likely to run into at your university or workplace.

I would never advocate _blind_ usage of SGR or other ECMA-48 escape
sequences.  For SGR in particular, even termcap has a capability code:
"sa".  Programs, including GNU Bash and those in GNU Coreutils,
_should_, in my opinion, be using termcap or (preferably) terminfo to
look before they leap.  But the cryptic form of ECMA-48 escape sequences
has proven seductive to junior hackers (in mentality, if not always
chronological age) far and wide.  As soon as they can make the terminal
jump with "printf '\e[xx;yy\a'" they get completely carried away.

Often, the next day, the same person will, in a code review, confidently
and with no sense of irony, accuse your work of a "layering violation".

Really, there isn't a hand large enough to slap these people with.

But that's not the fault of ECMA-48, which has even had the virtue of
being freely available on the Web for many years.  We cannot say as much
for many ANSI or ISO/IEC standards.
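
To make "look before they leap" concrete, here is a minimal sketch in
Python, assuming its standard curses module and a terminfo database that
has an entry for the named terminal type.  The helper name sgr_bold is
hypothetical.  It asks terminfo for the "bold" and "sgr0" capabilities
(the general form is "sgr", the terminfo counterpart of termcap's "sa")
and falls back to emitting nothing when the terminal is unknown, rather
than hard-coding an escape sequence.

```python
import curses
import os


def sgr_bold(term="xterm"):
    """Return (bold_on, reset) as bytes from terminfo, or (b"", b"").

    Empty strings mean "no markup available": the caller then prints
    plain text instead of guessing at an escape sequence.
    """
    try:
        # setupterm() wants a file descriptor; we only query strings,
        # so the null device is enough and works without a real tty.
        fd = os.open(os.devnull, os.O_WRONLY)
        try:
            curses.setupterm(term, fd)
        finally:
            os.close(fd)
        bold = curses.tigetstr("bold") or b""
        reset = curses.tigetstr("sgr0") or b""
    except curses.error:
        # Unknown terminal type or no terminfo database.
        return b"", b""
    return bold, reset
```

A caller might write "bold + b'NAME' + reset" to its output and get
plain "NAME" on a terminal type that terminfo does not describe.  Note
that Python's curses module initializes terminfo only once per process,
so the first successful call determines the terminal type consulted.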

> But I have occasionally had reports that Pod::Text::Overstrike is a
> better option for (some) Windows users because apparently their pager
> handled the overstriking but termcap (via the Perl Term::Cap module)
> wasn't available.

I'm no MS-DOS/Windows expert, but my understanding is that you couldn't
count on support for ECMA-48 at the DOS prompt (or equivalently in
CMD.EXE on NT-descended Windows) because the console driver didn't
recognize them.  However, if the user told CONFIG.SYS to load ANSI.SYS,
it would, because that module interposed itself before the BIOS call
that talked to the display, and interpreted them, driving the
CGA/EGA/VGA hardware appropriately.

God, I feel dirty talking about this crap.  I'm sorry I remember even
that much.

I have gathered, by reading bug fora and similar while trawling the
Internet for accounts of trouble with groff that people are too lazy to
actually report to us, that Windows 10 or 11 has a console
driver/terminal emulator that does "better" with ECMA-48 support.  I
haven't heard even a rumor of anything usefully quantitative, like a
table of its support for standardized escape sequences in comparison
with, say, xterm, or even the Linux kernel's somewhat wobbly virtual
console device.  But, supposedly, things are "better".

> I have no idea how dated this information is, having not used Windows
> myself in several decades, but I always found it interesting.  I've
> kept the module working all these years since it's not much additional
> effort.

No crime in that.  I keep a lot of ancient groff stuff in service too.

At 2024-03-22T21:08:39-0500, Dan Plassche wrote:
> Overstrikes are more easily filtered and transformed for other output
> formats than levels of nested escape codes that are terminal specific.

...yes, except when they're inherently ambiguous.

grotty(1):
       ... grotty overstrikes, representing a bold character c with
       the sequence “c BACKSPACE c”, an italic character c with the
       sequence “_ BACKSPACE c”, and bold italics with “_ BACKSPACE c
       BACKSPACE c”.  This rendering is inherently ambiguous when the
       character c is itself the underscore.
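
The ambiguity is easy to demonstrate.  The following is a small Python
sketch of the encoding exactly as the man page excerpt above words it;
the function names encode and strip_overstrikes are made up for
illustration and are not part of grotty or any other tool.

```python
import re

BS = "\b"  # BACKSPACE, 0x08


def encode(text, style):
    """Overstrike text per the grotty(1) description.

    style is "B" (bold), "I" (italic), "BI" (bold italic); anything
    else leaves the text unmarked.
    """
    out = []
    for c in text:
        if style == "B":
            out.append(c + BS + c)
        elif style == "I":
            out.append("_" + BS + c)
        elif style == "BI":
            out.append("_" + BS + c + BS + c)
        else:
            out.append(c)
    return "".join(out)


def strip_overstrikes(data):
    """Drop every 'character BACKSPACE' pair, keeping the last strike."""
    return re.sub(r".\x08", "", data)


# A bold underscore and an italic underscore are the same three bytes,
# so no filter can tell which one was meant.
assert encode("_", "B") == encode("_", "I")
```

Stripping is lossless for the text itself; strip_overstrikes(encode(s,
"BI")) returns s for any single-line string.  What cannot be recovered
is the style of an underscore.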

A bold-italic font was a pretty exotic thing to Bell Labs troff--so much
so that it didn't exist.  CSTR #54 [1976] documents four fonts, roman,
italic, bold, and "special": R, I, B, S.  (Hmm, now I'm hungry.)  I
figure this explains why the ambiguity never troubled them.  "BI" fonts
can, it seems, largely be traced to the impact of PostScript and its
base 14 fonts, which had 3 families in 4 styles and two symbol fonts.

> Enscript from Adobe, and the more featureful GNU replacement, are good
> examples of tools designed to work with nroff or other daisywheel/line
> printer output using overstrikes.   The preformatted line and page
> layout are fully retained with all overstrikes rendered properly and
> the ability to use any font (converted) in the postscript output,
> which is awesome for printing historical documents designed for nroff.
> You can also easily pass custom roff overstrikes to simulate combined
> typewriter characters beyond bold and underline.

Yes.  And accounts I've heard indicate that this was done even on
video terminals.  Not the character-cell sort, but the storage-tube
displays of the Tektronix 4014 and similar.  Seventh Edition Unix
shipped a tc(1) command to help you preview your troff output with that
device before you spent precious departmental money sending it to the
actual typesetter.

> I have no major objection to using escape sequences and agree they
> open some additional possibilities for functionality in modern
> terminal emulators.  However, I think that most people using
> overstrikes have less as the pager in raw mode where underlines and
> bold display correctly for manual pages.

Yes, and this feature of less(1) has so badly misled people that they
think it's what nroff should have been producing all along.  But if you
used nroff at the Labs in 1978, on your Teletype terminal, you didn't
have to pipe nroff output through anything.  It produced correct markup
appropriate to your terminal directly.  That's what it should have
_kept_ doing, but for the damnable corporate and human factors I
explored earlier in the thread.

You shouldn't _have_ to page a program's output to _get_ correct output.

If someone's not convinced yet, or is simply entertained by seeing me
fulminate about this, here you go:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=312935

> It's a shame that early pc vga consoles did not display underlines or
> italics properly!

Very much for the former.  A lack of proper italics, at the resolutions
we were using for text mode fonts in those days, I find more excusable.
Even now, in an Xft-leveraging, client-side-font-rendering xterm, a
rectangle is a pretty tight squeeze for an italic capital M.  I live
with it.

I wonder if anyone has attacked this problem by writing a terminal
widget that renders glyphs into parallelograms.

> Most other *nix platforms did, and that's really not a problem in X or
> modern graphical consoles like wscons on NetBSD that display
> overstrikes correctly.

With correct terminfo(3) descriptions, these should Just Work when I've
finished merging/doing violence to Lennart's grotty-terminfo patch.

I expect to keep grotty's `-c` option and `GROFF_NO_SGR` environment
variable support around forever.  Not just because the sort of people I
complain about above will pool their Bitcoins to hire a ninja assassin
to kill me if I don't, but because it will remain important, for
regression testing, to simulate "old school" output without having to
tediously manage chroots, VMs, or ancient packages.

Regards,
Branden
