identifier length in AT&T and GNU troff


From: G. Branden Robinson
Subject: identifier length in AT&T and GNU troff
Date: Tue, 22 Mar 2022 21:05:48 +1100
User-agent: NeoMutt/20180716

Hi Ralph,

At 2022-03-21T10:33:20+0000, Ralph Corderoy wrote:
[I said:]
> > We get used to delimiters being paired.  :)
> 
> Depends on the delimiter: colon is an example, comma another.

Those are good examples of delimiters that pair with themselves, say in
ed(1) address expressions or sed(1) replacement operations.  From a
formal perspective, I'm not sure a "delimiter" that occurs only once in
an expression is worthy of that name, though it will likely be widely
understood in casual use.  If the distal end of a syntactical element is
recognized through an escape hatch to a higher scope of lexical analysis
(like a newline), then I would argue that a second "delimiter" does not
occur at all, as it does not in the font selection escape sequence in
"foo\f(BIbar".
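
To make the distinction concrete, here is a minimal Python sketch of how
a scanner might locate the end of a name in each form. The function
name scan_name is hypothetical, and the logic is a deliberate
simplification: it is not how AT&T or GNU troff is implemented, and
it ignores details such as numeric font positions.

```python
def scan_name(text, i):
    """Return (name, next_index) for an escape argument starting at text[i]."""
    if text[i] == "(":
        # AT&T form: "(" announces exactly two characters; nothing closes it.
        name = text[i + 1:i + 3]
        if len(name) < 2:
            raise ValueError("truncated two-character name")
        return name, i + 3
    if text[i] == "[":
        # GNU form: scan forward for the closing "]"; the name may be any length.
        end = text.find("]", i + 1)
        if end == -1:
            raise ValueError("missing closing bracket")
        return text[i + 1:end], end + 1
    # One-character form: the name is the character itself.
    return text[i], i + 1


for sample in (r"foo\fBbar", r"foo\f(BIbar", r"foo\f[BI]bar"):
    start = sample.index(r"\f") + 2  # index just past the "\f"
    name, rest = scan_name(sample, start)
    print(name, sample[rest:])
```

Only the bracketed branch searches for a second character; in the
parenthesized branch the end of the name is found by counting alone.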

My point is that out of the 94 visible code points in ASCII, only 8 are
reasonably construed as paired: < > ( ) [ ] { }.

Since the escape character itself was already taken, and it was sound
reasoning not to squat on the control characters either (though that did
in fact happen with the apostrophe), that left many remaining candidates
even among punctuation.  What I wonder is: why choose one of the paired
punctuation characters for a non-enclosing purpose?

> To those used to troff, before GNU arrived, \(lq is just read as a
> unit.  We do not think of it as an opening which requires a close.

No, but that is a bit of specialized knowledge you have to acquire.
Where else are you accustomed to using an opening parenthesis in this
way?

> The parenthesis immediately tells us the length of what follows.

So did Hollerith strings in FORTRAN 66--more flexibly, I add, and yet
they have fallen out of use.

> In contrast, an open bracket tends to be heavier than the
> parenthesis and made much worse by the noisy closing one which is
> redundant in the common two-letter case.

I acknowledge that GNU troff has wandered a fair distance from the ideal
Huffman encoding for formatter instructions that AT&T troff approached.

GNU troff made its decision in favor of expressivity as reflected in
identifier length--that is, meaningful names for registers, strings, and
macros--at the outset.  groff has been around longer than AT&T troff had
been when groff was first written.

There is an unstated premise in your argument--could you make it
explicit for me so that I don't have to guess at it?

> Plus one must scan for the ], detracting from the flow of reading.

Is this a difficulty you also experience with C string literals?

Regards,
Branden
