Re: [Groff] Applications of \c in man pages in the wild


From: Ingo Schwarze
Subject: Re: [Groff] Applications of \c in man pages in the wild
Date: Thu, 27 Apr 2017 17:37:15 +0200
User-agent: Mutt/1.6.2 (2016-07-01)

Hi,

G. Branden Robinson wrote on Thu, Apr 27, 2017 at 09:46:27AM -0400:
> At 2017-04-27T14:30:33+0200, Ingo Schwarze wrote:

>> wow, you definitely demonstrate diligence in investigating existing
>> usage.
> [...]
>> That can be summarized as follows:  Legitimate hand-written use is
>> almost nonexistent, even hand-written abuse is very rare, but unusually
>> frequent when expressed as a fraction of the instances of use.  The
>> dominant occurrence is abuse by known-bad autogenerators.

> If this were a matter of preserving compatibility, you'd argue that
> breaking 14 man pages out of 7000 is too many.

Sorry, i don't follow.  Nobody suggested deleting \c from roff.

All i said regarding portability is that implementing your .TP .itc
in groff, Heirloom, and mandoc and recommending use of \c after .TP
means that *all* pages doing that misrender on all systems using
either older versions of our formatters or legacy formatters.

If a significant language improvement would break 0.2% of existing
manuals, i'd certainly consider going ahead with it and doing the
handful of commits required / sending out the handful of patches
required, but i don't see how that is related to the discussion.

> Let's review, from earlier in the thread:
> 
> BR> What I'm hearing is that you feel that man(7) is a lost cause:
> BR>
> BR> A.  It desperately needs improvement because of its presentational,
> BR>     rather than semantic, focus.
> BR> B.  Improvement will break portability.
> BR>
> BR> This basically amounts to doing nothing and letting man(7) die of
> BR> its own accord.
> 
> IS> Exactly.
> 
>> Remember that writing legacy man(7) documents is quite hard and
>> entices many people to try all kinds of (sometimes unavoidable,
>> sometimes ill-advised) trickery.  Even when analyzing the use of
>> easy-to-use languages in the wild, you usually find substantial (in
>> that case, needless) trickery.  So it is actually surprising that
>> you found so little.

> Yet, somehow, you do not regard this as evidence of the quality of
> man(7) nor of the people who write in it.

No, it is not, because you can find lots and lots of poorly written
man(7) documents in the wild, even hand-written ones.  All it tells
us is that \c is exceedingly rare in manual pages - both (arguably)
legitimate uses and abuses.  That doesn't imply there aren't other
common problems; in fact, there are.

>> So, if we would choose to promote \c use for the FONT_MACRO_C use
>> case, we would actually promote using a low-level feature

> It's no lower-level than the escapes to which you've given your
> blessing; see groff(7) and CSTR #54, where it has parity.

Admitted.  \f etc. is also low-level.  But it is simpler.

>> that is so far virtually unused in the wild,
 
> Hmm.  210 uses out of ~7,000 makes it about twice as popular as mdoc.
> 
> $ find man* -type f -and -not -type l | xargs zgrep '^\.Dd' | wc -l
> 103

Only if you don't count the about 3000 mdoc(7) manuals in a typical
*BSD base system.  If you count those, \c is about 2% (abuse by
docbook included), mdoc(7) is about 30%.
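
Spelled out as a rough calculation, under the assumption that your
~7,000 pages and the roughly 3,000 base system manuals are disjoint
sets, so about 10,000 pages in total:

  \c:       210 / 10000           = about 2%
  mdoc(7):  (103 + 3000) / 10000  = about 31%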

>> but where existing practice in the wild already demonstrates that
>> about 1 out of 3 users who choose to use it freely get it wrong.
>> Officially encouraging use is likely to cause an increase rather than
>> a decrease of abuse.

> That's conjecture.

Admitted.

> What I see is a propensity to cut-and-paste from examples that are
> believed to be good.  If we provide good examples, there's no reason
> to expect abuse to grow.
> 
> My expectations are modest: I think a few people will pick it up and use
> it correctly, most will continue to live with bad markup,

.TP
\fB\-scale\fP \fIxfac\fP[,\fIyfac\fP]

is not bad markup, but decent quality man(7) code that i would
recommend.
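
For comparison, here is a sketch of how the same tag might look with
a .TP head spanning two macro lines, where \c continues the first
line.  The exact spelling is an illustrative assumption, not text
taken from the proposal:

  .TP
  .B \-scale \c
  .IR xfac [, yfac ]

On a formatter without the new .TP handling, the tag ends after the
first input line, so the second line ends up in the body rather than
in the tag.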

> and practically no one will see \c as a new toy to play with.

That might turn out to be true; it's also plausible conjecture,
even though you already reported \c cargo culting right now.

>> I think your analysis reinforces the argument that we should refrain
>> from promoting the use of escape sequences in manual pages unless
>> they are unavoidable, well-established, *and* easy to understand.
>> Typical examples of escapes matching those criteria are \e, \&, \f.

> \e is a bad example; in man(7) pages, people use it expecting it to
> do the same thing as \(rs.  Which is fine until some clever wag
> redefines the escape character.
> 
> Most of the time, \(rs is what people mean, because--in man
> pages--they're discussing the backslash symbol outside of the context of
> *roff itself.  Almost always, they're telling C and shell programmers
> how to escape and quote things.

Here you probably have half a point.  I should probably go ahead
and change the mandoc documentation to recommend \(rs rather than
\e for newly written manual pages.

Scouring the existing manuals is maybe not worth the substantial
amount of work required, because there is no problem in practice:
if someone uses .ec *in a manual page* (seriously?) they are already
responsible for writing \ for \ after that, and "xe" for "x" if
they said, for example, ".ec x".
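
A minimal sketch of that situation, assuming groff semantics: \e
prints the current escape character, \(rs prints the backslash
glyph, and .ec without an argument restores the default escape
character.  The input is purely illustrative:

  .\" Default escape character: both text lines print a backslash.
  \e
  \(rs
  .ec x
  xe
  x(rs
  .ec

After ".ec x", the line "xe" prints "x" and no longer a backslash,
while "x(rs" still prints a backslash.  Even comments would have to
be spelled with x" instead of \" until the final .ec.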

Besides, if somebody would dare to use .ec in an OpenBSD manual
page, they would be shot and killed almost instantly from multiple
directions, and i guess quite a few other communities would behave
in a similar way.

The reason why manual pages almost consistently use \e rather than \(rs
is that, as far as i see, \(rs did not exist in CSTR54, the beginnings
of many manual pages predate CSTR54, and the recommendations on how to
encode backslashes in manual pages that one finds in documentation
predate CSTR54 in any case.

For example, even groff(7) still says:

  Printable backslashes must be denoted as \e.  To be more
  precise, \e represents the current escape character.
  To get a backslash glyph, use \(rs or \[rs].

That has already been improved substantially, but it could maybe
still be worded better.

> Regardless, \c obviously has a couple of indispensable use cases in the
> man(7) domain.  We should be documenting them, not ignoring them in
> hopes people will just throw up their hands and go use mdoc instead.

Document, sure.  Indispensable, no; i have seen none that \f cannot
solve, and \f seems simpler than \c.

>> It is now demonstrated in the wild that \c misses the last criterion,
>> and it is plausible to assume that \! will also miss it.

> I think you're indulging deeply in motivated reasoning in service of the
> objective we established earlier.  You don't want people to write man(7)
> pages well; you want them to write them badly, if they write them at
> all, in the hope that the wretchedness of the result will lead them to
> the true light of mdoc(7).

No, that's not what i want, and that would also be foolish.
Frustrating *users* with badly written manuals helps nothing.
Even frustrating *authors* does not help and does not force good
decisions: it would also increase the risk that authors abandon
manual pages (and roff) outright.  I think i even explicitly said
that helping people to write better man(7) is good.  There are
circumstances where writing man(7) is still required.  Project
policies are merely one example.

All i say is that man(7) will remain hard even with such help,
and that incompatible changes have more downsides in man(7)
than they have in mdoc(7).

> In any event, I checked out Heirloom troff.  Unfortunately its CVS
> repository has not seen a commit in almost 6 years.

You must have looked in the wrong place.  Try

  http://n-t-roff.github.io/heirloom/doctools.html
  https://github.com/n-t-roff/heirloom-doctools/

as linked from

  http://mdocml.bsd.lv/links.html

> Are we sure that Heirloom is still maintained?

Absolutely.  Carsten Kunze maintains it.  He is also one of the
most active groff committers and regularly follows this list.

> Is there any other living troff on the planet?

There is

  https://swtch.com/plan9port/

I think it is neither as good nor as vital as Heirloom, and i'm not
sure to which degree it is alive, but it does exist, and some
operating systems provide it in packaged form (even OpenBSD).  The
Plan9 roff does run on OpenBSD and it is possible to read manual
pages with it, but i neither claim that it works very well nor that
it is broken in any particular way, i simply never really looked
at it in sufficient detail to judge it.

Then there is the roff in commercial Solaris (maybe somewhat
related to Heirloom, but certainly not following current Heirloom
development closely).

AT&T DWB has also been kind of released, but i doubt that "alive"
is the right word to use for it - besides, it looks like AT&T
broke the URI once again:

  http://www2.research.att.com/sw/download

But Carsten Kunze checked it in here:

  https://github.com/n-t-roff/DWB3.3

Finally, there are a number of very obscure implementations,
maybe irrelevant, maybe not:

  http://repo.or.cz/w/neatroff.git = http://litcave.rudi.ir/
  http://www.nesssoftware.com/home/mwc/source.php
  ...

It looks like https://github.com/jgm/pandoc/ can only write,
but not parse manual pages.

It is likely that there are more implementations than i know of,
and some may well be alive.  Besides, i did not even try to list
ad-hoc non-roff man(7) parsers.

> If not, then I have to conclude that GNU troff _is_ the standard

I fully concur.  GNU troff has been the de-facto roff standard for
about 20 years now, in particular on all systems that are somehow
related to Linux or BSD.

> by dint of being the sole living implementation,

Not the sole, but the leading one by far.

> and therefore I'm not too worried about the impact of my proposal
> regarding TP, which Bjarni developed independently back in 2014,
> and with greater generality to boot.
> 
> Nevertheless, if someone can get Heirloom troff building on Debian
> stretch, I'm willing to run A/B comparisons to see what, if anything,
> breaks.

Heirloom and mandoc are not the problem.  If groff goes your way in
this respect, i will make mandoc follow, and it seems likely to me
that Carsten will eventually make Heirloom follow as well.

> Out of those ~7000 pages, I haven't seen a single markup error that can
> be unquestionably laid at the feet of Bjarni's and my proposal.

Did you test your proposed new syntax (.TP using \c to include
two macro lines in the head) on

 - Oracle Solaris 9, 10, or 11
 - illumos
 - Plan9 (not sure whether that is still relevant)
 - DWB (may well be irrelevant by now)
 - Any other commercial system like HP-UX, AIX, etc.: i don't
   have the slightest idea what those systems use today.

> At least we can agree that docbook-to-man produces a true dog's
> breakfast of output.  Pretty lamentable.

Indeed.

Yours,
  Ingo


