Re: Plan 9 man added a new macro for man page references


From: Ingo Schwarze
Subject: Re: Plan 9 man added a new macro for man page references
Date: Sun, 1 Aug 2021 15:49:25 +0200
User-agent: Mutt/1.12.2 (2019-09-21)

Hi Branden,

note that mdoc(7) has most of what you are talking about - not just
as a freshly invented concept yet to be tested, but actively used
and proven adequate in practice.  In particular the .Xr macro has
seen consistent use in all mdoc(7) manual pages for more than 30 years,
and it has been in use for hyperlinking on the web for about ten
years, or even much longer if you count the original FreeBSD man.cgi
implementation.  It has exactly the syntax you propose for .MR:

  .Xr page_name section_number [punctuation_suffix_args]


G. Branden Robinson wrote on Sun, Aug 01, 2021 at 09:09:39PM +1000:

> An idea in my head.  It's some text that would "deeply" link into a man
> page document, like a section or subsection heading, but possibly more
> precise than that.

I first discussed that idea during EuroBSDCon 2015 in Stockholm:

  https://www.openbsd.org/papers/eurobsdcon2015-mandoc.pdf
  see pages 15 to 18

It turns out the concept of remote deep linking in manual pages is
rarely needed, for several reasons.

Well-designed programs tend to be simple, doing one thing well.
Consequently, well-written manual pages for well-designed programs
tend to be short.  When linking to a short document, deep linking
matters little.  Besides, deep linking is not necessarily beneficial.
The reader being referred to that other page needs to grasp some context
regarding what that other page is about, and that is easiest to get
from the page title, the Synopsis section, and the first sentence of
the Description section - i.e. from the beginning of the page.
Being plunged right into the middle of a document is not always
helpful, *especially* when the document is large or complex.

You might object that large and complex software systems do exist
and necessarily have complex documentation.  For example, groff
itself.  But when well-designed, they tend to be modular, and so
is their documentation, again making individual manual pages
reasonably short and limiting the benefit of deep linking.

Off the top of my head, i would estimate that 95-98% of .Xr / .MR
links in manual pages, even when not counting those in See Also sections,
don't need deep linking in the first place because they refer to
the other page as a whole.  Of the remaining 2-5%, an estimated
80-90% would benefit little because the target page is short anyway.
So we are talking about a one permille to one percent use case.
Assuming a typical operating system sized manual of maybe 3000-5000
pages and an average of 3-5 .Xr links per page outside the See Also
section, that's about 10-25k links, of which about 20 to 200
might benefit substantially from deep linking.  All rough estimates
without doing any actual counting.
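
For readers who want to redo the arithmetic, here is a minimal Python
sketch of the estimate above.  Every input is one of the rough guesses
from the text, not a measured count, and the function names are made up
for this illustration.  The 10-20% factor is simply the complement of
the "80-90% would benefit little" guess.

```python
# Back-of-the-envelope reproduction of the estimate above.
# All inputs are guesses taken from the text, expressed as (low, high).

def total_links(pages, links_per_page):
    """Total .Xr/.MR links outside See Also sections, as (low, high)."""
    return pages[0] * links_per_page[0], pages[1] * links_per_page[1]

def deep_link_candidates(total, needs_deep_pct, long_target_pct):
    """Links that might benefit substantially from deep linking.

    Percentages are integers so the arithmetic stays exact.
    """
    low = total[0] * needs_deep_pct[0] * long_target_pct[0] // 10000
    high = total[1] * needs_deep_pct[1] * long_target_pct[1] // 10000
    return low, high

total = total_links((3000, 5000), (3, 5))
# 2-5% of links refer to a specific part of the target page;
# 10-20% of those point into a page long enough for it to matter.
candidates = deep_link_candidates(total, (2, 5), (10, 20))
print(total, candidates)  # (9000, 25000) (18, 250)
```

The result, 18 to 250 links, has the same order of magnitude as the
"20 to 200" figure quoted above.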

I'm not saying deep linking is completely irrelevant or i would not
have been considering and discussing it for the last six years.
But i do insist that it must not dominate the discussion of linking
as a whole.  It is *much* more important that simple use cases of
linking work in a way as simple as possible for authors and readers
than that deep linking is available.  In other words, designing deep
linking must not spoil the overall design of linking.  There is a
very substantial danger of overengineering here.

The very limited usefulness and the very real danger of overengineering
is why i still don't support deep linking in mandoc, even after mulling
over it for more than half a decade.  There may be a way that is
non-intrusive and occasionally useful, but it's not trivial to see.

> It would be "hidden" because it would not be visibly
> rendered (by default) on a hyperlinking output device.  In a generated
> URI, it might be like <man:/usr/share/man/man2/membarrier.2#THIS>.
> 
> However, Russ Cox of Plan 9 troff pointed out, and I think implemented,
> a third argument that is text to be interpolated immediately after the
> manual reference--much like the groff man(7) .ME and .UE macros.

Note that the concept of trailing punctuation arguments is standard
for mdoc(7) macros but feels somewhat alien to the man(7) macros.

> That is not fatal to my evil plans; the internal anchor reference could
> be appended to the section somehow,

I strongly advise against that.  Combining arguments of different
purpose into a single function argument is terrible practice in the
first place.  It doubles complexity because without it, you have one
level of parsing: identify arguments and use them.  Now, you suddenly
have two levels of parsing: after identifying this kind of argument,
you have to start a whole new parsing algorithm to parse that argument
and then handle its components.  It also insults the eye by non-uniformity
of syntax.  Before, you had the space character as an argument separator.
Now, you suddenly have two different separators for no good reason.

Besides, combining the section number and the deep link target name
makes no sense at all because both are completely unrelated to each
other.  The arguments describing the target form a natural hierarchy:

 1. target section
 2. target manual page name (within the target section)
 3. deep linking target (within the target page)

In spite of this natural ordering, starting with the page name is
good because that's what authors and readers should think about
first and also because we certainly mustn't abandon the name(sec)
output convention, so having the same argument order in the .Xr
and .MR macros on the input side really helps sporadic authors
to remember the input syntax.

As i said on page 18 in 2015:

  link distance:
    carefully design a concept for remote deep links
    - then use that for .Xr to specific .Sh/.Ss
    - then use that for .Xr to other, e.g. implicit targets

I did not mention the idea of adding a third argument to .Xr / .MR
because it is blatantly obvious that's the first idea that springs
to the mind.  But that doesn't imply it's the right thing to do.
Adding an argument is the lazy and uncreative guys' answer to *any*
task.  I'm claiming neither that it is right nor that it is wrong in
this particular case; i haven't made up my mind yet, the reasons being
the scarcity of real-world use cases and the relative youth of the
concept of "implicit targets".  I suspect it might be wise to let
the concept of implicit targets mature a few more years through
practical use before deciding whether a special syntax
for manual specification of deep linking targets is needed, and if
so, what it should look like.

> or as, God forbid, an optional fourth argument.

Well, it must of course not come after the punctuation argument;
the obvious syntax would be

  .Xr/.MR page sec [deep_target] [punctuation_suffix_args]

And in the extremely unusual case that some punctuation_suffix_arg
would not look like punctuation, you would have to write

  .MR page sec "" [punctuation_suffix_args]

In mdoc(7), that cannot ever happen because mdoc(7) very specifically
defines what closing punctuation is, and none of that can possibly
occur as a deep_target:

  https://man.openbsd.org/mdoc.7#Delimiters
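
To illustrate why the optional third argument stays unambiguous, here is
a minimal Python sketch of a parser for the hypothetical syntax shown
above.  Nothing like this exists in mandoc or groff; the names ManRef
and parse_ref are invented for the example, and shlex is only a rough
stand-in for roff argument quoting.  The set of closing delimiters is
the one listed in the mdoc(7) Delimiters section.

```python
import shlex
from typing import NamedTuple, Optional, Tuple

# Closing delimiters as listed in mdoc(7), section "Delimiters".
CLOSING = frozenset(". , : ; ) ] ? !".split())

class ManRef(NamedTuple):
    name: str
    section: str
    deep_target: Optional[str]   # hypothetical third argument
    suffix: Tuple[str, ...]      # trailing punctuation arguments

def parse_ref(line: str) -> ManRef:
    """Parse '.Xr/.MR page sec [deep_target] [punctuation...]'."""
    macro, *args = shlex.split(line)
    if macro not in (".Xr", ".MR"):
        raise ValueError("not a manual page reference: " + line)
    if len(args) < 2:
        raise ValueError("page name and section are required: " + line)
    name, section, *rest = args
    deep = None
    # One level of parsing: the third argument is a deep target
    # unless it is a closing delimiter.  An explicit "" means
    # "no deep target, what follows is the suffix".
    if rest and rest[0] not in CLOSING:
        deep = rest[0] or None
        rest = rest[1:]
    return ManRef(name, section, deep, tuple(rest))

print(parse_ref(".Xr membarrier 2 Errors ."))
print(parse_ref('.MR page sec "" suffix'))
```

Each argument is identified by position and by a single membership
test; no argument has to be split a second time.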

An alternative would be using the .Tg macro that already exists in
the mdoc(7) language for a related purpose, as follows:

  .Tg deep_target
  .Xr page sec [punctuation_suffix_args]

The purpose of .Tg is to mark the next token as a link target.
Since .Xr can never be a useful link target, letting the deep_target
name refer to the *target* page rather than the *source* page when .Tg
precedes .Xr feels kind of natural.

This .Tg / .Xr design provides the side benefit of not changing the
syntax of .Xr that has been established for three decades, so it has
better backward compatibility properties than the three-argument idea.
Again, i'm not claiming just yet this is the best idea.
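
As a rough sketch of how a formatter might pair the two macros, the
following Python generator attaches a preceding .Tg argument to the next
.Xr line.  These semantics are purely hypothetical: mandoc does not
treat .Tg before .Xr this way today, and the name resolve_refs is
invented for the example.  Trailing punctuation arguments are ignored
to keep it short.

```python
def resolve_refs(lines):
    """Yield (name, section, deep_target) for each .Xr line.

    Hypothetical rule: a '.Tg name' line directly before an '.Xr'
    line names a target inside the referenced page.
    """
    pending = None
    for line in lines:
        words = line.split()
        if words and words[0] == ".Tg" and len(words) > 1:
            pending = words[1]       # remember it for the next line only
            continue
        if words and words[0] == ".Xr" and len(words) >= 3:
            yield words[1], words[2], pending
        pending = None               # any other line cancels the pairing

source = [
    ".Tg Errors",
    ".Xr membarrier 2 .",
    "See also",
    ".Xr mandoc 1",
]
print(list(resolve_refs(source)))
```

Note that the .Xr line itself is parsed exactly as before, which is
the backward compatibility property mentioned above.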

> But I'd prefer perverting the second argument to keep
> things close together, like this.
> 
>       For more on what can go wrong when you screw up concurrency,
>       see
>       .MR membarrier 2#Errors .
> 
> or
> 
>       For more on what can go wrong when you screw up concurrency,
>       see
>       .MR membarrier "2 Errors" .

Please.  Don't.

> > > * Added support for another string, perhaps 'MB' ("manref base"?),
> > >   supplying a base URL which can be set at page-generation time.
> > >   Embedding a full URL in man pages sources to an inherently
> > >   relocatable page hierarchy is a bad idea.

That feels like a feature for the formatter, *not* a feature for the
markup language.

  https://man.openbsd.org/mandoc.1#man~2

Note that the mdoc(7) documentation is not encumbered by this -O man=...
feature of the mandoc(1) formatter at all.
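
To show how little the markup needs to know, here is a minimal Python
sketch of formatter-side template expansion in the spirit of the
-O man=fmt option, where %N stands for the page name and %S for the
section.  The function name expand_man_template is made up for this
illustration, and defaulting the section to "1" is an assumption of the
sketch; mandoc itself does this internally in C.

```python
import re

def expand_man_template(fmt, name, section="1"):
    """Replace %N and %S in fmt with the page name and section."""
    values = {"N": name, "S": section}
    # A single pass, so text inserted for %N is never rescanned.
    return re.sub(r"%([NS])", lambda m: values[m.group(1)], fmt)

print(expand_man_template("../html%S/%N.%S.html", "mandoc", "1"))
# ../html1/mandoc.1.html
```

The base location lives entirely in the template chosen at formatting
time, so the page source only ever contains the name and the section.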

> "30 years after Sir Tim Berners-Lee brought you HTML, groff is hot on
> his heels!"

And about 33 years after Cynthia Livingston invented .Xr on behalf of
USENIX, and 12 years after Kristaps implemented .Xr / <A HREF>
support in mandoc -T html:

  https://cvsweb.bsd.lv/mandoc/html.c#rev1.30

> Well, maybe not.  You can see why I'm not in sales.

"Dear IRS, I sold 42 copies of GNU troff last year
 and made 0 dollars off it."

> > > [3] Going back to the Ur-source of all correct practice, Version 7
> > >     Unix, is not as dispositive as it might be.  Of the 641
> > >     cross-references I count in its corpus, only 345 (53.8%) set the
> > >     page name in italics.  The remainder simply use roman.  The
> > >     barbarism of setting the parenthetical section number in bold or
> > >     italics is not in evidence.

> I should go ahead and mention that I'm resolved to implement a string
> called (probably) MF, to make the font used for setting man page names
> configurable at rendering time.

Don't.  When designing a hammer, don't add bells and whistles as
features to it.  User-configurable fonts in manual pages provide
no benefit whatsoever, just like bells and whistles on a hammer
wouldn't, so the hammer is better without them.  But making this
user-configurable has a clear downside: it reduces the uniformity
of rendered manual pages, to the detriment of users who would have
a harder time getting used to what manual pages look like and
what the fonts used in them mean.  Almost no user would configure
this themselves, but package maintainers in operating systems would
be likely to fiddle with it.  So you would actively encourage
incompatibility across operating systems.

Making this user-configurable would feel like design by committee:
The committee couldn't agree on which of the equivalent colours to
use for the bikeshed, so they required the construction of multiple
bikesheds in various colours, and while they were about it, neglected
to consider the features of the bikesheds that actually matter to
their users.

Regarding which colour is best, at the risk of repeating myself:

UNIX-7 is inconsistent in this respect, in part I(R), in part R(R).
Linux is inconsistent, in part I(R), in part B(R).

BSD has been completely consistent for 30 years: R(R).

I claim I(R) is outright misleading because manual pages mostly
reserve italic for placeholders, for words the user needs to replace
with their own content - plus relatively few unrelated, general
typesetting features like stress emphasis.

B(R) is clearly better because manual pages mostly use bold face for
keywords and other fixed strings the user has to type verbatim,
and page names, just like command names, are fixed strings, not
placeholders.

But manual page markup tends to be heavy on the eye anyway, with lots
of unavoidable bold face and italics.  Where bold face and italics add
no benefit, they should consequently be avoided, for better aesthetic
effect and for reducing distraction of the eye.

For name(section) manual page references, bold or italic is just
not needed.  The name(section) syntax is very iconic and readily
recognizable on its own, so using R(R) is clearly best.

Yours,
  Ingo
