Re: [htmlxref.cnf] Please update link to the Groff manual


From: G. Branden Robinson
Subject: Re: [htmlxref.cnf] Please update link to the Groff manual
Date: Fri, 6 Oct 2023 16:14:39 -0500

Hi Gavin, Ingo, Dave, and Thérèse,

Thanks to all of you for your feedback, patience, and advice.

At 2023-09-30T20:10:01+0100, Gavin Smith wrote:
> On Sat, Sep 30, 2023 at 01:15:09PM -0500, G. Branden Robinson wrote:
> >     As for why I chose the name I did, that was simply an ab initio
> >     selection for congruence with "groff.pdf", "groff.txt",
> >     "groff.dvi", "groff.info*", and "groff.html" (the "one big page"
> >     version).
> 
> It's no problem for us to change htmlxref.cnf to refer to wherever you
> are putting the manual.  It is not that html_node is better than any
> other name for a directory, other than that is the name that was used
> before, so the name that would be used in pre-existing hyperlinks.

Okay.  I'm leaning toward retaining it for the reason I articulated to
Ingo earlier in the thread: groff ships _multiple_ manuals, and with any
luck that number will increase in the future.  While I wouldn't bet that
there will be further groff manuals in Texinfo format--since as Larry
McVoy often points out, groff is perfectly capable of typesetting
material about itself--one can perhaps perceive the potential for
ambiguity.  And I decline to decree what some future contributor or
maintainer will or won't do.

> I see, I'd assumed it was from gendocs.sh as the format is similar.
> I'm not recommending that you use gendocs.sh and what you have seems
> just as good.

I think it's likely that at some point, the groff manual's index.html
_did_ come from gendocs.sh.

> > May I ask where this htmlxref.cnf file comes from?  What software
> > project houses it?
> 
> htmlxref.cnf is distributed with Texinfo and is also to be downloaded
> from https://ftp.gnu.org/gnu/texinfo/.  It gives URLs of Texinfo
> manuals on the World Wide Web.

Aha!  Thanks for shedding this light on it.

> texi2any uses this file when generating hyperlinks to other Texinfo
> manuals.  It's there to make inter-manual links work, which is an oft
> forgotten and neglected topic.

You have my sympathy.  We feel some of that pain in groff too.

> > 4.  "They might have changed this by mistake."
> > 
> >     Sort of.  I find the "html_node" name uglier, but if there's popular
> >     demand to switch it (back), I can see doing that for groff 1.24.
> 
> We'll change htmlxref.cnf to whichever URLs you decide to use going forward.

Okay.  Will advise when I get things sorted out.

> If there are no links to the groff Texinfo HTML manuals anywhere on
> the web, it doesn't matter, but it is likely there are at least some
> somewhere.

Yeah, Ingo found one or two.  Reports of groff's death, like Texinfo's
before you started contributing to it, are prone to exaggerations.  ;-)

> If we change it now to the new location, and then you change it back
> afterwards, then any new links generated in the meantime will end up
> pointing to the wrong place.  So please advise what you would have us
> do.

Maintenance of a central registry is a tricky thing.  (groff has that
same problem with fonts for our PostScript and PDF output drivers.)
Have you considered an alternative, like a directory where each manual
can deposit its own file characterizing its canonical location?  (This
is just me indulging in system design spitballing--not strictly on
topic.)

> > [5] Incidentally, "GROFF_SGR" (cf. "GROFF_NO_SGR") is now dead for
> >     real in Debian testing/unstable.
> > 
> >     "Adopt upstream's use of SGR escape sequences for man/mdoc (LP:
> >     #610609).  I turned these off for Debian in 2002 because pagers
> >     didn't cope well at the time, but it's now 21 years later and
> >     things have changed; SGR escape sequences resolve some ambiguity
> >     (see #963490) and are required for new features such as
> >     clickable hyperlinks."
> > 
> >     https://tracker.debian.org/media/packages/g/groff/changelog-1.23.0-2
> > 
> 
> Interesting.  I hadn't known about this.
> 
> https://salsa.debian.org/debian/groff/-/commit/f0a34f20ff772f692255b7e32a05630c639f75a8
> https://bugs.launchpad.net/ubuntu/+source/groff/+bug/610609

It may be that GROFF_SGR departs as mysteriously as it arrived...

At 2023-10-01T12:45:22+0200, Ingo Schwarze wrote:
> [dropping the external Cc:s to avoid boring uninvolved parties]

I pulled 'em back in like Michael Corleone.

> G. Branden Robinson wrote on Sat, Sep 30, 2023 at 03:59:13PM -0500:
> > At 2023-09-30T22:07:44+0200, Ingo Schwarze wrote:
> >>   https://uu.diva-portal.org/smash/get/diva2:1189607/FULLTEXT01.pdf
> 
> > This link was of particular interest.  It praised groff 1.22.3's
> > small size and high speed, but expressed significant frustration
> > with its documentation.
> 
> That's very weird because the quality of groff documentation was
> already excellent, and way above the average quality of software
> documentation, even before you started working on it.  Werner Lemberg
> and others did an outstanding job on it.

I certainly don't want to disparage the work Trent and Werner in
particular did--in fact, they did such a good job that I was able to
start learning groff instead of wandering away in frustration six years
ago.  But I think the status quo in 1.22.3 was a bit rough for rank
beginners.  For 1.22.4, I had an objective of putting a document in man
page authors' hands that would bring them up to speed--enough to write a
man page competently--from nothing.  For 1.23.0, I felt the next step was
to branch this strategy out with respect to two audiences: macro package
users, and the brave souls who wanted to go straight to the formatter
and learn everything about it.

The fruits of those efforts were Larry Kollar's resurrected ms.ms
manual, and new material in the "gtroff Reference" chapter of our
Texinfo manual, much of which also landed in our roff(7) or groff(7) man
pages.  (I keep telling Larry McVoy about the last fact, but he hasn't
yet acknowledged it.  And certainly there is more to be done on
groff(7), particularly in terms of organization.)

Post-1.23.0, I've continued to polish the introductory chapters of
groff's Texinfo manual.

Granted, some people prefer, and will do fine with, CSTR #54 (but beware
the errata that are proven to exist, and are documented in groff's
Texinfo manual) and groff_diff(7).  That's fine, but I don't think they
are groff's entire audience.

> is definitely orders of magnitude better than LaTeX documentation -
> and yes, i have worked a lot with LaTeX, including professionally
> in academic settings.  LaTeX documentation is scattered all over the
> place, almost impossible to search through, of widely varying quality
> depending on the component, and the system as a whole is generally
> almost impossible to use without referring to non-free sources
> like Lamport's books - which i generally remember as aiming for a
> loose tutorial-style approach, lacking the completeness, rigour, and
> conciseness that you get when you use the groff texinfo manual together
> with the relevant manual pages.

[begin digression]

I found both Lamport's introductory LaTeX book and Knuth's TeXbook
difficult in terms of acquisition.  They're both packed away and in
storage now (though I have a digital copy of the latter somewhere in
$HOME), so it's hard for me to say what the problem was.  Maybe I knew
too little about typesetting in the first place, so I didn't understand
the nature of the problems they were trying to solve, whereas their
audience was largely people who'd had to deal with academic or
professional publishers who'd tie them up with endless revision cycles
complaining about typesetting minutiae.

For years I thought I was just too dumb to understand the revered genius
Knuth, but at one point when I picked up volume 1 of TAOCP, I was
enthralled.  Maybe by then I had a level of what some people call
"mathematical maturity", and was still lacking a corresponding amount
with respect to typesetting.

But I always hated Microsoft Word.  It was obvious to me from day one
(for me, the 1990s) that WYSIWYG word processors were no way to run a
railroad.  The first "word processor" I ever used in anger was back in
the 1980s, called "T/S Word", running on an operating system called
OS-9.  There are some ironies in these facts when one realizes what they
were cloning.

> Yes, criticising the fragmentation between the texinfo manual and the
> manual pages is a valid point, but a very minor one, given that we are
> only talking about two sources.  For LaTeX, fragmentation of
> documentation is much worse.

One might accuse me of fragmenting some of groff's documentation.  I'll
save my defense for my response to Dave, below.

[end digression]

[...]
> The point is that the URI is in use across a wide array of media
> from diverse sources.

Agreed.

> That commercial organizations generally do lots of stupid things that
> are not in the public interest (nor in their own interest really)
> isn't all that surprising.  As you say, no need to emulate
> corporate stupidity in the free software world, right?

Indeed not.

> > I concede that having a working "/html_node/" URL by hook or by
> > crook (or by symlink) is probably a good idea given the list of URLs
> > linking to it that you presented above.
> 
> Sure, you can keep both URIs indefinitely if you want.
> 
> There are small downsides to having multiple redundant URIs for
> the same resource, like higher maintenance effort and more potential
> for confusion among users, so i generally try to keep the best URI
> and slowly phase out the others (which usually takes many years)
> but that's probably not a big deal.
> 
> With a URI component as firmly entrenched as /html_node/, phasing out
> is likely no longer possible, even if you have a decade to spare for
> the transition time, but for the newish /groff.html.node/, phasing
> out may still be possible if you care about consistency.

I have to say I really like the look of what Autoconf has done.  It
promises a lot of stability, and recourse for people frustrated by
versioning differences.  But I do want to elaborate their approach to
accommodate the presence of multiple manuals (and different sets thereof
for different releases).

At 2023-10-02T20:37:45+0100, Gavin Smith wrote:
> You do not need to ask GNU site admins to set up redirects for you.
> You can do it with the @anchor command in Texinfo.  From Info node
> "(texinfo)@anchor":
> 
>   ... when you delete or rename a node, it is usually a good idea to
>   define an ‘@anchor’ with the old name.  That way, any links to the
>   old node, whether from other Texinfo manuals or general web pages,
>   keep working.
> 
> https://www.gnu.org/software/texinfo/manual/texinfo/html_node/_0040anchor.html

Thanks, Gavin!  I never did read the Texinfo 5.0 manual from start to
finish as I had intended to, and I would have saved myself some grief if
I had.  Too bad the 1.23.0 manual is already damaged in this respect,
but I reckon we can fix it if we do a 1.23.1.

It sounds like what I need to do is produce a list of all node names in
the 1.22.4 manual that no longer exist in 1.23.0's manual, and add
@anchor points for each.
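
A minimal sketch of that bookkeeping, assuming the two releases' Texinfo
sources are available as text: it collects the names given on @node
lines, and lists an @anchor for each old name that the new source
defines neither as a node nor as an anchor.  The function names and the
sample input are hypothetical, and each @anchor would still have to be
placed by hand at whatever spot in the new manual now covers the old
topic.

```python
import re

# "@node Name" or "@node Name, Next, Prev, Up": the name is the first
# comma-separated field on the line.
NODE_RE = re.compile(r"^@node[ \t]+([^,\n]+)", re.MULTILINE)
# "@anchor{Name}" anywhere in the source.
ANCHOR_RE = re.compile(r"@anchor\{([^}]+)\}")


def node_names(source):
    """Return the set of node names defined in a Texinfo source string."""
    return {m.group(1).strip() for m in NODE_RE.finditer(source)}


def link_targets(source):
    """Return node names plus anchor names; both work as link targets."""
    anchors = {m.group(1).strip() for m in ANCHOR_RE.finditer(source)}
    return node_names(source) | anchors


def missing_anchors(old_source, new_source):
    """List @anchor commands for old node names the new source lacks."""
    gone = sorted(node_names(old_source) - link_targets(new_source))
    return ["@anchor{%s}" % name for name in gone]


if __name__ == "__main__":
    # Made-up sample input, not real groff node names.
    old = "@node Top\n@node Old Topic, Next, Prev, Up\n@node Kept\n"
    new = "@node Top\n@node Kept\n@node New Topic\n"
    for line in missing_anchors(old, new):
        print(line)
```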

At 2023-10-02T18:10:11-0500, Dave Kemper wrote:
> On 9/30/23, G. Branden Robinson <g.branden.robinson@gmail.com> wrote:
> > I've been working for a while on paring the groff Texinfo manual's
> > scope
> [...]
> > though I think I want to retain the comprehensive survey of ms, in a
> > reversal from my thoughts in 2021.
> 
> What caused the reversal?  It's long seemed weird to me to treat ms
> differently from the other macro packages, which are all documented
> outside the core groff documentation, and as a non-ms user I've taken
> to excising the -ms portions of the manual to avoid getting
> irrelevant-to-me hits in searches.  I was mildly looking forward to
> being able to jettison that script. (:  But more significant from a
> maintenance perspective would seem to be the perpetual overhead of
> keeping two copies of the documentation in sync.

What prompted my reversal was revision of chapter 3, now titled
"Tutorial for Macro Package Users".  I found that there was no way to
present any of the macro-package-level concepts being introduced
concretely, in terms of syntax, because every macro package has
different macro names for them (or none at all, for features they don't
support).

One approach would present lists of applicable macros for every section.
But that's _more_ duplication, not less.  Indeed, I can dash off a table
covering the topics in that chapter for man, mdoc, me, mm, and ms off
the top of my head (with "n/a" where necessary).  But--I am a thoroughly
incompetent mom(7) user, because I never have to write any documentation
for or fix bugs in her.

Pedagogically, it is important for a segment of the audience to see
concrete examples.  So it occurred to me that the existing material in
the very next chapter presenting ms(7) would serve that need just fine.

It is true that I could just point people to the ms.ms document.  And
your point about duplication is a valid one; it's challenging enough
maintaining groff_ms(7) as a stripped-down parallel of ms.ms.

Another way I want to make that chapter leaner is to take out the stuff
about DEC Ultrix man(7) extensions.  There are several reasons not to
have that in there.

It might help me to hear from additional actual readers of groff's
Texinfo manual.

At 2023-10-05T18:58:32+0200, Ingo Schwarze wrote:
> G. Branden Robinson wrote on Sun, Oct 01, 2023 at 06:53:30PM -0500:
> 
> > So while changing the name of the directory back to html_node will
> > fix some broken link problems, it won't fix them all, and it won't
> > be robust in the face of future development.  I'm fairly neutral on
> > the "html_node" vs. "groff.html.node" naming issue, but I'm
> > downright _opposed_ to limiting my (or future contributors')
> > flexibility in updating, expanding, reducing, or otherwise mutating
> > the node names of the groff Texinfo manual.  Those shackles are much
> > too tight.
> 
> Agreed.  Of course changing the content of documentation must always
> be possible, including removing obsolete content.  Renaming nodes
> may occasionally make sense, too.

Node renaming is nearly unavoidable if you break a composite topic into
components.  One of the heads I still have yet to mount on my wall is
"Using Symbols", which (IMO) badly needs a rewrite.  The reason I
hadn't done it yet is the usual one--I didn't understand GNU troff well
enough.  But I've learned a lot this year, a result of grappling with
issues like implementing a string iterator (still not done),
interrogating how and why we need both `tr` and `char`, and exploring
whether the former should be a property of the *roff environment.

> > A.  Put the groff 1.22.4 manual back online, probably as
> > https://www.gnu.org/software/groff/manual/groff-1.22.4/html_node/
> 
> While that is unlikely to do much harm, i'm not sure it is needed.
> I don't think we encourage using old versions of groff, so it is
> unlikely to help normal users.  It may occasionally be useful for
> people researching the history of groff, though not all that much
> because git serves that purpose better.  It may occasionally
> contribute to confusion when search engines return deep links
> into old documentation to unsuspecting users.  Not a big deal
> either way, i guess.

Fair.  And while I admire Autoconf's solution to the problem, they (1)
have only one manual to worry about; (2) are much more critical as a
development tool than groff is; and (3) are far more susceptible to
external forces driving changes (because operating
systems and runtime libraries introduce new features and, mainly, bugs,
all the time) than we are.  When groff changes something, it's generally
because we choose to, not because the operating environment forces us to
cope with an altered reality.

> > ...and have
> > https://www.gnu.org/software/groff/manual/html_node/
> > symlink/redirect to it.
> 
> I don't really like that idea.
> 
> Many old web pages talk about groff in general rather than about
> specific historical versions of groff.  So being pointed at the
> current documentation is likely more useful for most users than
> being pointed at documentation for some historical version.
> Besides, even if a site talks about a definite version of groff,
> that's unlikely to be specifically 1.22.4.

This is a good point.  I do feel kind of queasy about sending people to
the 1.22.4 manual when 1.23.0's will serve as well or better (which it
does, for every purpose I can think of, and will until and unless we
drop the comprehensive ms(7) coverage).

One could, of course, assert that I prefer the 1.23.0 manual because my
name is on it.

> Even if a deep link from an old website dies because the content
> of groff documentation changes, i don't think that is necessarily
> a bad thing.  It may alert the user following the link that the
> underlying functionality of groff in the region the website talks
> about has likely evolved.

Also fair.

> That doesn't mean links to the top level of the manual should break,
> unless we are planning to abandon or rename groff as a whole.  ;-)

Doesn't seem likely.  ;-)

> Please don't overthink all this.
[...]
> The general rule "if you care about the reliability of your links,
> don't link more deeply than you have good reasons to", on the other
> hand, is not limited to suits.  I try to abide by that rule, too.

I find this a sound principle.

At 2023-10-06T10:31:55+0200, Thérèse Godefroy wrote:
> There are other ways to redirect the html_node directory without any
> help from sysadmins or webmasters.
> 
> * Create an .htaccess file, for example in the manual directory:
> 
>     RedirectMatch "html_node((/.*)?)$" /software/groff/manual/groff.html.node$1
> 
> The URL is redirected to something like this:
> https://www.gnu.org/software/groff/manual/groff.html.node/*.html
> 
> * Add a line to the existing .symlinks file at the root of the groff
> directory:
> 
>     manual/groff.html.node manual/html_node
> 
> This symlink is processed into a rewrite directive:
> RewriteRule ^/savannah-checkouts/gnu/groff/manual/html_node((/.*)?)$ \
>   /savannah-checkouts/gnu/groff/manual/groff.html.node$1 [R=302,L]
> 
> I think the symlink method is not as clean as .htaccess because it
> replaces the standard path (gnu.org/software/groff/manual/...) with the
> actual location of the manual, possibly confusing visitors:
> https://www.gnu.org/savannah-checkouts/gnu/groff/manual/groff.html.node/*.html

Thanks a lot for these pointers, Thérèse.  It has been probably 15 years
since I messed with an .htaccess file.  I started my career in web
development for e-commerce, was traumatized by an utterly horrible
language called PHP (3.x!), saw that the exciting new thing was
_another_ utterly horrible language called Javascript,[1] and ran as
fast as I could into the warm embrace of systems programming.
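
For anyone wanting to sanity-check the RedirectMatch pattern quoted
above before touching the server, here is a rough Python approximation.
It assumes Python's re module and Apache's regex engine agree on a
pattern this simple, and that the whole request path is replaced by the
target with $1 substituted; the redirect() helper is hypothetical, not
anything Apache provides.

```python
import re

# Pattern and target copied from the quoted .htaccess line.
PATTERN = re.compile(r"html_node((/.*)?)$")
TARGET = "/software/groff/manual/groff.html.node"


def redirect(url_path):
    """Return the redirected path, or None if the pattern does not match."""
    m = PATTERN.search(url_path)
    if m is None:
        return None
    # Group 1 is the (possibly empty) remainder after "html_node".
    return TARGET + m.group(1)


if __name__ == "__main__":
    for path in (
        "/software/groff/manual/html_node",
        "/software/groff/manual/html_node/index.html",
        "/software/groff/manual/groff.html.node/index.html",
    ):
        print(path, "->", redirect(path))
```

The last case matters: the new directory name contains "html.node", not
"html_node", so under these assumptions the rule should not redirect its
own target in a loop.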

Regards,
Branden

[1] ECMA has made Javascript much less horrible.  But for me, the
    language will always carry the shame of its origins.
