Re: [groff] Regularize (sub)section cross references.


From: Ingo Schwarze
Subject: Re: [groff] Regularize (sub)section cross references.
Date: Mon, 17 Dec 2018 19:40:54 +0100
User-agent: Mutt/1.8.0 (2017-02-23)

Hi,

G. Branden Robinson wrote on Mon, Dec 17, 2018 at 12:03:21PM -0500:
> At 2018-12-17T08:28:07+0100, Pierre-Jean Fichet wrote:
>> "G. Branden Robinson" <address@hidden> wrote:

>>> +.ie \\$1 .tr aAbBcCdDeEfFgGhHiIjJkKlLmMnNoOpPqQrRsStTuUvVwWxXyYzZ
>>> +.el      .tr aabbccddeeffgghhiijjkkllmmnnooppqqrrssttuuvvwwxxyyzz

>> The problem with this is that it ignores all but English languages.

> Yup.  I'm aware of that, which is why I did not propose it as an
> actual patch.  It's just a proof of concept, and it does work, as far as
> it goes--which is not far enough for non-English languages or the rare
> occasions when English words avail themselves of diacritics.
[...]
> The biggest problem I know of is that the uppercasing transform of
> German sharp S "ß" goes to "SS".  (A recent version of Unicode did
> introduce a capital sharp S but it might have only specialized uses; I'm
> not sure all Germans would find it acceptable.)

I'm a native German speaker and i wouldn't worry about that at all.
It would be good enough to just ignore that problem; it is really a
fringe case.

First of all, German manual pages are practically irrelevant because
in Germany, practically all children (with very few exceptions in a
few schools in a few tiny regions along the French border at certain
times - and yes, i'm currently living in one of these corners) have
been required to learn English in school for many decades now.
Usually, children start learning English at the age of 10, sometimes
at 6 or 12, and instruction in the English language almost never
stops before they are at least 15 years old; most continue studying
English until the end of school.
So people speaking German but not English are rare, and many of
those are now over 80 years old.  So if you can understand German,
you are almost always much better off reading the English manual
pages because they are almost always more up-to-date and contain
fewer errors; besides, even for native German speakers, English
computer science terminology is usually *much* easier to understand
than German translations, which invariably sound weird and often
leave room for doubt about what the translated terms are really
supposed to mean.

By now, translating manual pages to German is almost as absurd as
translating them to Swedish or Dutch, where knowledge of English
is even more prevalent.

By the way, every time i talked to Japanese people about what is
needed to make mandoc better for formatting Japanese manual pages -
remember that in Japan, there are lots of people who do not speak
English at all - the answer invariably was similar to: "Don't worry
that much, Japanese manual pages are rarely properly maintained.
Very young Japanese people sometimes try to read them because they
feel uncomfortable with English, but they all soon understand that
it's a bad idea to rely on Japanese manuals and decide to improve
their English reading skills instead, which is very useful anyway.
So worrying about details of mandoc support for Japanese is mostly
a waste of time."

Besides, having a small esszett at the end of an all-caps word in
a manual page title is almost a non-issue in the first place.
Slightly ugly maybe, but so what?

Of course, it may matter when trying to beautifully typeset poetry
in PostScript or PDF, but for a manual page, oh well...

> A 1-to-2 character mapping of course is beyond the ability of .tr.
> 
> As I think John Gardner said, what we really need is a roff request to
> expose the underlying C library's toupper() and tolower() functions.
> 
> A good feature to precede this one into 1.22.5, perhaps?

Using toupper(3) is totally fine.
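
To illustrate what toupper(3) does and does not provide, here is a
minimal sketch in C.  It is not groff code: the function name
upcase_bytes() is hypothetical, and the sketch assumes the program
never calls setlocale(3), so the default "C" locale applies and only
the ASCII letters a-z are mapped.  Because toupper(3) returns exactly
one character for each character passed in, it cannot express a
1-to-2 mapping like "ß" to "SS" either; the two bytes of a UTF-8
encoded "ß" simply pass through unchanged.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of groff: uppercase a NUL-terminated
 * string in place, one byte at a time, using toupper(3).
 * Returns the number of bytes that were changed.
 */
size_t
upcase_bytes(char *s)
{
	size_t changed = 0;

	assert(s != NULL);
	for (; *s != '\0'; s++) {
		/* toupper(3) requires a value representable as unsigned char. */
		int c = toupper((unsigned char)*s);

		if (c != (unsigned char)*s) {
			*s = (char)c;
			changed++;
		}
	}
	return changed;
}
```

Called on the UTF-8 string "straße", this sketch would uppercase the
five ASCII letters and leave the esszett alone, which is the "slightly
ugly" result discussed above.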

What John said, though, was something very different, and i strongly
object to what he said.  He said that .tr should attempt to emulate
the behaviour of the POSIX tr(1) utility.  No, it should absolutely
not attempt that.

During the recent OpenBSD hackathon in Ljubljana, i spent an hour or
two with Martijn van Duren in order to figure out what needs to be
done to make the OpenBSD tr(1) utility POSIX-compliant.

The conclusion at the end of these one or two hours was that merely
drafting a complete list of problems and tasks in tr(1) would already
cause an excessive amount of work.  It was clear from the start
that full POSIX compliance is impossible to implement - in particular
the feature John suggested to use, collation support, is terribly
hard to implement and not feasible without excessive complexity.
Even FreeBSD implemented that only two years ago, and i heard much
groaning and cursing from the poor guys who had to do it.  OpenBSD
will definitely not implement that insanity.  But what did surprise
us is that even when taking it for granted that collation support
is not feasible, making a complete draft of how to implement the
rest of POSIX tr(1), i.e. all the parts feasible without collation
support, is also quite hard, and implementing even just that rest
would be way beyond what could be done in a one-week hackathon, so
even drafting that list would be a waste of time.  The tr(1) utility
is no doubt among the most ill-designed parts of the POSIX "shell
and utilities" specification.

On top of that, tr(1) is notorious for portability issues; the
STANDARDS section of the manual page is of rather unusual length:

  https://man.openbsd.org/tr.1#STANDARDS

We recently fixed a bug in the groff build system related to such
tr(1) portability problems, and i just saw that both the problem
and the solution we selected are actually documented in the manual...
But that is far from the only problem with it.

Yours,
  Ingo


