groff
Re: *roff hyphenation trivia challenge


From: G. Branden Robinson
Subject: Re: *roff hyphenation trivia challenge
Date: Tue, 2 Apr 2024 12:00:25 -0500

At 2024-04-02T18:51:51+0200, Tadziu Hoffmann wrote:
> > > Also interesting to see that in this word, the hyphenation
> > > patterns don't suggest a hyphenation opportunity after "anti".
> 
> > The leading `\%` prevents that.
> 
> Sorry, I meant even without "\%".  With a line length of 1 en,
> and without any "\%" at all, groff prints
> 
>   an-
>   tidis-
>   es-
>   tab-
>   lish-
>   men-
>   tar-
>   i-
>   an-
>   ism

Yes.

> and Heirloom troff prints
> 
>   an-
>   tidises-
>   ta-
>   blish-
>   men-
>   tari-
>   an-
>   ism
> 
> TeX gives the same as groff since it uses the same
> hyphenation patterns (groff borrowed them from TeX).

Yes.  And as you noted, it's weird that these patterns don't admit
"anti-".

> For "antidisestablishmen\%tarianism", groff prints
> 
>   antidisestablishmen-
>   tar-
>   i-
>   an-
>   ism
> 
> (which I think is strange),

Yeah, that's a bug.  I think the "don't automatically hyphenate this
word" flag is getting unconditionally reset after an output line is
flushed, and it should not be.
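
To make that hypothesis concrete, here is a toy model in Python.  It is
not groff's code; `break_word`, `reset_flag_on_flush`, and the offset
lists are names invented for this illustration.  The automatic points
are transcribed from the groff output quoted earlier in this message,
and the line is assumed to be so narrow that every available break is
taken.

```python
WORD = "antidisestablishmentarianism"
# Offsets of the automatic breaks in an-tidis-es-tab-lish-men-tar-i-an-ism.
AUTOMATIC = [2, 7, 9, 12, 16, 19, 22, 23, 25]
# Offset of the single \% in antidisestablishmen\%tarianism.
EXPLICIT = [19]


def break_word(letters, explicit, automatic, reset_flag_on_flush):
    """Break `letters` at the first permitted point, line after line.

    A word containing an explicit point starts with automatic
    hyphenation suppressed.  If `reset_flag_on_flush` is true, that
    suppression is (wrongly) dropped once the first line is flushed.
    """
    lines = []
    start = 0
    suppress_automatic = bool(explicit)
    while True:
        candidates = set(explicit)
        if not suppress_automatic:
            candidates |= set(automatic)
        later = sorted(p for p in candidates if start < p < len(letters))
        if not later:
            lines.append(letters[start:])
            return lines
        lines.append(letters[start:later[0]] + "-")
        start = later[0]
        if reset_flag_on_flush:
            suppress_automatic = False


buggy = break_word(WORD, EXPLICIT, AUTOMATIC, reset_flag_on_flush=True)
fixed = break_word(WORD, EXPLICIT, AUTOMATIC, reset_flag_on_flush=False)
print("\n".join(buggy))
print("\n".join(fixed))
```

With the reset enabled, the model reproduces the five-line output
quoted above; with it disabled, only the explicit break remains.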

> while TeX and Heirloom troff print
> 
>   antidisestablishmen-
>   tarianism
> 
> which I think is the only reasonable way of handling this case.

Agreed.

> (I remember in Word it was only possible to add additional
> hyphenation points, but not to inhibit existing ones, which
> is a terrible idea if one of the builtin ones turns out to
> be wrong.)

Yup.  So our example document need only do this:

$ hyphen 'anti\%disestablishmentarianism'
anti‐dis‐es‐tab‐lish‐men‐tar‐i‐an‐ism

https://froude.eu/groff/examples/hyphenation-points.html

> For "\%antidisestablishmen\%tarianism", Heirloom troff does not
> hyphenate at all (even if the word contains additional "\%"),
> whereas groff and TeX do the same as they did with only the
> inner "\%".  (Also, "\&" is not a letter, so a leading "\&"
> should not influence hyphenation at all.)

Agreed.  `\&` is not a letter, but it works like a letter in some ways,
namely in suppressing end-of-sentence detection.  It's truly a magical
device.

https://www.gnu.org/software/groff/manual/groff.html.node/Dummy-Characters.html

> With *only* the leading "\%", ("\%antidisestablishmentarianism"),
> none of the formatters hyphenates, which is correct.

Agreed.

> Of the three formatters, TeX's behavior appears to be the most
> sensible to me, i.e., if the word contains one or more "\%",
> *only* those points (but all of them) will be considered
> for hyphenation.

Yes.  Let's fix that.
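
A minimal sketch of the rule described in the quoted paragraph, in
Python.  This is an illustration under stated assumptions, not code
from groff or TeX: `break_points` and `hyphenate` are hypothetical
helpers, and the automatic points are passed in by the caller (here
transcribed from the groff output quoted earlier) rather than computed
from hyphenation patterns.

```python
HYPHEN_ESC = r"\%"
# Offsets of the automatic breaks in an-tidis-es-tab-lish-men-tar-i-an-ism.
GROFF_POINTS = [2, 7, 9, 12, 16, 19, 22, 23, 25]


def break_points(word, automatic_points=()):
    """Return (letters, offsets) for a word that may contain \\% escapes.

    If the word contains any \\%, only those positions are kept.  A \\%
    at the very start or end yields no usable break, so a word with
    only a leading \\% is never hyphenated.
    """
    pieces = word.split(HYPHEN_ESC)
    letters = "".join(pieces)
    if len(pieces) == 1:
        # No \% present: fall back to the automatic points.
        points = automatic_points
    else:
        points = []
        offset = 0
        for piece in pieces[:-1]:
            offset += len(piece)
            points.append(offset)
    usable = sorted({p for p in points if 0 < p < len(letters)})
    return letters, usable


def hyphenate(word, automatic_points=()):
    """Join the segments between break points with '-' for display."""
    letters, points = break_points(word, automatic_points)
    parts = []
    start = 0
    for p in points:
        parts.append(letters[start:p])
        start = p
    parts.append(letters[start:])
    return "-".join(parts)


print(hyphenate(r"antidisestablishmen\%tarianism", GROFF_POINTS))
print(hyphenate(r"\%antidisestablishmentarianism", GROFF_POINTS))
```

The first call prints antidisestablishmen-tarianism and the second
prints the word unbroken, matching the cases discussed above.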

Regards,
Branden
