groff

Re: [PATCH] *.man: Break URIs at points specified by the Chicago Style


From: Alejandro Colomar (man-pages)
Subject: Re: [PATCH] *.man: Break URIs at points specified by the Chicago Style
Date: Mon, 18 Oct 2021 09:37:46 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.14.0

Hi, Branden!

On 10/18/21 7:34 AM, G. Branden Robinson wrote:
Hi, Alex!

At 2021-10-17T21:33:24+0200, Alejandro Colomar wrote:
Break URIs before a single slash, not after.

I found no GNU-specific (or any other at all) source that recommends
breaking long URIs after a slash.  So follow Chicago Style and
break them before single slashes.

As far as I'm aware there is no such source.

Thus does it fall to me to blaze a trail.

I admit that it had not occurred to me until recently why breaking after
slashes is better than breaking before them.

1. A slash is not confusable as end-of-sentence punctuation as a dot is.
    In fact, it signals sentence continuation even if the URI context is
    missed or forgotten.

2. URIs can validly, and in fact commonly do, end with single slashes.

2a. Corollary: Inserting a break before slashes therefore invites the
     formatter to break a URI such that a single slash is set on the next
     line, or, if you don't have widow/orphan control, on a subsequent
     column or page.

I read the same argument some time ago, but for the converse case. For the URI <file:///asd/asd/asd/.>, if you break blindly after slashes, you'll end up with a single dot on the next line, which may look accidental at best. But <> came to solve these issues, and one could also just not insert a break in such a case, so I don't have strong arguments in either direction, since both have their problems.

But you're right, a trailing '/' is more common than a trailing '/.'.


2b. Corollary: Multiple trailing slashes at the end of a URI, even when
     valid (which is rare), are vanishingly uncommon.  Therefore, breaking
     before slashes buys you at most one character cell of room on a line
     that must be broken (modulo any trailing punctuation, but that is
     under user control in the source document).  Moreover, in that very
     case, the lone trailing slash on the next output line is at risk of
     creating confusion or being mistaken as an error.  But in fact,
     trailing slashes on URIs are semantically significant[1], and a
     reader who is confident that they didn't overlook the trailing slash
     on the next (line, column, page) when they copy-and-paste such a URI
     is at risk of retrieving the wrong resource.

This last sentence seems important. Enough to ignore Chicago. I'll follow your advice for new patches to the man-pages, and if there aren't many, fix the existing ones.
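
To make the point about semantic significance concrete, here is a minimal sketch using Python's standard urllib.parse.urljoin.  The example.com URLs are made up for illustration; the only thing being shown is that relative references resolve differently depending on whether the base URI ends in a slash.

```python
from urllib.parse import urljoin

# Hypothetical base URIs that differ only in the trailing slash.
base_with_slash = "http://example.com/docs/man-pages/"
base_without_slash = "http://example.com/docs/man-pages"

# With the trailing slash, "man-pages" is treated as a directory,
# so the relative reference is resolved inside it.
inside = urljoin(base_with_slash, "index.html")

# Without it, "man-pages" is the last path segment and gets replaced.
sibling = urljoin(base_without_slash, "index.html")

print(inside)   # http://example.com/docs/man-pages/index.html
print(sibling)  # http://example.com/docs/index.html
```

A reader who drops the slash while copying a URI can therefore end up at a different resource, or at a different set of relative links.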


3. One might concede the above and still say that it's worth meeting
    Chicago (more than halfway) by applying their breaking rule to every
    slash in a URI _except_ the last.  But having a different breaking rule
    for a trailing slash (or group of slashes) in a URI is more tedious
    to remember and possibly implement.  The sed expressions you crafted
    are pretty simple, and are made no more complex by shifting the
    location of the break point; that's an advantage worth preserving.

Yep.


I've written the following new material for the groff_man_style(7) page.

[[
        URIs can be lengthy; rendering them can result in jarring adjust‐
        ment  or  variations in line length, or troff warnings when a hy‐
        perlink is longer than an output line.  The application  of  non-
        printing break point escape sequences \: after each slash (or se‐
        ries  thereof), and before each dot (or series thereof) is recom‐
        mended.  The former practice avoids forcing a trailing slash in a
        URI onto a separate output line, and the latter helps the  reader
        to  avoid  mistakenly interpreting dot(s) at the end of a line as
        periods or ellipses.  Thus,
               .UR http://\:example\:.com/\:fb8afcfbaebc74e\:.cc
        has several potential break points in the URI shown.  The \:  es‐
        cape  sequences  are ignored when supplied to device control com‐
        mands for embedding in hyperlink-aware output drivers.
]]
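
The rule in the draft text above can be sketched in a few lines of Python.  This is only an illustration, not the sed expressions from the patch: the function names add_break_points and strip_break_points are hypothetical, and the choice to skip a break after a slash run that ends the URI is an assumption made here so that no useless break point is emitted.

```python
import re

BREAK = r"\:"  # groff's non-printing break point escape sequence


def add_break_points(uri):
    # Insert \: after each run of slashes and before each run of dots.
    def repl(match):
        run = match.group(0)
        if run[0] == "/":
            # Break after the slashes, unless they end the URI.
            return run + BREAK if match.end() < len(uri) else run
        # Break before the dots, unless they start the URI.
        return BREAK + run if match.start() > 0 else run

    marked = re.sub(r"/+|\.+", repl, uri)
    # A slash run directly followed by a dot run would otherwise
    # receive two adjacent break points; one is enough.
    return marked.replace(BREAK + BREAK, BREAK)


def strip_break_points(text):
    # Recover the plain URI, e.g. for use as a hyperlink target.
    return text.replace(BREAK, "")


example = add_break_points("http://example.com/fb8afcfbaebc74e.cc")
print(example)  # http://\:example\:.com/\:fb8afcfbaebc74e\:.cc
```

The printed result matches the .UR argument in the draft.  A URI with a trailing slash, such as https://www.kernel.org/doc/man-pages/, gets no break point at its very end, so the final slash always stays attached to the preceding text.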


Acked-by: Alejandro Colomar <alx.manpages@gmail.com>

Before I land it, I need to do some homework regarding the portability
of the \: escape, so that I can make honest disclosures in the requisite
addition to the "Portability" subsection of this page.

I guess I have another pin for my Russell Harper voodoo doll now.[2][3]

Please let me know if you find any inconsistencies in our URI breaking
practices in the groff man pages.  I inferred that you said the existing
style was consistent, but I'm not sure and it could have been wishful
reading on my part.  :)

Regards,
Branden

[1] https://stackoverflow.com/questions/5948659/when-should-i-use-a-trailing-slash-in-my-url/
[2] https://www.linkedin.com/in/russell-harper-70394718
[3] https://web.archive.org/web/20171107164742/http://www.heracliteanriver.com/?p=324


Huh! After reading [3], I should print a copy of the Chicago style along with one of the GNU coding standards, and burn them together. As one said, it’s a great symbolic gesture. :-)

Cheers,

Alex


--
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/


