groff
Re: Escaping hyphens ("real" minus signs in groff)


From: Alejandro Colomar
Subject: Re: Escaping hyphens ("real" minus signs in groff)
Date: Sun, 7 Mar 2021 01:06:04 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.7.1

Hey Michael & Branden!

On 1/22/21 4:56 AM, G. Branden Robinson wrote:
> Hi Michael!
> 
> At 2021-01-21T12:03:13+0100, Michael Kerrisk (man-pages) wrote:
>> I appreciate your long answer *very* much. But, I'm glad you started
>> with the short answer :-).
> 
> Cool!  But beware, from such pressures is the practice of top-replying
> born...  ;-)
> 
>>> Another issue to consider is that as PDF rendering technology has
>>> improved on Linux, it has become possible to copy and paste from PDF
>>> documents into a terminal window.  In my opinion we should make this
>>> work as well as we can.  Expert Linux users may not ever do this,
>>> wondering why anyone would ever try; new Linux users will quite
>>> reasonably expect to be able to do it.
> [...]
>>> And I mean copy-and-paste not just from PDF but from a terminal
>>> window.
>>
>> Yes, but I have a question: "\-1" renders in PDF as a long dash
>> followed by a "1". This looks okay in PDF, but if I copy and paste
>> into a terminal, I don't get an ASCII 45. This seems to contradict
>> what you are saying about cut-and-paste above. What am I missing?
> 
> The gap between aspiration and implementation.  I don't think the
> "copy-and-paste from PDF to terminal window" matter is completely sorted
> out yet.
> 
> I'm a strident prescriptionist about preserving the distinction between
> "-" and "\-" in roff documents, notably including man pages, in part
> because it affords us more room to design around this problem.
> 
> ASCII and ISO 8859 unified the hyphen and minus characters.  AT&T troff
> and all of its descendants distinguished them.  Unicode also
> distinguishes them.  But Unix has a habit of calling ASCII 055 (45
> decimal) a "dash", and moreover, to much software, only the numerical
> value of the code point is important.
> 
> It's quite possible that for man(7) documents rendering to PDF, we
> should perform the following mapping (in the man macros).
> 
> .if '\*[.T]'pdf' \
> .  char \- \N'45'
> 
> This didn't come up in my argument with (mostly?) BSD people because (1)
> the immediate issue that raised concern had to do with the grave accent
> and apostrophe instead and (2) everybody in that camp who spoke up on
> the matter said they seldom, if ever, render man pages to PostScript or
> PDF.  By that token, the above 2-liner may not be a controversial matter
> to the people I was arguing with.  :)
> 
> Consider what would happen to the appearance of PDF-rendered man pages
> if we encouraged all \- escaped hyphens to be rewritten as plain hyphens
> in the source first, and did the following to mandate uniformity.
> 
> .if '\*[.T]'pdf' \{\
> .  char \- \N'45'
> .  char - \N'45'
> .\}
> 
> ...just as is currently done for the 'utf8' output driver, whose second
> line I want to kill off.
> 
> I feel that responsible stewardship of the groff man macro
> implementation means considering the needs of diverse audiences.
> 
>> I don't really have any other questions, but I have tried to distill
>> the above into some text in man-pages(7) to remind myself for the
>> future:
>>
>> [[
>> .PP
>> The use of real minus signs serves the following purposes:
>> .IP * 3
>> To provide better renderings on various targets other than
>> ASCII terminals,
>> notably in PDF and on Unicode/UTF\-8-capable terminals.
>> .IP *
>> To generate glyphs that when copied from rendered pages will
>> produce real minus signs when pasted into a terminal.
>> ]]
>>
>> Seem okay?
> 
> What a "real minus sign" is is a fraught issue[1], but if for the
> purposes of man-pages(7) it means the ASCII/ISO hyphen-minus, then yes,
> I think it's good enough.
> 
> Regards,
> Branden
> 
> [1] especially in light of the \[mi] special character escape and the
>     existence of U+2212 :-/
> 

I just found another good reason to use '\-'.  I was searching for a
curl option in its man page using '/    -s', as I usually do when I
search for options.  To my surprise, it found nothing; in fact, '/-'
matched only two occurrences of the minus sign.  However, if I copy the
dash character from one of the options and paste it into the pager's
search prompt, the search does find the options.  I have already
reported the bug to the curl maintainers.

I checked that in our pages, we can search options (see time.1).  I
wonder if there are some cases where we're producing some weird
character that can't be easily searched for.
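
One way to check for such cases might be to scan a rendered page for
dash look-alikes.  The following is only a rough sketch in Python, not
something we use today: the function name find_dash_lookalikes and the
list of code points are my own assumptions, and the list is surely
incomplete.  It reports every character that looks like a dash but is
not ASCII 45, which is what a typed '-' in the pager search would fail
to match.

```python
import unicodedata

# Dash-like code points that resemble ASCII hyphen-minus (U+002D) on a
# terminal but will not match a "-" typed into a pager search.
# This set is an assumption for illustration; extend it as needed.
DASH_LOOKALIKES = {
    "\u2010",  # HYPHEN
    "\u2011",  # NON-BREAKING HYPHEN
    "\u2012",  # FIGURE DASH
    "\u2013",  # EN DASH
    "\u2014",  # EM DASH
    "\u2212",  # MINUS SIGN
}


def find_dash_lookalikes(text):
    """Return (line, column, Unicode name) for each look-alike dash.

    Line and column numbers are 1-based.  ASCII hyphen-minus is never
    reported, since it is the character a user can type and search for.
    """
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in DASH_LOOKALIKES:
                hits.append((lineno, col, unicodedata.name(ch)))
    return hits


if __name__ == "__main__":
    import sys

    # Read rendered page text from standard input and list suspects.
    for lineno, col, name in find_dash_lookalikes(sys.stdin.read()):
        print(f"{lineno}:{col}: {name}")
```

Saved under a hypothetical name such as find_dashes.py, it could be fed
the output of man(1) in a UTF-8 locale, for example
'man curl | python3 find_dashes.py'.  An option flagged as HYPHEN would
suggest an unescaped '-' in the page source.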

Regards,

Alex

-- 
Alejandro Colomar
Linux man-pages comaintainer; https://www.kernel.org/doc/man-pages/
http://www.alejandro-colomar.es/


