
Re: [BUG] italics run past where they should


From: Alejandro Colomar
Subject: Re: [BUG] italics run past where they should
Date: Thu, 21 Jul 2022 12:46:31 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.0.2



On 7/21/22 11:25, G. Branden Robinson wrote:
Hi Alex,

At 2022-07-20T16:58:34+0200, Alejandro Colomar wrote:
I'm not sure if this is a groff(1) bug, or less(1), or who knows...

From your description I suspect a bug either in less(1) or your terminal
emulator.

Yup, I also had that feeling, but the only invariable thing in my experiments is groff(1). However, I considered the possibility that a slightly imperfect ordering of escape sequences might normally be interpreted correctly, but incorrectly after the search does some highlighting... Or that the combination of escape sequences used by less(1) and groff(1) might be incompatible with each other?

I generalized the reproduction of the bug:

In any line with alternating highlighting (.BI), searching for a _whole_ italic word will trigger the bug.


I've seen it sporadically, but when I tried to reproduce it, I didn't
remember how I had triggered it, so I couldn't report it.  Now I can
consistently reproduce it.

I _can't_ reproduce it.  I am using less 581.2 and XTerm #370 from
Debian bullseye.

I can reproduce it in both xfce4-terminal and xterm, so I'd initially rule out a bug in those (although it could be a bug in both...)

I also tried another pager, just to see if I could still reproduce it:
batcat (apt-get install bat) also reproduces the issue.

$ man -P batcat clone
/flags


I'm running Sid.

The versions of the software involved are:


$ xterm -version
XTerm(372)
$ xfce4-terminal --version | head -n1
xfce4-terminal 1.0.4 (Xfce 4.16)
$ less --version | head -n1
less 590 (GNU regular expressions)
$ batcat --version
bat 0.21.0
$ groff --version | head -n1
GNU groff version 1.23.0.rc1.2366-0328
$ man --version
man 2.10.2


I'd expect the issue to be in less(1), because of how I trigger it,
but it's weird, because I can't reproduce it with mandoc and less, so
I attribute it to groff(1) for the moment.

Something to keep in mind is that grotty(1) (by default[1]) and
mandoc(1) take different approaches to terminal capabilities.  The
latter's maintainer, Ingo Schwarze, has on this mailing list declaimed a
distaste for ISO 6429 (a.k.a. ECMA-48) escape sequences, so mandoc(1)
produces bold and italics (actually, underlining) by overstriking, i.e.,
including backspace literals in its output.  VT100-ish terminal
emulators honor these but don't _interpret_ them--they do what is
commanded quite literally, destructively backspacing and replacing
character cell contents, with the result that neither bold nor
underlined characters appear as such.  So this styling information
disappears.
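
To make the overstriking convention concrete, here is a minimal sketch, assuming only a POSIX shell with printf and sed. The strings and the helper name strip_overstrike are made up for illustration; this is not output captured from mandoc(1).

```shell
# Bold "hi" by overstriking: each character, a backspace, the same character.
bold=$(printf 'h\bhi\bi')
# Underlined "hi": an underscore, a backspace, then the character.
under=$(printf '_\bh_\bi')

# Hypothetical helper: delete every "character + backspace" pair,
# which leaves only the final character written to each cell.
strip_overstrike() {
    backspace=$(printf '\b')
    sed "s/.${backspace}//g"
}

printf '%s\n' "$bold"  | strip_overstrike   # prints: hi
printf '%s\n' "$under" | strip_overstrike   # prints: hi
```

A terminal effectively does what strip_overstrike does, which is why the styling is lost there, whereas a pager can inspect both characters of each pair and infer bold or underline from them.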

The less(1) program interprets these sequences _and translates them into
ECMA-48 escape sequences_, recovering the "graphic renditions" from the
input stream, as the standard would put it.

less(1) also, however, _refuses_ by default to interpret those same
escape sequences, which it happily produces, when they occur on its
input stream.  Some people, like Ingo, claim this to be advantageous for
security reasons; I don't know if Mark Nudelman himself does.  I am
dubious.

Because people almost always view man pages on the terminal via a pager
program, they come to confuse terminal capabilities with pager
capabilities, and thus they quite wrongly insist that grotty(1) is
incorrect to emit ECMA-48 escape sequences even though every (practically
every?) terminal emulator in use on *nix systems honors the basic subset
of them that encodes renditions for bold and underlining.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=312935

(This observation withstands even the fact that some terminal emulators
can't _render_ such styles; the Linux console driver on my system, for
example, won't show you bold, underscored, or bold underscored text, but
clearly _recognizes_ them because it renders each in a different color.)
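
For reference, the basic subset mentioned above can be sketched as follows, assuming a POSIX shell. The sample string and the helper name strip_sgr are illustrative only; the parameters are the standard ECMA-48 SGR ones (1 for bold, 4 for underline, 0 to reset).

```shell
esc=$(printf '\033')

# SGR 1 = bold on, SGR 4 = underline on, SGR 0 = reset all renditions.
styled=$(printf '%s[1mint%s[0m %s[4mflags%s[0m' "$esc" "$esc" "$esc" "$esc")

# Hypothetical helper: remove every "ESC [ parameters m" sequence
# to recover the plain text underneath.
strip_sgr() {
    sed "s/${esc}\[[0-9;]*m//g"
}

printf '%s\n' "$styled"               # a terminal shows bold "int", underlined "flags"
printf '%s\n' "$styled" | strip_sgr   # prints: int flags
```

Written straight to a terminal, the first line is styled by the terminal itself; no pager is involved, which is the distinction between terminal and pager capabilities drawn above.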

So, to trigger the bug, do the following potion:

# I reproduced it in clone(2), installed from my git tree,
# but I also reproduced it with the clone(2) from my system,
# which happens to be manpages-dev 5.13-1, so it should be easily
# reproducible.

$ man clone

# Now have a look at the synopsis.
# You'll notice (or actually not notice) no weird formatting,
# because there isn't any.

# Then, within less(1), search for 'flags' with /flags

/flags

# This should change the underscoring of some words after the match.

groff does not re-render the page because you did a search.  groff and
grotty have exited (or blocked waiting to write to a pipe) by the time
the pager runs.

So my suspicion is that some state has gotten desynchronized in less's
idea of the screen contents, or in your terminal emulator's.

# If you close and open again the man page,
# you'll see the good formatting again.

This is consistent with my hypothesis.

# If I run the following command, then I can't reproduce it,
# which is why I suspect that it's a problem in groff(1).

$ man -w clone | xargs mandoc | less
/flags

This, too, is consistent with my hypothesis.  To try to verify it, you
might re-render the page using groff (via man(1) is fine) with
GROFF_NO_SGR=1 in your environment.


The following command fixes the bug:

$ GROFF_NO_SGR=1 man clone


I hope this investigation helps you reproduce it, or at least have a more precise idea of what the bug might be. If you need me to test some moar commands, please tell me :)


Cheers,

Alex



If doing so makes the bug similarly go away for groff, then you know
that the problem is with the generation of ECMA-48 escape sequences by
grotty(1), or with the maintenance of screen buffer state by less(1) or
your terminal emulator.

I feel that the former is unlikely because I use grotty(1) many times a
day and it is quite simple in its production of ECMA-48 escape
sequences; it doesn't test the value of $TERM, terminal capabilities, or
anything like that.  It fires blindly, which can be criticized, but for
this scenario has the virtue of telling us that if it did have this sort
of problem, many other people would see many more defects in its output
constantly.

I do not rule out a defect in grotty's ECMA-48 sequence production; I
simply think such is an unlikely explanation for the problem you
describe.  If there is such a defect in grotty, it is more simply
established by examining a hex dump of the program's output.  ECMA-48
escape sequences are recondite but decipherable[2].  If there is such a
bug, I am strongly motivated to fix it.
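
As a lighter-weight alternative to a full hex dump, the escape bytes can simply be made visible. This is a minimal sketch assuming a POSIX shell; the helper name show_escapes and the sample input are hypothetical, not real grotty(1) output.

```shell
# Hypothetical helper: replace each ESC byte (octal 033) with the
# visible text "<ESC>" so the sequences can be read and paired up.
show_escapes() {
    esc_byte=$(printf '\033')
    sed "s/${esc_byte}/<ESC>/g"
}

# Hand-made sample: underline on (SGR 4), a word, underline off (SGR 24).
printf '\033[4mflags\033[24m\n' | show_escapes
# prints: <ESC>[4mflags<ESC>[24m
```

Feeding it a real page, for instance with something like "groff -man -Tutf8 clone.2 | show_escapes" (the file name is a placeholder), would let one check that every sequence turning a rendition on is followed by one turning it off.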

Regards,
Branden

[1] Debian switches this default around and (IMO, confusingly) adds
     another environment variable for it, GROFF_SGR.  (When I
     corresponded with Gavin Smith of Texinfo about similar issues, my
     ignorance of this downstream change made him think me quite stupid
     :-O ).  I have my system configured to restore the upstream default.
     Among other things, doing so enables me to view man pages in the
     terminal with a true italic style (well, oblique at any rate) by
     passing grotty "-P -i".

[2] https://www.ecma-international.org/wp-content/uploads/ECMA-48_5th_edition_june_1991.pdf

--
Alejandro Colomar
<http://www.alejandro-colomar.es/>
