[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn
From: |
Leonard Janis Robert König |
Subject: |
bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt? |
Date: |
Sat, 13 Feb 2021 21:28:49 +0100 |
User-agent: |
Evolution 3.38.4 (by Flathub.org) |
On Sat, 2021-02-13 at 21:15 +0100, Erik Auerswald wrote:
> Hi,
>
> On 13.02.21 19:29, Leonard Janis Robert König wrote:
> >
> > first: Thank you very much for the work, I really owe you one!
>
> You're welcome. :-)
>
> > On Sat, 2021-02-13 at 17:58 +0100, Erik Auerswald wrote:
> > > On 13.02.21 15:17, Erik Auerswald wrote:
> > > > On 11.02.21 20:20, Erik Auerswald wrote:
> > > > > On Thu, Feb 11, 2021 at 06:09:28PM +0100, Leonard Janis
> > > > > Robert
> > > > > König
> > > > > wrote:
> > > > > > On Thu, 2021-02-11 at 16:45 +0100, Erik Auerswald wrote:
> > > > > > > On Thu, Feb 11, 2021 at 04:12:54PM +0100, Leonard Janis
> > > > > > > Robert
> > > > > > > König wrote:
> > > > > > > > On Thu, 2021-02-11 at 13:00 +0100, Erik Auerswald
> > > > > > > > wrote:
> > > > > > > > > On Wed, Feb 10, 2021 at 01:42:29PM +0100, Leonard
> > > > > > > > > Janis
> > > > > > > > > Robert
> > > > > > > > > König wrote:
> > > > > > > > > > I'm sorry if I this is not a bug but to be
> > > > > > > > > > expected,
> > > > > > > > > > but I thnk
> > > > > > > > > > pr doesn't get the alignment of tabs in multicolumn
> > > > > > > > > > output
> > > > > > > > > > right. [...] This seems *kind* of related to
> > > > > > > > > > multi-
> > > > > > > > > > column
> > > > > > > > > > merged output, as was discussed some years ago
> > > > > > > > > > here:
> > > > > > > > > > https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00121.html
> > > > > > > > >
> > > > > > > > > This thread contains the bug-introducing patch in
> > > > > > > > > message
> > > > > > > > > https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00160.html
> > > > > > > > >
> > > > > > > > > This is commit
> > > > > > > > > 553d347d3e08e00ee4f9df520b37c964c3f26e28.
> > > > > > > > > That commit removed the 'assume -e' part of the POSIX
> > > > > > > > > description
> > > > > > > > > of the -COLUMN option from GNU pr.
> > > > > > > > [...]
> > > > > I have found a fix to the problem described by you. I am
> > > > > quite
> > > > > sure that
> > > > > this is not *correct*, but I did not find a way to make
> > > > > print_sep_string()
> > > > > account for tabs that did not break quite a few existing
> > > > > tests,
> > > > > even if
> > > > > the merged files problem from 2007 and this columnating bug
> > > > > were
> > > > > both
> > > > > fixed. Thus I just tighten the 2007 bug fix to apply in less
> > > > > cases.
> > > > > This way all existing tests pass, and a new one pertaining to
> > > > > this bug
> > > > > report passes, too. I do think this is in the same spirit as
> > > > > the
> > > > > "fix"
> > > > > from 2007 (commit 553d347d3e08e00ee4f9df520b37c964c3f26e28).
> > > >
> > > > I think the attached patch is a better fix than my previous
> > > > one,
> > > > because it applies the special treatment of TAB as separator
> > > > more
> > > > consistently. It may still not be complete (the code seems
> > > > quite
> > > > convoluted to me) but I do think it improves the situation
> > > > significantly, and does not make it worse.
> >
> > Hm, I'm not sure whether I understand this special case. When we
> > have
> > a tab as column separator, doesn't this imply that the second
> > column is
> > starting on a position n*8, (effectively equivalent to the first
> > column), thus guaranteeing that the alignment is honored? So, if
> > my
>
> Whatever the reason (perhaps conforming to POSIX, perhaps other pr
> implementations doing the same), GNU pr implements a special
> treatment for TAB as column separator, and the thread from 2007
> implies that the pr from HP-UX does as well.
>
> The POSIX spec says:
>
> "-s[char]
>
> Separate text columns by the single character char instead of
> by the appropriate number of <space> characters (default for
> char shall be <tab>)."
>
> https://pubs.opengroup.org/onlinepubs/9699919799/utilities/pr.html
>
> So use of -s needs to always result in one separator character
> between columns.
>
> This is implemented by GNU pr, and seemingly by pr from HP-UX, too
> (
> https://lists.gnu.org/archive/html/bug-coreutils/2007-03/msg00121.html
> )
>
> Of all the printable ASCII characters, only TAB results in
> interactions with "Tabification," i.e., turning TABs into spaces
> on input and spaces into TABs on output. Thus only TAB as
> separator may require the special treatment of disabling
> "Tabification."
>
> Omitting this special treatment resulted in the bug from 2007.
>
> Removing the implicit "-e" and "-i" from "-NUMBER" and "-m"
> to fix the 2007 bug resulted in this bug (bug#46422), and does
> not conform to the POSIX specification nor to the GNU pr info
> documentation.
>
> My v3 patch restricts this special treatment of "-s" to just
> the cases where it is used without specifying a separator and
> thus using the default of TAB, or when it is used with a single
> TAB ("-s$'\t'"). Thus it restricts the 2007 change from commit
> 553d347d3e08e00ee4f9df520b37c964c3f26e28 to affect only those
> use cases it should affect, instead of all multi-column use cases.
>
> It may be possible to add some appropriate special treatment for
> TAB as separator without disabling "Tabification." But I do not
> know how. Just accounting for the output position change resulting
> from printing a TAB in print_sep_string() does not work, i.e.,
> breaks many of the existing tests.
This makes sense then, yes. In theory, this'd be the best approach,
but I don't see that happening with the current code base... . The
issue is less with enabling/disabling tabification, since a tab
character should, as I wrote, always align the second column -- but
with the rather weird behavior of pr sometimes "miscounting" that I
outlined a few mails earlier, after which I gave up fixing :-)
>
> > [...]
> > That being said, I don't see this exact distinction reflected in
> > the
> > code, so perhaps I just misunderstood.
>
> Disabling "Tabification" only when "-s" was active is missing. That
> resulted in the 2007 bug. Making the needed special treatment always
> used fixed the 2007 bug, but broke your use case.
>
> That some special treatment is needed and intended can be gleaned
> from the following comment (with line numbers from pr.c in the
> current master branch @ 2de30c7350a77b091afa1eb284acdf082c0f6aa5):
>
> 1031 /* It's rather pointless to define a TAB separator with column
> 1032 alignment */
>
> My patch adds the special treatment, since it works both for the 2007
> bug and this bug (bug#46422).
>
> > > It seems to me as if "untabify_input = true;" should be re-
> > > introduced
> > > in one additional place to fix the regression from commit
> > > 553d347,
> > > please see the attached patch version 3.
> > >
> > > > I'd like to ask the GNU Coreutils maintainers to consider
> > > > merging
> > > > the attached patch.
> > >
> > > The latest version, i.e., v3 for now.
> >
> > I can only second this, with the patch my rather obscure (and
> > complex)
> > use case of printing thousands of lines of code works properly now!
>
> Thanks for testing!
Thanks all to you
~leo
- bug#46422: 'pr' screws up tabstops in multicolumn outpt?, (continued)
- bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Erik Auerswald, 2021/02/10
- bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Erik Auerswald, 2021/02/11
- Message not available
- Message not available
- bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Leonard Janis Robert König, 2021/02/11
- bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Erik Auerswald, 2021/02/11
- bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Erik Auerswald, 2021/02/13
- bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Erik Auerswald, 2021/02/13
- bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Leonard Janis Robert König, 2021/02/13
- bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Erik Auerswald, 2021/02/13
- bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?,
Leonard Janis Robert König <=
- bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Erik Auerswald, 2021/02/14
- bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Leonard Janis Robert König, 2021/02/14
- bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Pádraig Brady, 2021/02/14
- bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Erik Auerswald, 2021/02/15
- bug#46422: [PATCH] Re: bug#46422: 'pr' screws up tabstops in multicolumn outpt?, Pádraig Brady, 2021/02/15