bug-datamash



From: Erik Auerswald
Subject: Re: [PATCH] Fixed incomplete and incorrect treatment of comments and trailing whitespace
Date: Fri, 20 May 2022 09:14:34 +0200

Hi Dima,

On Thu, May 19, 2022 at 11:47:44PM -0700, Dima Kogan wrote:
> Erik Auerswald <auerswal@unix-ag.uni-kl.de> writes:
> 
> > From a quick glance at the code diff in the link, this seems to allow
> > comments inside a field, e.g., with datamash -H -C -t',' and the
> > following input:
> >
> >     # the next line is the header line
> >     one,two,three
> >     # the following line is the data line
> >     1,2#this is the 2nd field,3
> >
> > in the data line the string "#this is the 2nd field" would be skipped,
> > and the data line would have three fields with values 1, 2, and 3.
> >
> > Is that correct (I did not test it)? Is that the intended
> > functionality?
> 
> I haven't tested this yet, but that wasn't my intent. Comments should do
> what they do in perl and python and awk and everywhere else, so the last
> line should be interpreted as "1,2". I may have made a mistake in the
> implementation.

OK, that would be what I expect of such comments (ignore the rest of
the line).

> > Would you like to extend the documentation with a description of how
> > exactly comments are intended to work with -C, --skip-comments?
> 
> Sure, but let's agree on what we're doing first. It sounds like some
> people want to do this differently, and some don't want to do it at all.

I'd expect the details for trailing comments to vary with the field
delimiter in use.  For -W, --whitespace, the following two example lines
should probably work identically:

    1  2  3# comment  ----> "1  2  3"
    1  2  3 # comment ----> "1  2  3"

When using a specific delimiter (I'll use ':' for illustration), trailing
empty fields would be preserved:

    1:2:3# comment  ----> "1:2:3"
    1:2:3:# comment ----> "1:2:3:"

Whitespace inside a field should be treated as before.  I think (but
did not verify) that it is preserved:

    1:2:3 # comment  ----> "1:2:3 "
    1:2:3: # comment ----> "1:2:3: "
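
To make the expected results above concrete, here is a minimal Python
sketch of the semantics described in this mail.  It is not datamash code
and was not derived from the patch; the names strip_inline_comment and
split_fields are hypothetical, and Python's str.split() is only an
approximation of how -W treats runs of whitespace.

```python
def strip_inline_comment(line, comment_char="#"):
    """Drop everything from the first comment character to end of line."""
    return line.split(comment_char, 1)[0]


def split_fields(line, delimiter=None):
    """Split a comment-stripped line into fields.

    delimiter=None approximates -W: runs of whitespace separate fields,
    so whitespace before the comment character does not matter.
    Otherwise the line is split on the exact delimiter, which keeps
    trailing empty fields and any whitespace inside a field.
    """
    data = strip_inline_comment(line)
    if delimiter is None:
        return data.split()
    return data.split(delimiter)


if __name__ == "__main__":
    examples = [
        ("1  2  3# comment", None),
        ("1  2  3 # comment", None),
        ("1:2:3# comment", ":"),
        ("1:2:3:# comment", ":"),
        ("1:2:3 # comment", ":"),
        ("1:2:3: # comment", ":"),
    ]
    for text, delim in examples:
        print(repr(text), "---->", split_fields(text, delim))
```

With these rules the first example from earlier in the thread,
"1,2#this is the 2nd field,3" with ',' as delimiter, yields the two
fields "1" and "2".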

I'd still prefer this new way to support comments to be off by default
and only activated via a new option, e.g., --inline-comments, since it
is not backwards compatible.

Thanks,
Erik


