I kind of feel like the -C option shouldn't exist at all, and anything like removing comments should be done first by sed or whatever. But it's there, and I'd worry that changing its current behavior would break backwards compatibility.
I'm all for the trailing whitespace part, though.
Hi,
On Mon, May 16, 2022 at 09:03:42AM +0200, Erik Auerswald wrote:
> On Sun, May 15, 2022 at 06:06:21PM -0700, Dima Kogan wrote:
> > Addresses two related issues:
> >
> > - Comments that didn't block out a whole line weren't being properly ignored by
> > -C. Lines such as 'bar 5#xxx' didn't ignore the '#xxx' as they were supposed
> > to
>
> I think that would be a new feature. The --help output states:
>
> -C, --skip-comments skip comment lines (starting with '#' or ';'
> and optional whitespace)
>
> As far as I understand the documentation, the -C, --skip-comments option
> was intended to skip complete lines.

Treating any ';' in a line as starting a comment would interfere with
using ';' as field separator. But using ';' as field separator is common
with simple CSV-like formats when the locale's decimal separator is a ','.
I do not think that the -C, --skip-comments behavior of GNU datamash
should change to recognize comments at the end of a data line
unconditionally.
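
To illustrate the concern, here is a minimal Python sketch. It is not
datamash's code: both helper functions and the sample data are
hypothetical. They only model "skip whole comment lines" versus
"cut every line at the first '#' or ';'".

```python
# Hypothetical helpers for illustration only; not datamash's implementation.

def skip_comment_lines(lines):
    """Drop lines whose first non-blank character is '#' or ';'."""
    return [ln for ln in lines if not ln.lstrip().startswith(("#", ";"))]


def strip_inline_comments(lines):
    """Cut each line at the first '#' or ';' and drop lines left empty."""
    out = []
    for ln in lines:
        positions = [i for i in (ln.find("#"), ln.find(";")) if i != -1]
        cut = min(positions, default=len(ln))
        kept = ln[:cut].rstrip()
        if kept:
            out.append(kept)
    return out


# Made-up input: ';' as field separator, ',' as decimal separator.
data = ["# header comment", "a;1,5;x", "b;2,5;y"]

whole_line = skip_comment_lines(data)   # data fields survive
inline = strip_inline_comments(data)    # everything after the first ';' is lost
print(whole_line)
print(inline)
```

With whole-line skipping, both data lines keep all three fields. With
unconditional end-of-line comment stripping, each data line is cut down
to its first field.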
Thanks,
Erik