bug-datamash

From: Tim Rice
Subject: Re: [PATCH] Fixed incomplete and incorrect treatment of comments and trailing whitespace
Date: Fri, 27 May 2022 22:08:16 +0000

Hey all,

Sorry I'm late to the party. I'm now ready to give this topic some attention.

- Handle trailing whitespace correctly:
  https://github.com/dkogan/datamash/commit/19f9bff1df89f24ccc5a957f0175ef1c32559caa

Note that, in future, I think we should stick to keeping proposed patches on 
this list, rather than GitHub. There are a few reasons for that:

* GitHub's service is based on non-Free Software and so is not consistent with 
the values of a GNU project. Datamash benefits from being under the aegis of 
GNU, and we should reciprocate the friendship.
* People shouldn't need to go to GitHub to participate in discussions about GNU 
Datamash.
* Patches on the mailing list help keep a central history of all Datamash 
discussions, without relying on external services which might go down, delete 
the content, or change access rights on a whim. And even if the mailing 
list service itself has issues, all participants can still retain copies of the 
emails they received.
* Consistency in how contributions are made reduces confusion.


Anyway, on to the patch itself.

I wonder what people think the correct behavior should be for data generated 
like so:

```
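#!/bin/bash
data=testing.txt
# Write two lines: one with two fields, one with a single field.
cat > "$data" << EOF
bar 5
bbb
EOF
# Append three trailing spaces to line 2 only.
sed -i '2s/$/   /' "$data"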
```

That is,

bar 5
bbb

The second line ("bbb") has trailing spaces. At the moment, Datamash handles this in a 
way that is arguably correct:

```
$ ./datamash -W transpose < ~/tmp/testing.txt
bar     bbb
5
```

Furthermore, if trailing whitespace is a problem for you, it can easily be 
removed with sed. I'm not convinced that datamash should need to handle all 
aspects of cleaning up messy data.
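
For anyone who wants the sed route spelled out, here is a minimal sketch. It assumes a POSIX sed; the helper name `strip_trailing_ws` and the temporary file names are made up for illustration and are not part of datamash.

```shell
#!/bin/bash
# Hypothetical helper: delete any run of whitespace at the end of each line.
# Reads the named files, or stdin when no file is given.
strip_trailing_ws() {
    sed 's/[[:space:]]*$//' "$@"
}

# Recreate the sample input in a scratch directory:
# the second line ends in three spaces.
tmpdir=$(mktemp -d)
printf 'bar 5\nbbb   \n' > "$tmpdir/testing.txt"

# Write a cleaned copy with the trailing spaces removed.
strip_trailing_ws "$tmpdir/testing.txt" > "$tmpdir/cleaned.txt"
```

The cleaned stream can also be piped straight into datamash, along the lines of `strip_trailing_ws testing.txt | datamash -W transpose`, so the cleanup stays a separate step in the pipeline.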

~ Tim


