coreutils

Re: wc enhancement possibility


From: Pádraig Brady
Subject: Re: wc enhancement possibility
Date: Thu, 30 Jun 2016 09:46:31 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0

On 30/06/16 02:52, Allan Chandler wrote:
> Good arbitrary-time-of-day, people.
> 
> I helped a colleague out today with a "wc" problem they were having with line 
> counts when the final line of a file did not have a newline at the end of it.
> 
> Now this is technically not a bug since the doco explicitly states that "wc 
> --lines/-l" gives the count of newline characters, not the count of lines. 
> And, in any case, it could be argued that the definition of a line SHOULD be 
> "zero or more characters followed by a newline".
> 
> However, this has caused confusion before in that a non-terminated final line 
> COULD be considered a line, especially if you're just outputting the file.
> 
> I don't propose changing the behaviour of "--lines" since that would result 
> in chaos for a large number of scripts in the world currently using it, and I 
> don't wish to spend the rest of my life fighting off affected parties, 
> Omega-Man-against-the-zombies style, because of the trouble I caused :-)
> 
> However, I wonder whether it would be worthwhile adding another option which 
> included a final non-terminated line, something like "--lines-all".
> 
> I've seen some "wc" suggestions turned down in the past 
> (https://www.gnu.org/software/coreutils/rejected_requests.html) but these 
> seem to generally be requests for things that other tools are better to 
> provide.
> 
> Keeping in mind the philosophy of UNIX's "a tool should do one thing and do 
> it well", and the fact that the purpose of "wc" is most definitely counting 
> things, it appears it may be a better fit in the "wc" program itself rather 
> than doing it as part of a pipeline.
> 
> Anyway, I'm really just raising it as a discussion point. Tell me what you 
> think...

Maybe.

Note that one of the reasons wc -l doesn't count a final line lacking
a terminating \n is so that counts stay accurate across split files,
for example.
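
A minimal sketch of that point, assuming head -c and tail -c are
available (as in GNU coreutils); the temporary file names are made up
for illustration. Cutting a file mid-line and summing the per-piece
newline counts gives the same total as the whole file:

```shell
# Newline counts are additive across byte-wise pieces of a file.
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$tmpdir/whole"   # 14 bytes, 3 newlines

# Cut mid-line: part1 is "one\nt", part2 is "wo\nthree\n".
head -c 5 "$tmpdir/whole" > "$tmpdir/part1"
tail -c +6 "$tmpdir/whole" > "$tmpdir/part2"

whole=$(wc -l < "$tmpdir/whole")
part1=$(wc -l < "$tmpdir/part1")
part2=$(wc -l < "$tmpdir/part2")

# part1 ends in an unterminated "t"; counting it as a line would
# make the pieces sum to one more than the whole.
echo "whole=$whole sum_of_parts=$((part1 + part2))"
rm -rf "$tmpdir"
```

If the trailing "t" in the first piece were counted as a line, the
pieces would sum to 4 rather than 3.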

If we were to add an option, it would be a flag-type option
rather than one selecting a different mode.
But it mightn't be too much overhead to pre-process the data?
I.e., something like:

  wc-all-lines() { sed '$a\' | wc -l; }
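
A quick way to try the function above, as a sketch only. It assumes
GNU sed, where '$a\' adds a trailing newline only when the last line
lacks one, and it renames the function to wc_all_lines (an
illustrative change) so it can also be defined in shells that reject
hyphens in function names:

```shell
# Assumes GNU sed; other sed implementations may treat '$a\' differently.
wc_all_lines() { sed '$a\' | wc -l; }

printf 'a\nb'   | wc -l          # plain wc -l: final unterminated line not counted
printf 'a\nb'   | wc_all_lines   # final unterminated line is counted
printf 'a\nb\n' | wc_all_lines   # already terminated input: count unchanged
printf ''       | wc_all_lines   # empty input: sed emits nothing, so zero
```

The first command reports 1 and the next two report 2, since sed
supplies the missing newline before wc sees the data.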

cheers,
Pádraig


