
bug#23073: wc reports wrong byte counts when using '--from-files0=-'


From: Bernhard Voelker
Subject: bug#23073: wc reports wrong byte counts when using '--from-files0=-'
Date: Mon, 21 Mar 2016 18:20:25 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0

On 03/21/2016 04:16 PM, Pádraig Brady wrote:
> On 21/03/16 00:59, William R. Fraser wrote:
>> When wc gets its list of files by reading from stdin, using the argument
>> '--files0-from=-', it reuses the same fstatus struct for each file.
>>
>> The problem is that the 'wc' function checks the 'failed' member of this
>> struct and if it is <=0, it skips doing fstat on the file. The main loop
>> doesn't reset this value between files, so only the first file has fstat
>> done on it.
>>
>> This can result in the 'wc' function seeking past the end of
>> subsequent files and then over-reporting their byte counts.
>>
>> See the attached patch, which resets the fstatus struct in between files
>> when reading the file list from stdin.
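
A minimal sketch of the pattern described above, for illustration only. It is
not the wc source: 'struct file_status', 'cached_size' and 'sizes_seen' are
hypothetical names, and a plain 'size' field stands in for the struct stat
that fstat would fill in. Only the 'failed' convention (<=0 means "already
stat'ed") is taken from the report.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for wc's per-file status struct. */
struct file_status {
    int failed;   /* > 0: not stat'ed yet; <= 0: 'size' is trusted as-is */
    long size;    /* cached size, stands in for st_size */
};

/* Mimics the check in the report: "stat" the file only when failed > 0. */
static long cached_size(struct file_status *fs, long actual_size)
{
    if (0 < fs->failed) {
        fs->size = actual_size;   /* stands in for a successful fstat() */
        fs->failed = 0;
    }
    return fs->size;
}

/* Push n files through ONE shared status struct, as in the stdin case.
   reset_between_files = 0 reproduces the bug, 1 models the proposed fix. */
static void sizes_seen(const long *actual, size_t n,
                       int reset_between_files, long *seen)
{
    struct file_status fs = { 1, 0 };

    assert(actual != NULL && seen != NULL);
    for (size_t i = 0; i < n; i++) {
        if (reset_between_files)
            fs.failed = 1;        /* force a fresh "stat" for this file */
        seen[i] = cached_size(&fs, actual[i]);
    }
}
```

Without the reset, every file after the first is treated as if it had the
first file's size; with the reset, each file gets its own size.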

> Ouch. This seems to have been present since v7.0-96-gc2e56e0.
> It would also mean a lot of redundant reading
> if the initial file was significantly smaller than any other file.
>
> $ truncate -s1G wc.big
> $ touch wc.small
> $ printf '%s\0' wc.big wc.small | wc -c --files0-from=-
> 1073741824 wc.big
> 1073741760 wc.small
> 2147483584 total
>
> We'll submit a full patch in your name.

Interestingly enough, there seems to be a size threshold that triggers the bug:

$ touch wc.small

$ seq 1000 > wc.big
$ printf '%s\0' wc.big wc.small | wc -c --files0-from=-
3893 wc.big
0 wc.small
3893 total

$ seq 10000 > wc.big
$ printf '%s\0' wc.big wc.small | wc -c --files0-from=-
48894 wc.big
45067 wc.small
93961 total

That's why I couldn't reproduce it this morning.
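
One possible explanation for the threshold, sketched under two assumptions
that are not stated in this thread: that 'wc -c' seeks to
st_size - st_size % (blksize + 1) before reading the remaining bytes, and that
st_blksize is 4096 on the test system. The helper names below are
hypothetical. The assumptions are at least consistent with all three outputs
quoted in this thread.

```c
#include <assert.h>

/* Assumed seek target: the cached size rounded down to a multiple
   of (blksize + 1). A result of 0 means no seek, the file is just read. */
static long seek_target(long cached_size, long blksize)
{
    assert(blksize > 0);
    return cached_size - cached_size % (blksize + 1);
}

/* Byte count reported for a file of actual_size bytes when the seek
   target is computed from a stale cached_size (the previous file's). */
static long reported_bytes(long cached_size, long actual_size, long blksize)
{
    long hi_pos = seek_target(cached_size, blksize);
    /* Bytes that can still be read after seeking to hi_pos;
       none if hi_pos is already past the real end of the file. */
    long tail = actual_size > hi_pos ? actual_size - hi_pos : 0;

    return hi_pos + tail;
}
```

With blksize 4096, a stale size of 3893 gives a seek target of 0, so the empty
file is counted correctly; 48894 gives 45067 and 1073741824 gives 1073741760,
the same figures wc printed for wc.small. Under these assumptions the bug
would only show once the first file is at least 4097 bytes long.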

Have a nice day,
Berny




