From: Pádraig Brady
Subject: bug#23073: wc reports wrong byte counts when using '--from-files0=-'
Date: Mon, 21 Mar 2016 15:16:35 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0
On 21/03/16 00:59, William R. Fraser wrote:
When wc gets its list of files by reading from stdin, using the argument '--files0-from=-', it reuses the same fstatus struct for each file. The problem is that the 'wc' function checks the 'failed' member of this struct and if it is <= 0, it skips doing fstat on the file. The main loop doesn't reset this value between files, so only the first file has fstat done on it. This can result in the 'wc' function seeking past the end of subsequent files and then over-reporting their byte counts. See the attached patch, which resets the fstatus struct in between files when reading the file list from stdin.
Ouch. This seems to be since v7.0-96-gc2e56e0

It would also mean there would be a lot of redundant reading
if the initial file was significantly smaller than any other file.

  $ truncate -s1G wc.big
  $ touch wc.small
  $ printf '%s\0' wc.big wc.small | wc -c --files0-from=-
  1073741824 wc.big
  1073741760 wc.small
  2147483584 total

We'll submit a full patch in your name.

thanks!
Pádraig.
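
The reproduction above can be turned into a self-checking script. What follows is only a minimal sketch, not the test that accompanies the actual patch. It assumes GNU coreutils (truncate, stat -c and wc --files0-from); the temporary directory, the 1M file size and the variable names are illustrative choices. It compares the per-file counts that wc prints against the sizes that stat reports.

```shell
#!/bin/sh
# Sketch only: assumes GNU coreutils (truncate, stat -c, wc --files0-from).
# The directory layout and variable names are illustrative, not from the patch.

dir=$(mktemp -d)

truncate -s 1M "$dir/wc.big"   # sparse file of 1048576 bytes
: > "$dir/wc.small"            # empty file

# Counts as wc reports them when the file list arrives on stdin.
# awk normalizes any column padding to "COUNT NAME".
reported=$(cd "$dir" &&
  printf '%s\0' wc.big wc.small |
  wc -c --files0-from=- |
  awk '{print $1, $2}')

# Expected output, built from the sizes that stat reports.
big=$(stat -c %s "$dir/wc.big")
small=$(stat -c %s "$dir/wc.small")
expected=$(printf '%s wc.big\n%s wc.small\n%s total' \
  "$big" "$small" "$((big + small))")

if [ "$reported" = "$expected" ]; then
  result=ok
else
  result=mismatch
fi
echo "$result"

rm -rf "$dir"
```

On a wc affected by this bug the script should print "mismatch", because the count for wc.small is derived from the stale size of wc.big. On a wc that resets the fstatus struct between files it should print "ok".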