bug-coreutils

bug#7401: [PATCH] split: --chunks option


From: Jim Meyering
Subject: bug#7401: [PATCH] split: --chunks option
Date: Thu, 18 Nov 2010 07:12:14 +0100

Pádraig Brady wrote:
> On 05/02/10 12:40, Pádraig Brady wrote:
>> I got a bit of time for the review last night...
...
>> Here is what I intend to do before checking in:
>>  s/pread()/dd::skip()/ or at least add pread to bootstrap.conf
>>  fix info docs for reworked interface
>>  try to refactor duplicated code
>
> Attached is the finished split --number review.
> It has these additional changes:
>
> change all chunk function param names from n/tot to k/n
> report -n 1/3/2 as an error with "3" not "2"
> allow num chunks > file size, with empty files created for the rest
> don't restrict chunk extraction to stdout to size_t limits
> fix chunk function param types to uintmax_t, off_t, size_t etc.
> reorganise limit checks so there are no integer overflows
> rewrite ofd_check
>   rename to ofile_open
>   fix off-by-one errors
>   when rotating files, append rather than truncate
>   only try to rotate open files for E[NM]FILE errors
>   if we run out of open files, give ENFILE or EMFILE as appropriate
>   if we run out of open files, close all files and resort
>    to open, write, close, as we'll effectively be doing that anyway.
> rewrite lines_chunk_split to use the same style as the other funcs
>   fix off-by-one errors
>   fix chunk_size calculation
> merged lines_chunk_extract() into lines_chunk_split().
> merged lines_rr_extract() into lines_rr().
> don't use pread as it resets the offset so only valid if called once
>  also it's not available on some systems
> support input offsets in the chunk extraction modes, i.e.
>   (dd skip=1 count=0; split ...) < file
> For all -n modes, create all files if any data read
>   in case there are existing files,
>   and to signal any waiting fifo consumers
> Handle file size increases and decreases in {bytes,lines}_chunk_*
>   don't loop endlessly if file is truncated while reading
>   don't include buffer slop in output if file grows while reading
> added an --unbuffered option for use in the round robin cases
> added an --elide-empty-files option to suppress empty files
> Also use safe_read rather than full_read in round robin cases
>   so we can immediately copy input to output
> Beefed up the tests a lot
> Explained what each mode does exactly in the info docs
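
To make the K/N chunk extraction in the list above concrete, here is a rough shell sketch of the byte range each chunk might cover. It is an illustration only, not the code from the patch: chunk_range is a hypothetical helper, and it assumes every chunk is SIZE/N bytes with the last chunk absorbing the remainder.

```shell
# Hypothetical helper, not part of split: print "START END" (END exclusive),
# the byte range chunk K of N would cover in a file of SIZE bytes.
# Assumption: each chunk is SIZE/N bytes; the last chunk takes the remainder.
chunk_range() {
  size=$1 k=$2 n=$3
  # K must lie in 1..N (a request like 3/2 is rejected).
  if [ "$k" -lt 1 ] || [ "$k" -gt "$n" ]; then
    echo "chunk_range: invalid chunk number: $k" >&2
    return 1
  fi
  chunk=$((size / n))
  start=$(( (k - 1) * chunk ))
  if [ "$k" -eq "$n" ]; then
    end=$size            # last chunk runs to end of file
  else
    end=$((k * chunk))
  fi
  echo "$start $end"
}

chunk_range 10 1 3   # prints "0 3"
chunk_range 10 3 3   # prints "6 10"
chunk_range 2 1 3    # prints "0 0": an empty chunk when N exceeds the size
```

Under this assumption, asking for more chunks than there are bytes yields empty ranges for all but the last chunk, which matches the "empty files created for the rest" item.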

Very nice, indeed.  Thanks to both of you.
I've merged this with the latest and tweaked the result
to pass the new syntax-check (all just local -- I'll let Pádraig
push whenever he's comfortable).  clang had one false-positive
and valgrind showed no problem.

Pádraig, you've probably already done this, but I'll include it
anyway, in case it can save you a minute or two:

diff --git a/tests/misc/split-bchunk b/tests/misc/split-bchunk
index 4c79b70..aef450b
--- a/tests/misc/split-bchunk
+++ b/tests/misc/split-bchunk
@@ -17,7 +17,7 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.

 . "${srcdir=.}/init.sh"; path_prepend_ ../src
-test "$VERBOSE" = yes && split --version
+print_ver_ split

 # N can be greater than the file size
 # in which case no data is extracted, or empty files are written
diff --git a/tests/misc/split-lchunk b/tests/misc/split-lchunk
index fff9af5..4c7c20e
--- a/tests/misc/split-lchunk
+++ b/tests/misc/split-lchunk
@@ -17,7 +17,7 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.

 . "${srcdir=.}/init.sh"; path_prepend_ ../src
-test "$VERBOSE" = yes && split --version
+print_ver_ split

 # invalid number of chunks
 echo 'split: 1o: invalid number of chunks' > exp
diff --git a/tests/misc/split-rchunk b/tests/misc/split-rchunk
index 63e1518..3957f73
--- a/tests/misc/split-rchunk
+++ b/tests/misc/split-rchunk
@@ -17,7 +17,7 @@
 # along with this program.  If not, see <http://www.gnu.org/licenses/>.

 . "${srcdir=.}/init.sh"; path_prepend_ ../src
-test "$VERBOSE" = yes && split --version
+print_ver_ split

 require_ulimit_
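
For context on the one-line change in each hunk above, the following is a sketch of what a helper like print_ver_ could look like. It assumes the helper simply generalizes the line it replaces; the real definition lives in the test suite's shared init code and is not shown in this message.

```shell
# Sketch only: print the version of each named program when VERBOSE=yes.
# Assumed behavior, modeled on the replaced line:
#   test "$VERBOSE" = yes && split --version
print_ver_() {
  if test "$VERBOSE" = yes; then
    for i in "$@"; do
      "$i" --version
    done
  fi
}

# Usage, as in the patched tests:
#   print_ver_ split
```

One difference from the replaced line: the if form returns status 0 when VERBOSE is not "yes", whereas the && form returns a nonzero status in that case.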




