coreutils
From: Pádraig Brady
Subject: Re: potential feature addition to coreutils' sort.c: print at most N lines
Date: Mon, 04 Mar 2013 00:47:33 +0000

On 03/03/2013 05:32 PM, James Dowdell wrote:
> I'm considering writing a patch for sort.c to add a new feature, related to a 
> stackoverflow inquiry I wrote 
> (http://stackoverflow.com/questions/14882897/what-standard-commands-can-i-use-to-print-just-the-first-few-lines-of-sorted-out).
> 
> This would be my first patch, and this is my first time messaging a gnu list; 
> apologies if I'm "doing it wrong."
> 
> I use GNU sort a lot, and routinely find myself in the situation of 
> executing, e.g.:
> 
> $ sort ... | head -n 1000
> 
> This can be needlessly slow when the input is huge, because sort does 
> a lot of work that head throws away.
> 
> I propose a new parameter, "-H, --head=NLINES", which makes sort print at 
> most NLINES lines of output.  Rather than acting as a filter at the end 
> like | head, it would avoid the work of fully ordering lines beyond the 
> first NLINES.
> 
> I want to know the procedure for submitting a patch, and the likelihood that 
> such a patch would even be considered, before I spend time to parse the whole 
> sort.c file and propose a complete and rigorous solution (which would be 
> analogous to submitting the patch).  From a quick glance at the source, my 
> current strategy would be to alter the merge nodes when this parameter is set 
> so that the number of lines listed per node is clamped to NLINES.  While less 
> efficient than an ideal solution, it would be more efficient than what's 
> currently in place, and has the benefits of minimal code edits and negligible 
> negative performance impact on mainstream use when the parameter is not 
> passed.
> 
> All feedback welcome, thank you.

There is general agreement that this is worthwhile.
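
To make the potential saving concrete, here is a minimal sketch of the
general idea in Python.  It is not taken from sort.c and it is not the
merge-node clamping described above; it swaps in a plain bounded
selection, using the standard library's heapq.nsmallest.  The function
name head_sorted and its parameters are hypothetical, chosen only for
illustration.  Lines are compared as Python strings (by code point), so
locale-aware ordering and sort's key options are out of scope here.

```python
import heapq
from typing import Iterable, List


def head_sorted(lines: Iterable[str], nlines: int) -> List[str]:
    """Return the first `nlines` lines that a full sort would output.

    Hypothetical helper for illustration only; GNU sort has no such
    function. Only `nlines` candidates are kept at any time, so memory
    use is bounded by `nlines` rather than by the input size.
    """
    if nlines <= 0:
        return []
    # heapq.nsmallest gives the same result as sorted(lines)[:nlines],
    # but does not need to order the lines that are discarded.
    return heapq.nsmallest(nlines, lines)
```

For M input lines and N requested lines, this kind of selection needs
on the order of M log N comparisons, instead of the M log M of a full
sort followed by head.  That gap is what makes the feature attractive
when N is small and M is huge.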

Please read these first:
  http://lists.gnu.org/archive/html/bug-coreutils/2004-04/msg00157.html
  http://lists.gnu.org/archive/html/bug-coreutils/2009-07/msg00019.html

As for contributing the patch, it would be much appreciated.

For contribution details, please see the HACKING file:
http://git.sv.gnu.org/cgit/coreutils.git/plain/HACKING

In summary, you would submit a patch against the latest git tree
to address@hidden.  Also, for a patch of this significance,
you would need to follow the copyright assignment procedure.

thanks!
Pádraig.


