
Re: coreutils feature requests?


From: Bernhard Voelker
Subject: Re: coreutils feature requests?
Date: Wed, 19 Jul 2017 23:08:00 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.6.0

On 07/19/2017 07:43 PM, Kaz Kylheku (Coreutils) wrote:
> It is nontrivial code. For instance if we look at how the function
> cut_bytes works in the implementation, what it's doing is simply
> doing a getchar() from the stream, and querying a data structure
> to determine whether the byte should be printed or not.
> (That data structure consists of a pointer which marches through
> field range descriptors in parallel with going through the data.)
> 
> cut_fields is more complicated due to the delimiting of fields,
> but essentially the same overall approach.
> 
> Basically, printing of fields that isn't sorted and de-duplicated
> is a rewrite of all parts of the utility other than command
> line processing and the printing of usage help text.
> 
> It's like two different programs in one, sharing a minimal
> skeleton.

+1

Another point: it is already documented that cut(1) is not
suited to reordering fields:

http://git.sv.gnu.org/cgit/coreutils.git/tree/doc/coreutils.texi?id=545f181f4e#n5938

  Note @command{awk} supports more sophisticated field processing,
  like reordering fields, and handling fields aligned with blank characters.
  By default @command{awk} uses (and discards) runs of blank characters
  to separate fields, and ignores leading and trailing blanks.
  @example
  @verbatim
  awk '{print $2}'      # print the second field
  awk '{print $(NF-1)}' # print the penultimate field
  awk '{print $2,$1}'   # reorder the first two fields
  @end verbatim
  @end example
  Note while @command{cut} accepts field specifications in
  arbitrary order, output is always in the order encountered in the file.

and even more: it suggests using join as a fallback:

  In the unlikely event that @command{awk} is unavailable,
  one can use the @command{join} command, to process blank
  characters as @command{awk} does above.
  @example
  @verbatim
  join -a1 -o 1.2     - /dev/null # print the second field
  join -a1 -o 1.2,1.1 - /dev/null # reorder the first two fields
  @end verbatim
  @end example
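
To illustrate the documented behavior, here is a minimal sketch, assuming a
made-up tab-separated sample line (the data and variable names are only for
illustration). It shows that cut prints the selected fields in input order no
matter how the list is written, while awk prints them in the order named:

```shell
# Hypothetical sample record with three tab-separated fields.
line=$(printf 'one\ttwo\tthree')

# cut ignores the order of the list given to -f:
# both invocations select fields 1 and 3 and print them in input order.
cut_forward=$(printf '%s\n' "$line" | cut -f1,3)
cut_reversed=$(printf '%s\n' "$line" | cut -f3,1)

# awk prints fields in exactly the order they are named.
awk_reordered=$(printf '%s\n' "$line" | awk -F'\t' -v OFS='\t' '{print $3, $1}')

printf '%s\n' "$cut_forward" "$cut_reversed" "$awk_reordered"
```

The first two output lines should be identical, and only the awk line should
have the fields swapped.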

Is this sufficient?

Have a nice day,
Berny


