bug-textutils

Re: cut -c -b (UNCLASSIFIED)


From: Bob Proulx
Subject: Re: cut -c -b (UNCLASSIFIED)
Date: Sat, 31 Jul 2004 13:07:20 -0600
User-agent: Mutt/1.3.28i

Please keep followups to the mailing list.

Kirby, Jason B Mr NISA-DC/RABA Technologies wrote:
> Thank you for the perl solution. 
> perl -ne 'm/\(\s*(\d+)\s*:\s*(\d+)\s*\)\s+\S+\s+(\S+)/;print "$1:$2:$3\n";'
> It takes a bit less space than the solution I settled on before receipt of
> yours---perl is a nice language.
> 
> cut -c15-18,19-22,27-33 | sed s/\ da/:da/ | sed s/:\ /:/ | sed s/:\ /:/ |
> sed s/\(\ // | sed s/\(// 
> I don't want to say how long it took me to get that working. I'm not that
> good at programming and better programmers than I would have had it working
> in less time. 

What I don't like about the solution is that if the text you are
processing moves even one character left or right then the cut won't
do what you want it to do.  I call that "house of cards" programming.
(I realize you don't like it either and were looking for a better
solution.)  So right then and there I think cut is the wrong tool for
this task.

> cut -c17-18,20-22,28-33 -d:        
> cut -c17-18,20-22,28-33 -d\t
> Would have taken me two minutes to get working instead of time expended on
> sed, perl, ruby, python solutions, not to mention future maintainers would
> have less logic to fight their way through. Would you rather write perl or
> simply use the above?
>
> One more:
> cut -c17-18,20-22,28-33 -d: -s   
> Where -s signifies to get rid of any white space that occurs in the input
> columns selected.

I would rather parse out the data using perl.  Doing it that way is
much more tolerant of input format than cut by columns.  In this case
a regular expression was sufficient.  Sometimes you would have to do
more strict "parsing" of the input and a regular expression can't
always do that.  So yes, I think the perl solution is better than cut.

> UNIX Programmers appreciate UNIX because they value the "powerful tools"
> UNIX philosophy. Imagine grep without the -v and the -w switch, and all the

Your comment forced me to look up what 'grep -w' does.  It is not in
the POSIX standard for grep, and I have never used it before.

Just to prove I am a pedantic extreme case, I use sed '/pattern/d'
instead of grep -v.  In a script the return code I want is whether
the command ran successfully, not whether any lines were selected.
Therefore unless I am actually trying to determine if a pattern
matched I use sed.  In these cases it is more correct because I do
check the return code of my commands.

  if ! sed '/pattern/d' infile > outfile; then
    echo Error: ... error message about running sed on file ... 1>&2
    exit 1
  fi

If the disk has filled up then the above will detect that and show the
failure.  A simple grep -v pattern will not since the return code is
overloaded with whether it matched the pattern or not.
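
A small sketch of the return code difference, assuming a made-up
input file in which every line matches the pattern.  The temporary
directory and file names are only for illustration.

```shell
#!/bin/sh
# Hypothetical input in which every line matches the pattern.
tmpdir=$(mktemp -d) || exit 1
printf 'pattern one\npattern two\n' > "$tmpdir/infile"

# grep -v selects no lines here, so it exits 1 although nothing failed.
grep_status=0
grep -v 'pattern' "$tmpdir/infile" > "$tmpdir/out.grep" || grep_status=$?

# sed deletes the same lines and exits 0 because it ran successfully.
sed_status=0
sed '/pattern/d' "$tmpdir/infile" > "$tmpdir/out.sed" || sed_status=$?

echo "grep -v exit status: $grep_status"
echo "sed exit status: $sed_status"
rm -rf "$tmpdir"
```

POSIX does reserve exit statuses greater than 1 for grep errors, so
a script could test for that explicitly, but a plain "if ! grep -v"
treats "no lines selected" and "failed" alike.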

> force, and loss of efficiency required on the part of every script writer or
> programmer to force what would be those switches in written code.  I was
> offering this suggestion for enhancement in the same spirit that someone
> might have suggested something for mature grep that early grep didn't have.

Sorry if my tone was harsh.  I did not mean it to be.  I just
disagreed with the use of cut in this task of yours and stated my case
against it.

> I thought if anyone were to be responsive to user ideas for enhancement of a
> tool it would be guys who read the email at the address given on the man
> page.

And your suggestions are appreciated.  In general features in programs
start out just as you have done.  Suggest them to the mailing list and
discuss them there.  Or implement them and see how they work.  Don't
let me turn you off from posting suggestions and comments in the
future.  Please continue that.  But don't expect everyone (me!) to
agree with them either.  There are many readers of the mailing list.
I was just the first one to comment so far upon your suggestion.
  
The coreutils[1] are the core utilities of the system.  There is a
need to keep them light and efficient.  Already they are too big and
heavy for many people's taste.  See the busybox project, which saw
the need for a replacement for just that reason.

Bob

[1] The coreutils package is the merger of fileutils, sh-utils, and
    textutils.  You might consider upgrading to it.  Since you
    posted to address@hidden I know you are using an old
    version.

    http://www.gnu.org/software/coreutils/



