coreutils

From: Bob Proulx
Subject: Re: [head] wished an option to continue consuming the input after the specified number of lines has been read
Date: Wed, 17 Oct 2012 12:42:04 -0600
User-agent: Mutt/1.5.21 (2010-09-15)

Thibault LE PAUL wrote:
>Bob Proulx wrote:
> > Depending upon what you want to do I would do something like this
> > using sed to do the difference to either part.
> >
> >   $ seq 1 10 | sed '1,3s/^/head /;7,10s/^/tail /'
> >   $ seq 1 10 | sed -n '1,3s/^/head /p;7,9s/^/tail /p'
> >   $ seq 1 10 | awk 'NR<=3{print "head ",$0} NR>7{print "tail ",$0}'
>
>It's difficult that way without knowing a priori the line numbers,
>if you want the tail -n2 equivalent. I suppose that tail is using
>something like a rotating line buffer.

Ah...  Yes.  That is very true and a very compelling point.  I hadn't
considered that part of the problem.  Like directions that say "Turn
just before you can see the barn."
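
For what it is worth, the rotating line buffer idea can be sketched in
awk so that the head and tail parts come out of a single pass, without
knowing the line count in advance.  This is only a sketch under
assumptions: head_tail is a made-up helper name (not a coreutils
command), the tail count must be at least 1, and lines already printed
as head are not repeated as tail, which differs from a plain tail -n
on short input.

```shell
# head_tail H T: label the first H and the last T lines of stdin.
# Hypothetical helper for illustration; assumes T >= 1.
head_tail() {
    awk -v h="$1" -v t="$2" '
        NR <= h { print "head " $0; next }   # first h lines go straight out
        { buf[NR % t] = $0 }                 # rotating buffer of the last t lines
        END {
            start = NR - t + 1
            if (start <= h) start = h + 1    # skip lines already printed as head
            for (i = start; i <= NR; i++) print "tail " buf[i % t]
        }'
}

seq 1 10 | head_tail 3 2
```

Only t lines are held in memory at any time, so nothing is written to
disk and the size of the input does not matter.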

> > And I could see that tee exited due to the write on the fifo finishing
> > before the write to stdout and so the tail did not get all of the
> > file.  I consider that a normal behavior.  Yes, reading all of the
> > input and discarding it in the head process will allow tee to write
> > all of the output.  But that is a lot of extra data writing that is
> > wasteful and unused and simply thrown away and therefore I would avoid
> > doing it that way.
>
> I agree. However, if you use temporary file storage, it's worse.

I agree about the temporary storage.  But I haven't been suggesting
using any temporary storage.  At least not yet. :-)

> You write it to disk instead of socket, and you read it from disk
> instead of socket. Even if you write twice on socket instead of
> once to disk, I think it's better. Beyond the performance point,
> you may not want to rely on storage that you are not sure will be
> available: you don't know how big your input is. By contrast, you
> can be sure that CPU will be available.

Good point.

> > Also since the background process is asynchronous the order of emitted
> > output isn't specified.  It is possible that the background process
> > would be scheduled later (kernel process scheduling) and then the
> > output of the two processes might appear in a different order.  It is
> > tickling a lot of possible problems.  Best to avoid those entirely.
>
> Either :
> 1) add wait :
>
> cat /tmp/fifo1|(head -n2;cat>/dev/null)|sed 's/^/head&/'&
> cat /usr/share/mysql/errmsg-utf8.txt|tee /tmp/fifo1|tail -n2|(wait;sed 's/^/tail&/')

Very clever.

> Ooops ! It works but it may go into deadlock. The waited event
> should be the end of head, not the end of cat.

Yes.  An example of the types of complications that can occur.  But if
those are needed then they can be done.

> 2) be independent of scheduling :
>
> cat /tmp/fifo1|(head -n2;cat>/dev/null)>  /tmp/head&
> cat /usr/share/mysql/errmsg-utf8.txt|tee /tmp/fifo1|tail -n2>  /tmp/tail

Obviously this has the problems already discussed.  I prefer using
wait to join the forked process flow.
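
A rough sketch of what joining with wait might look like, so that the
output order no longer depends on scheduling.  Assumptions: the helper
name head_and_tail and the "head "/"tail " labels are made up for
illustration, mktemp -d is available, and the tail part is small
enough to hold in a shell variable (it is at most the requested number
of lines).  The background reader still drains the rest of the FIFO
after head exits so that tee can finish feeding tail.

```shell
# head_and_tail N M: print the first N and the last M lines of stdin.
# Hypothetical helper for illustration only.
head_and_tail() {
    dir=$(mktemp -d) || return 1
    fifo=$dir/fifo
    mkfifo "$fifo" || { rm -rf "$dir"; return 1; }

    # Background reader: keep the first N lines, then discard the rest
    # so that tee never writes to a FIFO with no reader.
    { head -n "$1"; cat >/dev/null; } <"$fifo" | sed 's/^/head /' &

    # Foreground: tee copies stdin to the FIFO and to tail.  The tail
    # part is held back in a variable until the reader has finished.
    tail_part=$(tee "$fifo" | tail -n "$2" | sed 's/^/tail /')

    wait                      # join the background pipeline first
    rm -rf "$dir"
    [ -z "$tail_part" ] || printf '%s\n' "$tail_part"
}

seq 1 10 | head_and_tail 2 2
```

The head lines are written before wait returns and the tail lines are
printed only afterwards, so the two parts cannot be interleaved.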

Bob


