Re: tee: add --remove-cr option


From: Nikos Papaspyrou
Subject: Re: tee: add --remove-cr option
Date: Wed, 9 Feb 2022 23:59:42 +0100

On Wed, Feb 9, 2022 at 9:43 PM Bernhard Voelker
<mail@bernhard-voelker.de> wrote:
>
> On 2/9/22 20:14, Nikos Papaspyrou wrote:
> > [...] I want the input
> > (or the contents of FILE) to go to the standard output untouched. The
> > filtered input should only go to FILE1 and FILE2.
>
> So the original example is possible with the process substitution you already
> mentioned, e.g. with tr(1):
>
>   $ dd if=/dev/random of=/dev/null bs=1M count=1000 status=progress \
>       |& tee >(tr '\r' '\n' > LOG)
>
> or with sed(1):
>
>   $ dd if=/dev/random of=/dev/null bs=1M count=1000 status=progress \
>       |& tee >( sed 's/\r/\n/g' > LOG)

That's right, I already wrote that. The substitution you're suggesting
(both tr and sed) does not achieve the purpose I described, which makes
me think you (and Rob) found the description in my first post confusing.

Let me explain once again, in a different way this time. Suppose your
input is a text with two kinds of "lines": some ending with '\n' and
some ending with '\r'. When such a text is printed to a terminal, the
lines ending with '\r' disappear, because each one is overwritten by
whatever is printed next. If the text is produced and printed slowly,
however, this behaviour implements a progress meter, which is useful
and appears a lot in practice.
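To make this concrete, here is a small sketch of such an input. The
function name progress_demo, its messages and its optional delay
argument are made up for illustration; any program that prints a meter
with '\r' behaves the same way:

```shell
#!/usr/bin/env bash
# progress_demo: hypothetical generator of the kind of input described above.
# It prints four "lines" ending with '\r' (the meter) and one ending with '\n'.
# Optional first argument: delay in seconds between updates (default 0.2).
progress_demo() {
  local delay=${1:-0.2}
  local pct
  for pct in 25 50 75 100; do
    # '\r' moves the cursor back to column 0, so the next write overwrites this one.
    printf 'copied %3d%%\r' "$pct"
    sleep "$delay"
  done
  # Only this '\n'-terminated line stays visible on a terminal.
  printf 'copy finished\n'
}
```

Running progress_demo on a terminal shows the counter updating in
place. Running "progress_demo | tee LOG" and then "od -c LOG" shows that
all five "lines", '\r' bytes included, end up in the file.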

Suppose we want to do two things with this input: (1) send it to the
standard output, as it is produced, unfiltered because we would like to
see the progress meter, and (2) send it to a number of files, filtering
out the lines ending with '\r' because we don't care to store the
meter's progression in the files. Using tee like this does not achieve
the purpose, because it doesn't do the filtering:

  $ tee FILE1 FILE2 ...

This problem can be broken into two parts: finding the right filter and
convincing tee to apply the filter as desired.

One solution for the first part is this:

  $ alias remove-cr="sed 's/\r/@REMOVE\n/g' | sed '/@REMOVE$/d'"
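Since bash does not expand aliases in non-interactive shells unless
expand_aliases is set, a shell function may be more convenient inside a
script. The following is only a sketch of the same filter; the name
remove_cr is my choice here, and GNU sed is assumed for the '\r' and
'\n' escapes:

```shell
#!/usr/bin/env bash
# remove_cr: hypothetical function form of the alias above.
# Assumes GNU sed, which understands \r in the pattern and \n in the replacement.
remove_cr() {
  # Step 1: replace every CR with a marker plus newline, so each CR-terminated
  #         segment becomes a line of its own that ends in @REMOVE.
  # Step 2: delete the marked lines; only newline-terminated text survives.
  sed 's/\r/@REMOVE\n/g' | sed '/@REMOVE$/d'
}
```

For example, "printf 'a\rb\rc\nd\n' | remove_cr" keeps only the lines
"c" and "d". Two caveats of this marker approach: a genuine line that
happens to end in "@REMOVE" is deleted as well, and text with CRLF line
endings loses its content, leaving empty lines behind.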

The second part can indeed be solved with process substitution. If
it's just one file, that's easy:

  $ tee >(remove-cr > FILE)

If there are multiple files, the obvious generalization is:

  $ tee >(remove-cr > FILE1) >(remove-cr > FILE2) ...

This works, but it applies the filter independently N times to the same
input if there are N files. It's possible to run the filter only once,
at the cost of a slightly more complicated command:

  $ tee >(remove-cr | tee FILE1 FILE2 ... > /dev/null)

So, in conclusion, what the proposed option does can indeed be
achieved by a fairly involved construction of pipelines and process
substitution, which can all be wrapped up in a script that behaves
exactly like my tee --remove-cr.
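A rough sketch of what such a wrapper could look like, assuming bash
(for process substitution) and GNU sed and tee. The name tee_remove_cr
is made up for this example and is not part of coreutils:

```shell
#!/usr/bin/env bash
# tee_remove_cr: hypothetical wrapper, not a coreutils command.
# Copies stdin to stdout untouched and writes a CR-filtered copy to every
# file named as an argument. Assumes bash and GNU sed.
tee_remove_cr() {
  # Outer tee: unfiltered copy to stdout, second copy into the filter.
  # Filter: mark CR-terminated segments, delete them, then let an inner
  # tee write the result to all files in one pass, discarding its stdout.
  tee >(sed 's/\r/@REMOVE\n/g' | sed '/@REMOVE$/d' | tee -- "$@" > /dev/null)
}
```

With the function defined in the current shell, the dd example quoted
above would become "dd ... status=progress |& tee_remove_cr LOG1 LOG2".
One thing to keep in mind is that bash runs the process substitution
asynchronously, so the files may be completed slightly after the
command itself returns.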

-- 
Nikolaos Papaspyrou
Software Engineer
nikolaos@google.com


