Re: inplace stream editing with tee --quiet --overwrite

From: Jim Meyering
Subject: Re: inplace stream editing with tee --quiet --overwrite
Date: 24 May 2001 07:56:45 +0200
User-agent: Gnus/5.090003 (Oort Gnus v0.03) Emacs/21.0.104

Roman Czyborra <address@hidden> wrote:

| Dear fellow GNUzees,
| how do you shorten (shuffle up) a huge logfile or mailbox
| without losing the latest appends or the inode hardlinks?
| There are several approaches to inplace editing that involve temporary
| files.  "cp -p A B && tail B > A && rm B" reuses A's inode but
| requires enough disk space for two copies of the unshortened A and
| desires A.lock to prevent lost appends during the lengthy copy
| operation.  The more appending "tail A > B && mv B A" approach leaves
| anybody attached to the old inode in the rain and requires you to copy
| the file permissions.  Perl -i will do this for you but is even
| quicker with its rename and basically does a "mv A B && tail B > A"
| interspersing any incoming "date >> A" during the tail copy.  Example:
|  flushlines=`grep -n '^From ' $mbox | tail -$count | head -1 | cut -d: -f1`
|  test "$flushlines" -gt 0 && perl -ne $flushlines'<$.||print' -i $mbox
| So why don't we stream-edit such large files in place?  Unix files can
| be opened for simultaneous reading and writing with O_RDWR alias "r+"
| in fopen() <stdio.h> or "+<" in perlfunc open but unavailable in bash
| whose <> equals "w+" with O_TRUNC.  Proof of concept:
|  into='open(A,"+<".shift);while(<>){print A};truncate(A,tell A)'
|  test "$flushlines" -gt 0 && tail -$flushlines $mbox | perl -e "$into" $mbox
| This process fills A from the beginning without touching the end until
| A is processed completely.  There is only a minimal time slot between
| the read EOF and the written truncate susceptible to data losses.
| No extra inodes are put on disk nor temporary disk space needed.
| But I find both Perl and the one-liner too big for such a basic task
| and would prefer to abbreviate this into
|  tail -$flushlines $mbox | tee -qo $mbox
| Why so?  I found tee the simplest of all existing commands to redirect
| output into named files.  I found that I often don't need the cat-like
| extra standard output produced by tee and just bear it because
| tee file is easier to type than tee >file or tee file >/dev/null and
| therefore I suggest a new option tee -q file that is quiet on stdout.
| Furthermore I found that tee -a $mbox appends instead of overwriting
| and plain tee $mbox truncates $mbox before it was read.  Just like the
| sort file > file truncation dilemma is solved by sort file -o file I
| would love to get the nonsorting general-purpose GNU tee to overwrite
| with delayed truncation and therefore suggest the following patch


I like the new options.
Would you please make the following changes?

  - remove the short option names -o and -q.  They might conflict with short
      options in another version of tee or in some future standards spec.
      To use these new features, people will have to use the long options,
      --overwrite or --o.

      I.e., replace `'o'' in the long_options initializer list with
      OVERWRITE_OPTION, where OVERWRITE_OPTION is defined as a value
      that does not match any short option character.

  - include diffs to doc/sh-utils.texi that describe the new options and
      give an example (I like the one above) showing how they're useful

  - fail if --overwrite is used but no file is specified,
      and add a line under usage()'s Usage: to reflect this, i.e.:

    Usage: tee [OPTION]... [FILE]...
      or:  tee [OPTION]... --overwrite FILE...
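
The first and third requests above might look roughly like this in C.
It is a sketch, assuming the usual getopt_long idiom of giving
long-only options a value just past CHAR_MAX.  The names QUIET_OPTION
and parse_tee_options are hypothetical, and the option table lists
only tee's existing --append and --ignore-interrupts next to the two
new options.

```c
#include <getopt.h>
#include <limits.h>
#include <stddef.h>

/* Long options with no short equivalent get values that cannot
   collide with any option character.  QUIET_OPTION is an assumed
   name; only OVERWRITE_OPTION appears in the message.  */
enum
{
  OVERWRITE_OPTION = CHAR_MAX + 1,
  QUIET_OPTION
};

static struct option const long_options[] =
{
  {"append", no_argument, NULL, 'a'},
  {"ignore-interrupts", no_argument, NULL, 'i'},
  {"overwrite", no_argument, NULL, OVERWRITE_OPTION},
  {"quiet", no_argument, NULL, QUIET_OPTION},
  {NULL, 0, NULL, 0}
};

/* Hypothetical helper: parse the command line, set *OVERWRITE and
   *QUIET, and return 0 on success or -1 on a usage error.  */
int
parse_tee_options (int argc, char **argv, int *overwrite, int *quiet)
{
  int c;

  *overwrite = 0;
  *quiet = 0;
  optind = 1;   /* restart scanning so the function can be called again */

  /* Only "ai" are accepted as short options, so -o and -q are
     rejected as invalid.  */
  while ((c = getopt_long (argc, argv, "ai", long_options, NULL)) != -1)
    {
      switch (c)
        {
        case 'a':
        case 'i':
          break;
        case OVERWRITE_OPTION:
          *overwrite = 1;
          break;
        case QUIET_OPTION:
          *quiet = 1;
          break;
        default:
          return -1;
        }
    }

  /* --overwrite makes no sense without at least one FILE.  */
  if (*overwrite && optind == argc)
    return -1;

  return 0;
}
```

Because getopt_long accepts any unambiguous prefix of a long option
name, --o selects --overwrite with this table.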

Please make your changes relative to the latest test release
and send `--unidiff' style diffs -- again to address@hidden


| *** sh-utils-2.0/src/tee.c    1999-07-26 09:09:42+02  2.0
| --- sh-utils-2.0/src/tee.c    2001-05-17 03:56:12+02
| *************** static int append;
