coreutils

split overwriting already existing files


From: Bernhard Voelker
Subject: split overwriting already existing files
Date: Thu, 03 Jul 2014 08:12:46 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0

While analyzing bug#17904, it occurred to me that split(1) might
do something weird, e.g. delete the "aa" file, when an output
file already exists. Well, split(1) doesn't delete it, but it
does overwrite it:

  $ wc -l file
  25000 file

  $ cp -p file file-newaa

  $ ls -log file*
  total 5864
  -rw-r--r-- 1 2999930 Jul  3 07:47 file
  -rw-r--r-- 1 2999930 Jul  3 07:47 file-newaa

  $ find . -size +1000 -exec ~/coreutils/src/split --verbose -l 10000 {\} {}-new \;
  creating file ‘./file-newaa-newaa’
  creating file ‘./file-newaa-newab’
  creating file ‘./file-newaa-newac’
  creating file ‘./file-newaa’
  creating file ‘./file-newab’
  creating file ‘./file-newac’

find(1) evidently passed "file-newaa" to split(1) first.
The second split(1) run, on "file", then silently overwrote
the already existing "file-newaa"!

  $ ls -log
  total 8796
  -rw-r--r-- 1 2999930 Jul  3 07:47 file
  -rw-r--r-- 1 1194980 Jul  3 07:48 file-newaa
  -rw-r--r-- 1 1194980 Jul  3 07:48 file-newaa-newaa
  -rw-r--r-- 1 1203284 Jul  3 07:48 file-newaa-newab
  -rw-r--r-- 1  601666 Jul  3 07:48 file-newaa-newac
  -rw-r--r-- 1 1203284 Jul  3 07:48 file-newab
  -rw-r--r-- 1  601666 Jul  3 07:48 file-newac
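
The collision follows directly from how the output names are formed:
the prefix plus the default two-letter suffixes "aa", "ab", "ac", ...
A minimal sketch that replays the two runs above. The helper
split_output_names() is hypothetical and only models the naming, not
split's actual code; the "./" that find prepends is left out:

```python
from itertools import islice, product
from string import ascii_lowercase


def split_output_names(prefix, count, suffix_len=2):
    """Return the first `count` output names: prefix + aa, ab, ac, ...

    Hypothetical helper; it only models split's default alphabetic
    suffix scheme, not its implementation.
    """
    suffixes = product(ascii_lowercase, repeat=suffix_len)
    return [prefix + "".join(s) for s in islice(suffixes, count)]


# Files present before find starts, as in the listing above.
existing = {"file", "file-newaa"}

# Same order in which find handed the files to split.
for infile in ("file-newaa", "file"):
    # 25000 lines at 10000 lines per chunk gives 3 output files.
    outputs = split_output_names(infile + "-new", 3)
    clobbered = sorted(existing.intersection(outputs))
    print(infile, "->", outputs, "overwrites:", clobbered)
    existing.update(outputs)
```

The first run collides with nothing; the second run's first output
name is exactly the copy made with cp earlier.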

The Texinfo manual says nothing explicit about overwriting, but
since it consistently says "the output file is created", I would
assume that O_CREAT is used.

This is what POSIX [1] says about the output files:

  [1] http://pubs.opengroup.org/onlinepubs/9699919799/utilities/split.html

  The output files contain portions of the original input file;
  otherwise, unchanged.

I'm not sure whether the latter mandates the use of O_CREAT, but
I think failing here would be better than losing data.
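
For reference, the difference between "create" and "create, but fail
if it exists" at the open(2) level can be sketched as follows. Python's
os.open() is used here as a thin wrapper around open(2). The helper
create_output() is hypothetical, and whether split uses exactly these
flags is an assumption; the behavior observed above is merely
consistent with O_CREAT combined with O_TRUNC:

```python
import os
import tempfile


def create_output(path, exclusive):
    """Open `path` for writing and return the file descriptor.

    exclusive=True  -> O_CREAT|O_EXCL: fail if the file already exists.
    exclusive=False -> O_CREAT|O_TRUNC: silently empty an existing file.
    Hypothetical helper, not taken from split's source.
    """
    flags = os.O_WRONLY | os.O_CREAT
    flags |= os.O_EXCL if exclusive else os.O_TRUNC
    return os.open(path, flags, 0o666)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file-newaa")
    with open(path, "w") as f:
        f.write("existing data\n")

    # Exclusive creation refuses to touch the existing file.
    try:
        os.close(create_output(path, exclusive=True))
    except FileExistsError:
        print("O_EXCL: open failed, size still", os.path.getsize(path))

    # Plain creation with truncation destroys the old contents.
    os.close(create_output(path, exclusive=False))
    print("O_TRUNC: open succeeded, size now", os.path.getsize(path))
```

With O_EXCL the second split run would have stopped with an error
instead of replacing "file-newaa".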

Before looking into the code, do you think we should change this?

Have a nice day,
Berny


