bug-findutils

[bug #54860] Less performance of -execdir echo {} +


From: Bernhard Voelker
Subject: [bug #54860] Less performance of -execdir echo {} +
Date: Tue, 23 Oct 2018 18:18:33 -0400 (EDT)
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0

Update of bug #54860 (project findutils):

                Severity:              3 - Normal => 1 - Wish               

    _______________________________________________________

Follow-up Comment #7:

Ha, it's the *directory order* that comes into play here!
That is, 'dir-2/another.c' has been created *later than* 'dir-2/dir-4':


mkdir -pv dir dir-2 dir-3/dir-{4,5}   \
  && touch dir/{bar,one}.c dir-2/file dir-2/foo.c dir-3/dir-4/file \
           dir-3/dir-5/file dir-3/file file \
  && mkdir dir-2/dir-4 dir-2/dir-5 \
  && touch dir-2/another.c \
  && touch dir-2/dir-4/file dir-2/dir-5/file


The underlying system call 'getdents' returns the directory entries of
'dir-2' unordered, i.e., not sorted alphabetically but in an order that
depends on the file system and, as here, on the order of creation:


getdents(6, [{d_ino=90672, d_off=1, d_reclen=24, d_name=".", d_type=DT_DIR},
             {d_ino=91825, d_off=2, d_reclen=24, d_name="..", d_type=DT_DIR},
             {d_ino=92462, d_off=3, d_reclen=32, d_name="another.c", d_type=DT_REG},
             {d_ino=90677, d_off=4, d_reclen=32, d_name="dir-5", d_type=DT_DIR},
             {d_ino=90676, d_off=5, d_reclen=32, d_name="dir-4", d_type=DT_DIR},
             {d_ino=91934, d_off=6, d_reclen=32, d_name="foo.c", d_type=DT_REG},
             {d_ino=91933, d_off=7, d_reclen=24, d_name="file", d_type=DT_REG}], 32768) = 200
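
For anyone who wants to look at the raw order without strace, here is a
minimal Python sketch (not part of findutils).  It relies on 'os.scandir',
which yields entries in whatever order the OS delivers them; the helper
name 'raw_and_sorted' is made up for this illustration, and the raw order
you get will depend on your file system.

```python
import os
import tempfile


def raw_and_sorted(path):
    """Return (names in the order the OS delivers them, names sorted)."""
    with os.scandir(path) as it:
        raw = [entry.name for entry in it]  # "." and ".." are not included
    return raw, sorted(raw)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        # Same creation sequence as for 'dir-2' above.
        for name in ("file", "foo.c"):
            open(os.path.join(top, name), "w").close()
        for name in ("dir-4", "dir-5"):
            os.mkdir(os.path.join(top, name))
        open(os.path.join(top, "another.c"), "w").close()

        raw, ordered = raw_and_sorted(top)
        print("raw:   ", raw)      # file-system dependent
        print("sorted:", ordered)
```
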

_find_ will process all entries in exactly that order, and between
'another.c' and 'foo.c' it processes the subdirectories dir-5 and dir-4.
As it has changed directory in between (internally), 'foo.c' is processed
in its own -execdir run.

There is also a comment on this in the code
<https://git.sv.gnu.org/cgit/findutils.git/tree/find/ftsfind.c?id=7741d79fa3f5#n563>:

/* If we changed level, perform any outstanding
 * execdirs.  If we see a sequence of directory entries
 * like this: fffdfffdfff, we could build a command line
 * of 9 files, but this simple-minded implementation
 * builds a command line for only 3 files at a time
 * (since fts descends into the directories).
 */
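
The effect described in that comment can be modelled with a small Python
sketch.  This is only a toy model, not the code in ftsfind.c: 'f' stands
for a file and 'd' for a subdirectory, the function names are invented
for this illustration, and command line length limits as well as whatever
happens inside the subdirectories are ignored.

```python
def simple_batches(entries):
    """Flush the pending batch whenever a directory ('d') is entered."""
    batches, pending = [], []
    for index, kind in enumerate(entries):
        if kind == "f":
            pending.append(index)
        else:  # 'd': the traversal changes level, so flush what we have
            if pending:
                batches.append(pending)
            pending = []
    if pending:
        batches.append(pending)
    return batches


def ideal_batches(entries):
    """Hypothetical alternative: keep collecting files across subdirectories."""
    files = [index for index, kind in enumerate(entries) if kind == "f"]
    return [files] if files else []


if __name__ == "__main__":
    seq = "fffdfffdfff"
    print([len(b) for b in simple_batches(seq)])  # [3, 3, 3]
    print([len(b) for b in ideal_batches(seq)])   # [9]
```

For the order seen in 'dir-2' above (another.c, dir-5, dir-4, foo.c, file,
i.e. "fddff"), the simple model yields one run with 1 file and one run
with 2 files.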


So _find_ works as intended (by the implementation).
I'm therefore marking the severity as 'wishlist'.

Of course, we can continue the discussion, and maybe someone (even me?)
may propose a patch to improve things.

The problem with that is that the whole point of using gnulib's FTS module
is that we don't have to care whether a directory has 2 entries or
4 million; in the latter case (or with even larger directories), sorting
is not possible anymore.
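
To illustrate the trade-off (this is NOT how gnulib's FTS is implemented,
just a Python sketch with a made-up helper name): entries can be consumed
in bounded chunks in directory order, whereas sorting needs every name of
the directory in memory at once.

```python
import os
import tempfile
from itertools import islice


def names_in_chunks(path, chunk_size):
    """Yield lists of at most chunk_size names, in directory order."""
    with os.scandir(path) as it:
        while True:
            # Pull the next chunk_size entries without reading the rest.
            chunk = [entry.name for entry in islice(it, chunk_size)]
            if not chunk:
                return
            yield chunk


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        for i in range(5):
            open(os.path.join(top, "file%d" % i), "w").close()

        # Streaming: at most 2 names are held at any time.
        for chunk in names_in_chunks(top, 2):
            print(chunk)

        # Sorting: all names have to be collected first.
        print(sorted(os.listdir(top)))
```
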

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?54860>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/
