bug-findutils

Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?


From: Bernhard Voelker
Subject: Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?
Date: Wed, 24 Jan 2018 08:39:25 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2

On 01/24/2018 01:44 AM, Peng Yu wrote:
> The attached files are the strace results for `echo` and `find`. Can
> anybody check if there is a way to improve the performance of `find`
> so that it can work as efficient as `echo` in this test case? Thanks.
>
> $ cat main.sh
> #!/usr/bin/env bash
> # vim: set noexpandtab tabstop=2:
>
> echo *.txt
> $ strace ./main.sh 2>/tmp/echo_strace.txt
> $ strace find -name '*.txt' > /dev/null 2>/tmp/find_strace.txt

First of all, please refrain from attaching such huge files when
sending to mailing lists like this; either upload them to a web
pastebin, or at least compress them: the larger file would then
have taken less than 100k instead of 2.3M.  Thanks.

Regarding the strace outputs: you followed neither James's tip
(use "strace -c ...") nor Dale's (use "find -maxdepth 1 ...").
Still, just from the number of system calls one can already guess
that the time is spent in the newfstatat() calls.
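
For illustration, the two tips could be tried roughly as follows. This is
only a sketch under assumptions: the scratch directory and the file names in
it are hypothetical, and the strace step is skipped when strace is not
installed or not permitted to trace.

```shell
#!/usr/bin/env bash
# Hypothetical scratch tree: two .txt files at the top level, one in a subdirectory.
workdir=$(mktemp -d)
touch "$workdir/a.txt" "$workdir/b.txt" "$workdir/c.log"
mkdir "$workdir/sub"
touch "$workdir/sub/d.txt"

# Dale's tip: -maxdepth 1 keeps find in the starting directory,
# which is the fair comparison to the shell glob `echo *.txt`.
shallow=$(find "$workdir" -maxdepth 1 -name '*.txt' | wc -l)
# Without -maxdepth, find also descends into sub/.
deep=$(find "$workdir" -name '*.txt' | wc -l)
echo "with -maxdepth 1: $shallow match(es); recursive: $deep match(es)"

# James's tip: strace -c prints a per-syscall count/time summary
# instead of logging every single call.
if command -v strace >/dev/null 2>&1; then
    strace -c -o "$workdir/find_summary.txt" \
        find "$workdir" -maxdepth 1 -name '*.txt' >/dev/null 2>&1 || true
    cat "$workdir/find_summary.txt" 2>/dev/null
fi

rm -rf "$workdir"
```

The summary is far smaller than a full trace and shows directly which system
call dominates.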

We don't see what the preceding getdents() calls return (that would
need "strace -v"), but it seems that they don't include d_type
information on glusterfs. Therefore, as you omitted the '-maxdepth 1'
argument, find has to stat each entry to check whether it is a
directory (which it would then need to recurse into).
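
To check that guess, one could look at the entry types in a verbose trace.
The following is a sketch only: the helper name count_unknown_dtype is
made up for this example, it assumes that strace -v prints each entry's type
as "d_type=DT_...", and it traces getdents64 (older systems may issue
getdents instead).

```shell
#!/usr/bin/env bash
# Hypothetical helper: count directory entries that an strace log
# reports with an unknown type (d_type=DT_UNKNOWN).
count_unknown_dtype() {
    grep -o 'd_type=DT_UNKNOWN' "$1" | wc -l
}

trace=$(mktemp)
if command -v strace >/dev/null 2>&1; then
    # -v makes strace print the decoded directory entries in full.
    strace -v -e trace=getdents64 -o "$trace" \
        find . -maxdepth 1 -name '*.txt' >/dev/null 2>&1 || true
    echo "entries with DT_UNKNOWN: $(count_unknown_dtype "$trace")"
fi
rm -f "$trace"
```

A non-zero count on the glusterfs mount, compared with zero on a local file
system, would support the explanation above.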

BTW: you already got the same answer on your cross-posting [1].

[1] https://lists.gnu.org/r/coreutils/2018-01/msg00058.html

Have a nice day,
Berny





