bug-findutils
Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?


From: Peng Yu
Subject: Re: Why is `find -name '*.txt'` much slower than '*.txt' on glusterfs?
Date: Sat, 27 Jan 2018 10:39:01 -0600

> Is your find binary built with D_TYPE support?
>
>   $ find --version
>   find (GNU findutils) 4.6.0
>   Copyright (C) 2015 Free Software Foundation, Inc.
>   ...
>   Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION
> FTS(FTS_CWDFD) CBO(level=2)
> ____________________^^^^^^

$ find --version
find (GNU findutils) 4.6.0
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

Written by Eric B. Decker, James Youngman, and Kevin Dalley.
Features enabled: D_TYPE O_NOFOLLOW(enabled) LEAF_OPTIMISATION
FTS(FTS_CWDFD) CBO(level=2)

> Would you please try to reproduce this on a local file system, e.g. ext4?

It is much faster.

$ time find -maxdepth 1 -name '*.tsv' |wc -l
8026

real    0m0.106s
user    0m0.062s
sys    0m0.060s

> Finally, use "strace -v find ..." so that we see whether the 'getdents'
> system call returns D_TYPE information:

Here are the first 100 lines of the output of running `strace -ve
getdents find -maxdepth 1 -name '*.tsv'` on the glusterfs directory:

https://pastebin.com/XxfFJJj4

>   $ strace -ve getdents  find -maxdepth 1 -name '*.tsv'
>   getdents(4, [{d_ino=4276237, d_off=4278742733963192100, d_reclen=24,
> d_name=".", d_type=DT_DIR},
>                {d_ino=4055085, d_off=8511941719133486354, d_reclen=24,
> d_name="..", d_type=DT_DIR},
>                {d_ino=4276239, d_off=9223372036854775807, d_reclen=24,
> d_name="file", d_type=DT_REG}],
>                32768) = 72
>   getdents(4, [], 32768)                  = 0
>
> In the end, it may turn out that either your 'find' binary is not compiled
> with D_TYPE support, or that glusterfs doesn't provide this information
> (and therefore find needs to invoke the additional newfstatat()s.

Please let me know which of the two cases applies to my example.
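
One way to tell the two cases apart from a saved trace is to tally the
d_type values that getdents reported. Below is a minimal Python sketch,
assuming the strace output has been saved as text; the helper name
count_d_types and the file name trace.txt are made up for illustration.
If most entries come back as DT_UNKNOWN, the file system is presumably
not supplying type information, which would match the second case
described above.

```python
import re
from collections import Counter


def count_d_types(strace_text):
    """Tally the d_type values found in `strace -v` getdents output.

    Only the text is inspected, so this works for both getdents and
    getdents64 lines. Returns a Counter such as {"DT_REG": 8026, ...}.
    """
    # strace -v prints each directory entry as {..., d_type=DT_XXX}.
    return Counter(re.findall(r"d_type=(DT_\w+)", strace_text))


# Hypothetical usage, assuming the trace was saved with something like
#   strace -ve getdents -o trace.txt find -maxdepth 1 -name '*.tsv'
#
# with open("trace.txt") as fh:
#     for d_type, n in count_d_types(fh.read()).most_common():
#         print(d_type, n)
```

On the three-entry trace quoted above, this would count two DT_DIR
entries and one DT_REG entry, and no DT_UNKNOWN.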

-- 
Regards,
Peng


