Re: Performance enhancement change - avoiding lstat()
From: James Youngman
Subject: Re: Performance enhancement change - avoiding lstat()
Date: Sun, 23 Jan 2005 10:54:42 +0000
User-agent: Mutt/1.3.28i
On Sat, Jan 22, 2005 at 08:55:46PM +0100, Jim Meyering wrote:
> James Youngman <address@hidden> wrote:
> ...
> > Please let me know how you get on. I don't have any filesystems whose
> > metadata size is significantly greater than the size of the system
> > memory, so although I can see a far smaller number of lstat() calls, I
> > find it hard to pin down a measurable performance improvement.
>
> Wouldn't it be apparent on networked file systems like NFS and CODA?
You're right. It's an easy test to do, so I went ahead and did it. I
just performed the test using NFS mounted over the loopback network
interface, and there is a noticeable difference. My test filesystem
in this instance contains 32209 regular files, 3254 symbolic links,
and 1455 directories.
Using the profiling features of "strace -c" I find that the smaller
number of calls to lstat() does indeed represent a genuine saving in
run time:-
findutils 4.2.12:-
$ cat TRACE.prof.slow
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 79.59    2.924167          66     44196           lstat64
 15.90    0.584020         134      4373        11 open
  2.90    0.106544          37      2902           chdir
  1.02    0.037443          12      3076           getdents64
  0.36    0.013053           3      4362           close
  0.10    0.003741           3      1459           fstat64
  0.09    0.003179           2      1451           fcntl64
  0.03    0.001132          71        16           write
  0.01    0.000308          16        19           read
  0.00    0.000096           5        18           brk
  0.00    0.000060          12         5           mmap2
  0.00    0.000057           7         8           old_mmap
  0.00    0.000037           9         4           munmap
  0.00    0.000012           6         2         2 access
  0.00    0.000008           8         1           ioctl
  0.00    0.000007           4         2           fchdir
  0.00    0.000005           5         1           time
  0.00    0.000005           5         1           uname
  0.00    0.000004           4         1           set_thread_area
------ ----------- ----------- --------- --------- ----------------
100.00    3.673878                     61897        13 total
Current development code with the enhancement enabled:-
$ cat TRACE.prof.opt
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 44.45    0.508704         116      4373        11 open
 42.44    0.485672          39     12413           lstat64
  8.14    0.093108          32      2902           chdir
  3.24    0.037043          12      3076           getdents64
  1.08    0.012413           3      4362           close
  0.33    0.003752           3      1459           fstat64
  0.27    0.003120           2      1451           fcntl64
  0.02    0.000251          13        19           read
  0.01    0.000157          10        16           write
  0.01    0.000059           4        14           brk
  0.00    0.000056          11         5           mmap2
  0.00    0.000050           6         8           old_mmap
  0.00    0.000036           9         4           munmap
  0.00    0.000010           5         2         2 access
  0.00    0.000007           7         1           ioctl
  0.00    0.000007           4         2           fchdir
  0.00    0.000006           6         1           uname
  0.00    0.000005           5         1           time
  0.00    0.000004           4         1           set_thread_area
------ ----------- ----------- --------- --------- ----------------
100.00    1.144460                     30110        13 total
If I then disable the optimisation in the current development code,
I still get a 3.7 second runtime, with 44196 calls to lstat64(), so it
looks like there are no other effects muddying the waters.
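
To put the two summaries side by side, here is a small sketch that works out the relative savings. The figures are copied from the strace output above; the variable and function names are just illustrative, and the "seconds" column is the time strace attributes to system calls, not wall-clock time.

```python
# Figures copied from the two "strace -c" summaries above.
slow = {"total_seconds": 3.673878, "lstat_seconds": 2.924167, "lstat_calls": 44196}
opt = {"total_seconds": 1.144460, "lstat_seconds": 0.485672, "lstat_calls": 12413}


def reduction(before, after):
    """Fractional reduction going from `before` to `after`."""
    return (before - after) / before


calls_cut = reduction(slow["lstat_calls"], opt["lstat_calls"])
time_cut = reduction(slow["total_seconds"], opt["total_seconds"])

# Share of the total saving that comes from the lstat64 row alone.
lstat_share = (slow["lstat_seconds"] - opt["lstat_seconds"]) / (
    slow["total_seconds"] - opt["total_seconds"]
)

print(f"lstat64 calls avoided: {slow['lstat_calls'] - opt['lstat_calls']} ({calls_cut:.1%})")
print(f"system-call time saved: {time_cut:.1%}")
print(f"share of that saving due to lstat64: {lstat_share:.1%}")
```

On these numbers that is roughly 72% fewer lstat64 calls and about 69% less time spent in system calls, with nearly all of the saving coming from the lstat64 row.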
I assume that any improvement which is visible using NFS over a
loopback network interface will be greater with NFS over a real
network, because of the additional latency.
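
A rough way to see why: if each lstat() that is avoided also avoids one network round trip, the extra saving grows linearly with the round-trip time. The sketch below assumes exactly that. The round-trip values are hypothetical, chosen only for illustration, and are not measurements; attribute caching in a real NFS client would reduce the number of round trips, so treat the output as an upper-bound style estimate.

```python
# Call counts taken from the two strace summaries above.
LSTAT_CALLS_BEFORE = 44196
LSTAT_CALLS_AFTER = 12413


def extra_time_saved(round_trip_seconds):
    """Estimated extra saving if every avoided lstat() costs one round trip.

    ASSUMPTION: one network round trip per lstat() call and no
    client-side attribute caching.
    """
    avoided = LSTAT_CALLS_BEFORE - LSTAT_CALLS_AFTER
    return avoided * round_trip_seconds


# Hypothetical round-trip times (seconds), for illustration only.
for rtt in (0.0002, 0.001, 0.005):
    print(f"rtt={rtt * 1000:.1f} ms -> about {extra_time_saved(rtt):.1f} s saved")
```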
I'm very happy with these results. I think I will make a findutils
release containing the enhancement code, but leave the enhancement
disabled by default. The optimisation can be enabled by specifying
"--enable-d_type-optimisation" to "configure".
Regards,
James.
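
For readers unfamiliar with the idea behind the "--enable-d_type-optimisation" option named above: on many systems readdir() returns a d_type field in struct dirent that already says whether an entry is a regular file, directory or symbolic link, so a separate lstat() is only needed when the type is unknown. The sketch below is not the findutils code (which is C); it is a minimal Python illustration of the same idea, relying on os.scandir(), whose entries are documented to answer type queries without an extra system call in most cases. The function name count_entries is made up for this example.

```python
import os


def count_entries(root):
    """Walk `root` without following symlinks, classifying each entry.

    os.scandir() caches the type reported by the directory read, so the
    is_symlink()/is_dir()/is_file() calls below usually need no extra
    stat-like system call; the OS falls back to one only when the
    filesystem does not report the entry type.
    """
    counts = {"files": 0, "symlinks": 0, "dirs": 0, "other": 0}
    stack = [root]
    while stack:
        path = stack.pop()
        with os.scandir(path) as entries:
            for entry in entries:
                if entry.is_symlink():
                    counts["symlinks"] += 1
                elif entry.is_dir(follow_symlinks=False):
                    counts["dirs"] += 1
                    stack.append(entry.path)  # descend later
                elif entry.is_file(follow_symlinks=False):
                    counts["files"] += 1
                else:
                    counts["other"] += 1  # sockets, FIFOs, devices
    return counts
```

Running a function like this under "strace -c" on a filesystem that reports entry types is one way to watch the number of stat-family calls stay small; filesystems that return an unknown type will still trigger the fallback.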