coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: "ls -l": Avoid unnecessary getxattr() overhead


From: Jim Meyering
Subject: Re: "ls -l": Avoid unnecessary getxattr() overhead
Date: Fri, 17 Feb 2012 13:11:41 +0100

Bernhard Voelker wrote:
> On 02/17/2012 11:35 AM, Jim Meyering wrote:
>> Bernhard Voelker wrote:
>>> Alternatively, it could write a certain line to stdout or stderr,
>>> and for comparison, this pattern would have to be counted.
>
>> Thanks for the patch.
>> However, while I don't particularly like using fd 33, and might
>> use stderr instead, I do want to avoid stdout.  With it, there seems
>> to be too much risk that mixing wrapper-induced output with ls's normal
>> output would cause trouble, e.g., if your __SPEEDUP__ token were not
>> written right after a newline.
>>
>> Instead, I've realized that I can eliminate the need for the wc post-
>> processing by using the fact that our contrived output file is seekable:
>> rather than simply writing to fd, first seek to position 0 each time,
>> and write the count:
>>
>> Here's a stand-alone test script.
>> When it completes, the output file, "x"
>> contains either nothing or a count of the number of calls.
>>
> ...
>> Hmm... that means the case of no getxattr calls would still
>> require a little special handling to map "empty file" to "0".
>> With that, I still prefer the simpler, original write_1 function.
>
> I agree.
> We could maybe avoid the sed+mv by using a preprocessor define,
> similar as I did in my previous patch:
>
>
> diff --git a/tests/ls/getxattr-speedup b/tests/ls/getxattr-speedup
> index f3dc5bb..58fd90a 100755
> --- a/tests/ls/getxattr-speedup
> +++ b/tests/ls/getxattr-speedup
> @@ -32,19 +32,16 @@ cat > k.c <<'EOF' || framework_failure_
>  static void
>  write_1 (void)
>  {
> -  int fd = @FD@;
> -  write (fd, "x\n", 2);
> +  write (FD, "x\n", 2);
>  }
>  ssize_t getxattr (const char *path, const char *name, void *value, size_t 
> size)
>  { write_1 (); errno = ENOTSUP; return -1; }
>  ssize_t lgetxattr(const char *path, const char *name, void *value, size_t 
> size)
>  { write_1 (); errno = ENOTSUP; return -1; }
>  EOF
> -sed "s/@FD@/$fd/" k.c > k || framework_failure_
> -mv k k.c || framework_failure_
>
>  # Then compile/link it:
> -$CC -fPIC -O2 -c k.c || framework_failure_ 'failed to compile with -fPIC'
> +$CC -DFD=$fd -fPIC -O2 -c k.c || framework_failure_ 'failed to
> compile with -fPIC'
>  ld -G k.o -o k.so || framework_failure_ 'failed to invoke ld -G ...'

Definitely an improvement.  Thank you!

> BTW: I have another question: what if someone has used
> "./configure --disable-xattr"? ... surprisingly nothing:
> ls doesn't honor USE_XATTR. Is that correct?

That option is intended to disable copy.c's attribute-copying code.
I vaguely recall that initial libxattr implementations had problems.
It would not be feasible (or sensible, imho) to eliminate all getxattr
uses from ls.c.

Thanks again for all of the review comments.
Here's the updates series, along with this addition to NEWS:

** Improvements

  ls can be much more efficient, especially with large directories on file
  systems for which getfilecon-, ACL-check- and XATTR-check-induced syscalls
  fail with ENOTSUP or similar.


>From 783f67f9ba1823414e8ff12492f5a73aeb93a627 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Sat, 11 Feb 2012 12:36:57 +0100
Subject: [PATCH 1/3] ls: optimize for when getfilecon would often fail (~33%
 perf. gain)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On systems or file systems without SELinux support, all getfilecon
and lgetfilecon calls would fail due to lack of support.  We can non-
invasively cache such failure (on most recently accessed device) and
avoid the vast majority of the failing underlying getxattr syscalls.
* src/ls.c (errno_unsupported): New function.
(selinux_challenged_device): New file-scoped global.
(getfilecon_cache, lgetfilecon_cache): New error-caching wrapper
functions.
(gobble_file): Use the caching wrappers, for when many *getfilecon
calls would fail with ENOTSUP or EOPNOTSUPP.
Suggested by Sven Breuner in
http://thread.gmane.org/gmane.comp.gnu.coreutils.general/2187
Improved-by: Pádraig Brady.
---
 src/ls.c |   36 ++++++++++++++++++++++++++++++++----
 1 file changed, 32 insertions(+), 4 deletions(-)

diff --git a/src/ls.c b/src/ls.c
index f5cd37a..8a9cb41 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -2777,10 +2777,40 @@ clear_files (void)
   file_size_width = 0;
 }

+/* Return true if ERR implies lack-of-support failure by a
+   getxattr-calling function like getfilecon.  */
+static bool
+errno_unsupported (int err)
+{
+  return err == ENOTSUP || err == EOPNOTSUPP;
+}
+
+/* Cache *getfilecon failure, when it's trivial to do so.
+   Like getfilecon/lgetfilecon, but when F's st_dev says it's on a known-
+   SELinux-challenged file system, fail with ENOTSUP immediately.  */
+static int
+getfilecon_cache (char const *file, struct fileinfo *f, bool deref)
+{
+  /* st_dev of the most recently processed device for which we've
+     found that [l]getfilecon fails indicating lack of support.  */
+  static dev_t unsupported_device;
+
+  if (f->stat.st_dev == unsupported_device)
+    {
+      errno = ENOTSUP;
+      return -1;
+    }
+  int r = (deref
+           ? getfilecon (file, &f->scontext)
+           : lgetfilecon (file, &f->scontext));
+  if (r < 0 && errno_unsupported (errno))
+    unsupported_device = f->stat.st_dev;
+  return r;
+}
+
 /* Add a file to the current table of files.
    Verify that the file exists, and print an error message if it does not.
    Return the number of blocks that the file occupies.  */
-
 static uintmax_t
 gobble_file (char const *name, enum filetype type, ino_t inode,
              bool command_line_arg, char const *dirname)
@@ -2918,9 +2948,7 @@ gobble_file (char const *name, enum filetype type, ino_t 
inode,
         {
           bool have_selinux = false;
           bool have_acl = false;
-          int attr_len = (do_deref
-                          ?  getfilecon (absolute_name, &f->scontext)
-                          : lgetfilecon (absolute_name, &f->scontext));
+          int attr_len = getfilecon_cache (absolute_name, f, do_deref);
           err = (attr_len < 0);

           if (err == 0)
--
1.7.9.1.245.gbbe07


>From 14abb248030e1a708170d25957159598e2892165 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Mon, 13 Feb 2012 12:05:40 +0100
Subject: [PATCH 2/3] ls: cache ACL- and CAP-querying syscall failures

Like the optimization to avoid always-failing getfilecon calls,
this change avoids always-failing queries for whether a file has
a nontrivial ACL and for whether a file has certain "capabilities".
When such a query fails for one file (indicating no support), we know it
will always fail that way for the affected device.  With this change, we
have thus eliminated nearly all failing-unsupported getxattr syscalls.
* src/ls.c (has_capability) [!HAVE_CAP]: Set errno to ENOTSUP.
(errno_unsupported): Expand the list of E* errno values to match
that of lib/acl-internal.h's ACL_NOT_WELL_SUPPORTED macro.
(file_has_acl_cache, has_capability_cache): New functions.
(gobble_file): Use them in place of non-caching ones.
* NEWS (Improvements): Mention it.
Suggested by Sven Breuner in
http://thread.gmane.org/gmane.comp.gnu.coreutils.general/2187
While eliminating most getfilecon calls saved about 33%,
eliminating these other calls can save almost all of the
remaining ~67% cost, on some remote file systems.
---
 NEWS     |    6 ++++++
 src/ls.c |   61 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 63 insertions(+), 4 deletions(-)

diff --git a/NEWS b/NEWS
index 2080790..133f8fe 100644
--- a/NEWS
+++ b/NEWS
@@ -22,6 +22,12 @@ GNU coreutils NEWS                                    -*- 
outline -*-
   surprising rename no-op behavior).  Now, mv handles this case by skipping
   the usually-useless rename and simply unlinking A.

+** Improvements
+
+  ls can be much more efficient, especially with large directories on file
+  systems for which getfilecon-, ACL-check- and XATTR-check-induced syscalls
+  fail with ENOTSUP or similar.
+

 * Noteworthy changes in release 8.15 (2012-01-06) [stable]

diff --git a/src/ls.c b/src/ls.c
index 8a9cb41..92b17a4 100644
--- a/src/ls.c
+++ b/src/ls.c
@@ -2736,6 +2736,7 @@ has_capability (char const *name)
 static bool
 has_capability (char const *name ATTRIBUTE_UNUSED)
 {
+  errno = ENOTSUP;
   return false;
 }
 #endif
@@ -2778,11 +2779,16 @@ clear_files (void)
 }

 /* Return true if ERR implies lack-of-support failure by a
-   getxattr-calling function like getfilecon.  */
+   getxattr-calling function like getfilecon or file_has_acl.  */
 static bool
 errno_unsupported (int err)
 {
-  return err == ENOTSUP || err == EOPNOTSUPP;
+  return (err == EBUSY
+          || err == EINVAL
+          || err == ENOENT
+          || err == ENOSYS
+          || err == ENOTSUP
+          || err == EOPNOTSUPP);
 }

 /* Cache *getfilecon failure, when it's trivial to do so.
@@ -2808,6 +2814,53 @@ getfilecon_cache (char const *file, struct fileinfo *f, 
bool deref)
   return r;
 }

+/* Cache file_has_acl failure, when it's trivial to do.
+   Like file_has_acl, but when F's st_dev says it's on a file
+   system lacking ACL support, return 0 with ENOTSUP immediately.  */
+static int
+file_has_acl_cache (char const *file, struct fileinfo *f)
+{
+  /* st_dev of the most recently processed device for which we've
+     found that file_has_acl fails indicating lack of support.  */
+  static dev_t unsupported_device;
+
+  if (f->stat.st_dev == unsupported_device)
+    {
+      errno = ENOTSUP;
+      return 0;
+    }
+
+  /* Zero errno so that we can distinguish between two 0-returning cases:
+     "has-ACL-support, but only a default ACL" and "no ACL support". */
+  errno = 0;
+  int n = file_has_acl (file, &f->stat);
+  if (n <= 0 && errno_unsupported (errno))
+    unsupported_device = f->stat.st_dev;
+  return n;
+}
+
+/* Cache has_capability failure, when it's trivial to do.
+   Like has_capability, but when F's st_dev says it's on a file
+   system lacking capability support, return 0 with ENOTSUP immediately.  */
+static bool
+has_capability_cache (char const *file, struct fileinfo *f)
+{
+  /* st_dev of the most recently processed device for which we've
+     found that has_capability fails indicating lack of support.  */
+  static dev_t unsupported_device;
+
+  if (f->stat.st_dev == unsupported_device)
+    {
+      errno = ENOTSUP;
+      return 0;
+    }
+
+  bool b = has_capability (file);
+  if ( !b && errno_unsupported (errno))
+    unsupported_device = f->stat.st_dev;
+  return b;
+}
+
 /* Add a file to the current table of files.
    Verify that the file exists, and print an error message if it does not.
    Return the number of blocks that the file occupies.  */
@@ -2942,7 +2995,7 @@ gobble_file (char const *name, enum filetype type, ino_t 
inode,
       /* Note has_capability() adds around 30% runtime to 'ls --color'  */
       if ((type == normal || S_ISREG (f->stat.st_mode))
           && print_with_color && is_colored (C_CAP))
-        f->has_capability = has_capability (absolute_name);
+        f->has_capability = has_capability_cache (absolute_name, f);

       if (format == long_format || print_scontext)
         {
@@ -2967,7 +3020,7 @@ gobble_file (char const *name, enum filetype type, ino_t 
inode,

           if (err == 0 && format == long_format)
             {
-              int n = file_has_acl (absolute_name, &f->stat);
+              int n = file_has_acl_cache (absolute_name, f);
               err = (n < 0);
               have_acl = (0 < n);
             }
--
1.7.9.1.245.gbbe07


>From e1951a002eae92b33ea3310a394a065858670468 Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Mon, 13 Feb 2012 12:52:19 +0100
Subject: [PATCH 3/3] tests: test for ls speed-up

* tests/ls/getxattr-speedup: New test.
* tests/Makefile.am (TESTS): Add it.
Improved-by: Bernhard Voelker.
---
 tests/Makefile.am         |    1 +
 tests/ls/getxattr-speedup |   58 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)
 create mode 100755 tests/ls/getxattr-speedup

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 7b53681..625b636 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -426,6 +426,7 @@ TESTS =                                             \
   ls/dired                                     \
   ls/file-type                                 \
   ls/follow-slink                              \
+  ls/getxattr-speedup                          \
   ls/infloop                                   \
   ls/inode                                     \
   ls/m-option                                  \
diff --git a/tests/ls/getxattr-speedup b/tests/ls/getxattr-speedup
new file mode 100755
index 0000000..ecdd126
--- /dev/null
+++ b/tests/ls/getxattr-speedup
@@ -0,0 +1,58 @@
+#!/bin/sh
+# Show that we've eliminated most of ls' failing getxattr syscalls,
+# regardless of how many files are in a directory we list.
+# This test is skipped on systems that lack LD_PRELOAD support; that's fine.
+
+# Copyright (C) 2012 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+. "${srcdir=.}/init.sh"; path_prepend_ ../src
+print_ver_ ls
+
+fd=33
+
+# Replace each getxattr and lgetxattr call with a call to these stubs.
+# Each writes a single line to the specified FD so that we can later
+# determine the total number of calls.  FIXME: there must be a better way.
+cat > k.c <<'EOF' || framework_failure_
+#include <errno.h>
+#include <unistd.h>
+static void
+write_1 (void)
+{
+  write (FD, "x\n", 2);
+}
+ssize_t getxattr (const char *path, const char *name, void *value, size_t size)
+{ write_1 (); errno = ENOTSUP; return -1; }
+ssize_t lgetxattr(const char *path, const char *name, void *value, size_t size)
+{ write_1 (); errno = ENOTSUP; return -1; }
+EOF
+
+# Then compile/link it:
+$CC -DFD=$fd -fPIC -O2 -c k.c \
+  || framework_failure_ 'failed to compile with -fPIC'
+ld -G k.o -o k.so || framework_failure_ 'failed to invoke ld -G ...'
+
+# create a few files
+seq 100 | xargs touch || framework_failure_
+
+# Finally, run the test, redirecting
+eval "LD_PRELOAD=$PWD/k.so ls --color=always -l . $fd>x" || fail=1
+
+# Ensure that there were no more than 3 *getxattr calls.
+n_calls=$(wc -l < x)
+test "$n_calls" -le 3 || fail=1
+
+Exit $fail
--
1.7.9.1.245.gbbe07



reply via email to

[Prev in Thread] Current Thread [Next in Thread]