From 882a6b0f36ae78252e1a384414a9728e07017ab7 Mon Sep 17 00:00:00 2001
From: Diego Ongaro
Date: Tue, 18 Aug 2020 16:51:53 -0700
Subject: [PATCH 2/3] find: Update docs for -s (sort)

---
 TODO          | 12 ++----------
 doc/find.texi | 24 ++++++++++++++++++++++--
 find/find.1   | 12 +++++++++++-
 3 files changed, 35 insertions(+), 13 deletions(-)

diff --git a/TODO b/TODO
index 6f0a5536..3760b7f4 100644
--- a/TODO
+++ b/TODO
@@ -5,16 +5,8 @@
 * man page for frcode
 Perhaps a better description in texi pages as well.
 
-* Add option for find to sort output in lexical order for use for updatedb
-olarsac@airfrance.fr (Olivier) made the following suggestion:
-
-As I was running thru the code looking for the bug I wondered why the updatedb
-has to use sort...
-why not add an option to find that sorts the output in lexical order?
-my point is:
-- sort on a big list is costly (here we do locate on big big file system)
-- find may (in theory) sort incrementally very easily by sorting only the current
-directory entries before recursion
+* Make updatedb use find -s (sort) where available, as suggested by
+olarsac@airfrance.fr (Olivier) long ago.
 
 * Include example of use of updatedb in documentation.
 Use something close to the Debian daily cron job.
diff --git a/doc/find.texi b/doc/find.texi
index ce63ca52..ebc8f8ee 100644
--- a/doc/find.texi
+++ b/doc/find.texi
@@ -3258,7 +3258,7 @@ discussed in this manual.
 @section Invoking @code{find}
 
 @example
-find @r{[-H] [-L] [-P] [-D @var{debugoptions}] [-O@var{level}]} @r{[}@var{file}@dots{}@r{]} @r{[}@var{expression}@r{]}
+find @r{[-H] [-L] [-P] [-s] [-D @var{debugoptions}] [-O@var{level}]} @r{[}@var{file}@dots{}@r{]} @r{[}@var{expression}@r{]}
 @end example
 
 @code{find} searches the directory tree rooted at each file name
@@ -3266,7 +3266,7 @@ find @r{[-H] [-L] [-P] [-D @var{debugoptions}] [-O@var{level}]} @r{[}@var{file}@
 the tree.
 
 The command line may begin with the @samp{-H}, @samp{-L}, @samp{-P},
-@samp{-D} and @samp{-O} options. These are followed by a list of
+@samp{-s}, @samp{-D} and @samp{-O} options. These are followed by a list of
 files or directories that should be searched. If no files to search
 are specified, the current directory (@file{.}) is used.
 
@@ -3330,6 +3330,26 @@ broken), it falls back on using the properties of the symbolic link
 itself. @ref{Symbolic Links} for a more complete description of how
 symbolic links are handled.
 
+The @samp{-s} option causes @code{find} to process files within each directory
+in sorted order by name, as defined by the current locale. Without this,
+@code{find} processes files in unspecified order.
+
+The exact ordering is determined by the @code{LC_COLLATE} setting in the
+current locale. To sort by byte order, use @code{LC_COLLATE=C}.
+
+The @samp{-s} option is more efficient than piping large amounts of output of
+@code{find} into the @code{sort} command, and it produces output incrementally
+rather than buffering it all. It's also more convenient when the output of a
+@code{find} command isn't line-oriented or the lines don't start with the
+filenames.
+
+Note that the output of @code{find} with @samp{-s} may differ from that of
+piping into the @code{sort} command. For example, @code{sort} isn't aware that
+``/'' separates directories and may output ``foo.baz'' before ``foo/bar''
+(depending on the locale). However, @code{find} with @samp{-s} will always
+process ``foo'' and its children first, since the name of the directory ``foo''
+always sorts before the name ``foo.baz''.
+
 @node Warning Messages
 @subsection Warning Messages
 
diff --git a/find/find.1 b/find/find.1
index 45895158..ca65d0ff 100644
--- a/find/find.1
+++ b/find/find.1
@@ -53,11 +53,12 @@ instead, anyway).
 This manual page talks about `options' within the expression list.
 These options control the behaviour of
 .B find
-but are specified immediately after the last path name. The five
+but are specified immediately after the last path name. The six
 `real' options
 .BR \-H ,
 .BR \-L ,
 .BR \-P ,
+.BR \-s ,
 .B \-D
 and
 .B \-O
@@ -214,6 +215,10 @@ is, any symbolic links appearing after
 on the command line will be dereferenced, and those before it will
 not).
 
+.IP \-s
+Process files within each directory in sorted order by name, as defined by the
+current locale. Without this, \fBfind\fR processes files in unspecified order.
+
 .IP "\-D debugopts"
 Print diagnostic information; this can be helpful to diagnose problems
 with why
@@ -1992,6 +1997,10 @@ interpret the response to
 .BR \-ok ,
 the interpretation of any bracket expressions in the pattern will be
 affected by `LC_COLLATE'.
+With the
+.B \-s
+option, the `LC_COLLATE' environment variable will also affect the order in
+which files are processed within each directory.
 
 .IP LC_CTYPE
 This variable affects the treatment of character classes used in
@@ -2381,6 +2390,7 @@ exit status was unaffected by the failure of
 .TS
 l l l .
 Feature	Added in	Also occurs in
+\-s	unreleased	BSD
 \-newerXY	4.3.3	BSD
 \-D	4.3.1
 \-O	4.3.1
-- 
2.27.0
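
Illustration (not part of the patch): the ordering difference described in the find.texi hunk can be reproduced with a small Python sketch. It is a rough model only, not find's implementation. The helper names walk_sorted and demo are hypothetical, and Python's code-point string ordering is assumed to stand in for LC_COLLATE=C on these ASCII names.

```python
import os
import tempfile


def walk_sorted(root):
    """Yield paths depth-first, sorting each directory's entries by name.

    Rough model of the per-directory ordering the patch documents for -s.
    """
    yield root
    if os.path.isdir(root) and not os.path.islink(root):
        # Code-point order; for ASCII names this matches byte order.
        for name in sorted(os.listdir(root)):
            yield from walk_sorted(os.path.join(root, name))


def demo():
    """Build foo/bar and foo.baz in a temp dir and return both orderings."""
    with tempfile.TemporaryDirectory() as top:
        os.mkdir(os.path.join(top, "foo"))
        open(os.path.join(top, "foo", "bar"), "w").close()
        open(os.path.join(top, "foo.baz"), "w").close()
        paths = [os.path.relpath(p, top) for p in walk_sorted(top)]
        paths = [p.replace(os.sep, "/") for p in paths if p != "."]
        # Second value models sorting the finished path list as plain strings.
        return paths, sorted(paths)


per_directory, whole_list = demo()
print(per_directory)  # ['foo', 'foo/bar', 'foo.baz']
print(whole_list)     # ['foo', 'foo.baz', 'foo/bar']
```

Sorting whole path strings places "foo.baz" before "foo/bar" because "." (0x2E) precedes "/" (0x2F) in byte order, while the per-directory traversal finishes "foo" and its children before moving on to "foo.baz".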