findutils-patches
Re: [Findutils-patches] new predicate


From: Konrad Eisele
Subject: Re: [Findutils-patches] new predicate
Date: Fri, 28 May 2010 09:34:49 +0200

-------- Original Message --------
> Date: Thu, 27 May 2010 16:08:02 -0600
> From: Eric Blake <address@hidden>
> To: Konrad Eisele <address@hidden>
> CC: address@hidden
> Subject: Re: [Findutils-patches] new predicate

> On 05/27/2010 03:49 PM, Konrad Eisele wrote:
> >> Personally, I'm a bit reluctant to add this patch, because you can
> >> achieve the same effect with more efficient use of existing predicates:
> >>
> >> find <srcdir> -type f -exec sh -c \
> >>   'file "$@" | sed -n "s/:.*text.*//p"' sh {} + > file.list
> > 
> > Thanks, I wasn't aware of (or able to come up with)
> > such an expression. For me this works well: my previous
> > version would run forever, while this one is usable. I guess
> > that even if my patch were faster and
> > simpler to type, it would introduce a dependency
> > on libmagic that might not be worth the effort.
> 
> You are right that findutils is a core part of pretty much every system,
> while libmagic is not.  It is not necessarily fair to embedded systems
> to make a program so essential as findutils depend on libmagic.  I'll
> let others chime in on whether the feature is worth adding, perhaps
> conditioned on a ./configure option, but I personally would need more
> convincing.
> 
> > 
> > Here are the results of running it on the Linux
> > source tree:
> > 
> > time /usr/bin/find /usr/src/linux-2.6.29.6/ -type f -exec sh -c 'file "$@" | sed -n "s/:.*text.*//p"' sh {} + | xargs file $1
> > real    3m17.519s
> > user    5m0.162s
> > sys     0m6.233s
> > 
> > time /usr/bin/find /usr/src/linux-2.6.29.6/ -dtype .*text.*  | xargs file $1
> 
> The comparison is not quite fair, since you probably want to use or omit
> '-type f' equally between the two runs.
> 
> > real    1m56.629s
> > user    3m9.618s
> > sys     0m3.565s
> 
> Are you taking file system caching effects into account here?  Perhaps
> the second run was faster merely because the cache was hot?  One other
> thing to benchmark would be the use of -execdir instead of -exec; while
> it probably spawns more processes, each process is localized to one
> directory rather than scattered over multiple locations, so there may be
> some efficiency gains to offset the extra processes.

I fiddled around a bit and found the "file -f -" invocation of
file, which reads the list of file names from standard input. The
following version is as fast as my custom "-dtype" patch:

find <srcdir> | file -F 0xdedbe -f - | sed -n "s/0xdedbe.*text.*//p"

Of course, it would be easier to have the "file" functionality
directly accessible in "find"...
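
For reuse, the pipeline above could be wrapped in a small shell function.
This is only a sketch: the function name list_text_files and the SEP
variable are made up for illustration, and it adds "-type f" (as in the
earlier -exec version) so that directories are not passed to file. It
assumes no file name contains a newline or the separator string.

```sh
#!/bin/sh
# Sketch: list files whose file(1) description mentions "text".
# SEP replaces the default ":" separator in file's output, so that
# colons inside file names cannot confuse the sed expression.
# Assumption: the marker string never occurs in a file name.
SEP='0xdedbe'

# list_text_files DIR  (hypothetical helper, not part of findutils)
list_text_files() {
    # find emits one path per line; "file -f -" reads that list from
    # standard input, so a single file process handles all paths.
    find "$1" -type f |
        file -F "$SEP" -f - |
        # Print only lines whose description contains "text", with
        # the separator and description stripped off.
        sed -n "s/${SEP}.*text.*//p"
}
```

It would be called as, for example, list_text_files <srcdir> > file.list.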

> 
> -- 
> Eric Blake   address@hidden    +1-801-349-2682
> Libvirt virtualization library http://libvirt.org
> 
