Re: [Findutils-patches] new predicate
From: Konrad Eisele
Subject: Re: [Findutils-patches] new predicate
Date: Fri, 28 May 2010 09:34:49 +0200
-------- Original Message --------
> Date: Thu, 27 May 2010 16:08:02 -0600
> From: Eric Blake <address@hidden>
> To: Konrad Eisele <address@hidden>
> CC: address@hidden
> Subject: Re: [Findutils-patches] new predicate
> On 05/27/2010 03:49 PM, Konrad Eisele wrote:
> >> Personally, I'm a bit reluctant to add this patch, because you can
> >> achieve the same effect with more efficient use of existing predicates:
> >>
> >> find <srcdir> -type f -exec sh -c \
> >> 'file "$@" | sed -n "s/:.*text.*//p"' sh {} + > file.list
> >
> > Thanks, I wasn't aware of (or able to come up with)
> > such an expression. For me this works well: my previous
> > version would run forever, but this one is usable. I guess
> > that even if my patch were faster and simpler to type,
> > it would introduce a dependency on libmagic that might
> > not be worth the effort.
>
> You are right that findutils is a core part of pretty much every system,
> while libmagic is not. It is not necessarily fair to embedded systems
> to make a program so essential as findutils depend on libmagic. I'll
> let others chime in on whether the feature is worth adding, perhaps
> conditioned on a ./configure option, but I personally would need more
> convincing.
>
> >
> > Here are the results of running it on the Linux
> > source tree:
> >
> > time /usr/bin/find /usr/src/linux-2.6.29.6/ -type f -exec sh -c 'file "$@" | sed -n "s/:.*text.*//p"' sh {} + | xargs file $1
> > real 3m17.519s
> > user 5m0.162s
> > sys 0m6.233s
> >
> > time /usr/bin/find /usr/src/linux-2.6.29.6/ -dtype .*text.* | xargs file $1
>
> The comparison is not quite fair, since you probably want to use or omit
> '-type f' equally between the two runs.
>
> > real 1m56.629s
> > user 3m9.618s
> > sys 0m3.565s
>
> Are you taking file system caching effects into account here? Perhaps
> the second run was faster merely because the cache was hot? One other
> thing to benchmark would be the use of -execdir instead of -exec; while
> it probably spawns more processes, each process is localized to one
> directory rather than scattered over multiple locations, so there may be
> some efficiency gains to offset the extra processes.
I fiddled around a bit and found the "file -f -" invocation of
file, which reads the list of file names from standard input.
The following version is therefore as fast as my custom
"-dtype" patch:
find <srcdir> | file -F 0xdedbe -f - | sed -n "s/0xdedbe.*text.*//p"
Of course it would be easier to have the "file" functionality
directly accessible in "find"...
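As a minimal sketch, the pipeline above can be wrapped in a small
shell function. The name list_text_files and the added "-type f"
are assumptions for illustration, not part of the command as I ran
it. File names that contain a newline or the separator string
would break it.

```shell
# Hypothetical helper: print the names of all regular files under $1
# that file(1) describes as some kind of text.
list_text_files() {
    # Separator placed between file name and description; chosen
    # (as an assumption) to be unlikely to occur in a file name.
    sep='0xdedbe'
    # find emits one name per line, "file -f -" reads that list from
    # stdin, and sed keeps only lines whose description mentions
    # "text", stripping everything from the separator onwards.
    find "$1" -type f | file -F "$sep" -f - | sed -n "s/${sep}.*text.*//p"
}
```

Usage would look like: list_text_files <srcdir> > file.list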
>
> --
> Eric Blake address@hidden +1-801-349-2682
> Libvirt virtualization library http://libvirt.org
>