Re: [Findutils-patches] new predicate


From: Eric Blake
Subject: Re: [Findutils-patches] new predicate
Date: Thu, 27 May 2010 16:08:02 -0600
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100430 Fedora/3.0.4-3.fc13 Lightning/1.0b1 Mnenhy/0.8.2 Thunderbird/3.0.4

On 05/27/2010 03:49 PM, Konrad Eisele wrote:
>> Personally, I'm a bit reluctant to add this patch, because you can
>> achieve the same effect with more efficient use of existing predicates:
>>
>> find <srcdir> -type f -exec sh -c \
>>   'file "$@" | sed -n "s/:.*text.*//p"' sh {} + > file.list
> 
> Thanks, I wasn't aware of (or able to come up with)
> such an expression. For me this works well: my previous
> version would run forever, and this one is usable. I guess
> that even if my patch would be faster and simpler
> to type, it would introduce a dependency
> on libmagic that might not be worth the effort.

You are right that findutils is a core part of pretty much every system,
while libmagic is not.  It is not necessarily fair to embedded systems
to make a program as essential as findutils depend on libmagic.  I'll
let others chime in on whether the feature is worth adding, perhaps
conditioned on a ./configure option, but I personally would need more
convincing.
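
For anyone who wants to reuse the existing-predicate approach, here is a
minimal sketch of the same pipeline wrapped in a shell function.  The
name list_text_files is made up for illustration and is not part of
findutils.  It assumes file(1) is installed and that no file name
contains ':' or a newline, since it relies on file's "name: description"
output.

```shell
#!/bin/sh
# list_text_files DIR
# Hypothetical helper: print the regular files under DIR whose file(1)
# description mentions "text".
# Assumption: no path contains ':' or a newline.
list_text_files() {
    find "$1" -type f -exec sh -c '
        # "$@" is the batch of paths that find substituted for {} +
        # sed keeps only lines whose description contains "text" and
        # strips everything from the first colon onward.
        file "$@" | sed -n "s/:.*text.*//p"
    ' sh {} +
}
```

Usage would be something like 'list_text_files <srcdir> > file.list'.
Because of '{} +', find starts one sh (and one file) per batch of paths
rather than one per file, which is where the efficiency comes from.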

> 
> Here are the results of running it on the Linux
> source tree:
> 
> time /usr/bin/find /usr/src/linux-2.6.29.6/ -type f -exec sh -c 'file "$@" | 
> sed -n "s/:.*text.*//p"' sh {} + | xargs file $1
> real    3m17.519s
> user    5m0.162s
> sys     0m6.233s
> 
> time /usr/bin/find /usr/src/linux-2.6.29.6/ -dtype .*text.*  | xargs file $1

The comparison is not quite fair: '-type f' should be either used in
both runs or omitted from both.

> real    1m56.629s
> user    3m9.618s
> sys     0m3.565s

Are you taking file system caching effects into account here?  Perhaps
the second run was faster merely because the cache was hot?  One other
thing to benchmark would be the use of -execdir instead of -exec; while
it probably spawns more processes, each process is localized to one
directory rather than scattered over multiple locations, so there may be
some efficiency gains to offset the extra processes.
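
A rough sketch of how such a benchmark could be set up, with a warm-up
pass so that both timed runs start with a hot cache.  The function names
are invented for illustration, and 'time' is used as the bash keyword, so
this assumes bash rather than plain sh.  Note that GNU find refuses
-execdir when PATH contains a relative directory such as '.'.

```shell
#!/bin/bash
# Hypothetical helpers: count files under DIR that file(1) calls text.

# -exec: each file(1) invocation gets paths from many directories.
count_text_exec() {
    find "$1" -type f -exec file {} + | grep -c ':.*text'
}

# -execdir: each file(1) invocation runs inside one directory and only
# sees that directory's files (as ./name).
count_text_execdir() {
    find "$1" -type f -execdir file {} + | grep -c ':.*text'
}

# bench DIR: warm the cache once, then time both variants.
bench() {
    # Warm-up: read every file so neither timed run pays for cold I/O.
    find "$1" -type f -exec cat {} + > /dev/null 2>&1
    time count_text_exec "$1"
    time count_text_execdir "$1"
}
```

Running 'bench /usr/src/linux-2.6.29.6' would then print the two counts
(which should match) along with the timings.  I have not measured this
myself, so which variant wins is an open question.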

-- 
Eric Blake   address@hidden    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
