bug-findutils

Is there a memory leak in find ?


From: Paul E Condon
Subject: Is there a memory leak in find ?
Date: Tue, 8 Apr 2014 13:38:57 -0600
User-agent: Mutt/1.5.21 (2010-09-15)

This is my first email to this list, so don't hesitate to correct me when I
violate the local etiquette.

I am attempting to use find in a way for which I can find little prior
art.  I have a huge file structure on a 3 terabyte hard drive. I want
to generate a single file that lists all the plain files, directory
files, and symlinks in the whole structure. I don't care how long it
takes, but I do want it to terminate without error. When I run it, I
can monitor memory usage using Gnome-system-monitor and I can see the
memory usage slowly rising. 

I am using Debian Wheezy on an oldish HP desktop tower with a Pentium
dual-core CPU. I have two of these 3 terabyte disks and several 500 gigabyte
disks, all with USB2 interfaces. My project is an attempt at a massive
at-home de-duplication of backup data structures that have accumulated
over time. Are there difficulties with coreutils on these big disks? My
program is a script written in bash. It runs for about an hour and quits
without any error message that I can find, at a point where it is deep in
a very deep tree traversal and, as I say, with memory usage growing.

Some tips on gathering debug data would be much appreciated.
The script follows:
<begin>
#!/bin/bash
this=bld-spcl
# for use in grand cleanup on disk, gfx2 starting 20140407,pec
#
# 
###############################
. /root/bin/arx-declares
export arxiv="$PWD"


find glbl -warn -depth -mindepth 1                                    \(    \
    \( -type f -empty -printf "mdx f e hst etm ${fndnfmt}\n"       \) -o    \
    \( -type f        -printf "mdx f f hst etm ${fndnfmt}\n"       \) -o    \
    \( -type d        -printf "mdx f d hst etm ${fndnfmt}\n"       \) -o    \
    \( -type l        -printf "mdx f l hst etm ${fndnfmt} -> %l\n" \) \) |& \
    sed 's|\.\([0-9]\)\{10\} glbl| glbl|' > "$arxiv/find.out_$('date' $TIME_STYLE)" 2>&1
exit
#####################################
# mnemonic field names for $fndnfmt: 
#                           1   2   3   4   5   6   7   8
#      mdx fqx dfl hst etm nod mod lnx usr grp siz mtm fqnm
#       1   2   3   4   5   6   7   8   9  10  11   12  13
##########################################
<eof>

Relevant declarations from arx-declares are:
export TIME_STYLE='+%Y%m%d_%H%M%S'
export fndnfmt="%i %M %n %U %G %s %TY%Tm%Td_%TH%TM%TS %p"

The directory, glbl, is the root of the file tree that I am trying to
exhaustively search. The sed step is there to strip the subsecond
decimal fractions from the timestamps in the output. It works correctly
a few million times before the crash. The fixed fields in the output are
placeholders for data that will be filled in by future processing of
this output file, if/when I get this script working.
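
To illustrate what the sed expression does, here is a minimal sketch that applies it to a single line. The sample line is made up (the inode, owner, size, and path are hypothetical, not taken from my disk); it only assumes that %TS prints seconds followed by a ten-digit fraction, which is what the pattern expects.

```shell
#!/bin/bash
# Hypothetical line in the shape produced by the -type f branch of the script.
line='mdx f f hst etm 131 -rw-r--r-- 1 root root 42 20140408_133857.0000000000 glbl/a/b.txt'

# Same sed expression as in the script: drop a dot plus ten digits
# when they are followed by " glbl" (the start of the %p field).
cleaned=$(printf '%s\n' "$line" | sed 's|\.\([0-9]\)\{10\} glbl| glbl|')

printf '%s\n' "$cleaned"
# prints: mdx f f hst etm 131 -rw-r--r-- 1 root root 42 20140408_133857 glbl/a/b.txt
```

Because the pattern is anchored on " glbl", it only fires where the fraction sits directly before the path field. With no g flag, it rewrites at most one match per line.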

Want more information? Ask, please.

-- 
Paul E Condon           
address@hidden



