bug-fileutils

Re: Memory consumption with cp -l (fwd)


From: Jim Meyering
Subject: Re: Memory consumption with cp -l (fwd)
Date: Mon, 10 Mar 2003 18:39:02 +0100

...
> Actually I had a hard time reproducing the bug on Computer 2: When I
> copied part of the original material the mem usage got large quickly. But
> when I copied the copy the mem usage was fairly low. This got me thinking:
> What is unique about the original? And there _is_ something unique: More
> than 99% of the files have > 10 hardlinks. It seems _this_ is the cause.

Thank you for investigating that!
Knowing the bit about hard links, the increased memory footprint is
understandable.  The additional memory usage comes from the part of copy.c
that I've included below.  The overhead is incurred only when the link
count is 2 or greater.

There's probably a way to save some space in cases like yours.
If there were (is?) a way to make a hard link given only a dev/inode pair,
we could save the destination dev/inode instead of the destination file name.

Jim

--------------
  /* Associate the destination path with the source device and inode
     so that if we encounter a matching dev/ino pair in the source tree
     we can arrange to create a hard link between the corresponding names
     in the destination tree.

     Sometimes, when preserving links, we have to record dev/ino even
     though st_nlink == 1:
     - when using -H and processing a command line argument;
        that command line argument could be a symlink pointing to another
        command line argument.  With `cp -H --preserve=link', we hard-link
        those two destination files.
     - likewise for -L except that it applies to all files, not just
        command line arguments.

     Also record directory dev/ino when using --recursive.  We'll use that
     info to detect this problem: cp -R dir dir.  FIXME-maybe: ideally,
     directory info would be recorded in a separate hash table, since
     such entries are useful only while a single command line hierarchy
     is being copied -- so that separate table could be cleared between
     command line args.  Using the same hash table to preserve hard
     links means that it may not be cleared.  */

  if ((x->preserve_links
       && (1 < src_sb.st_nlink
           || (command_line_arg
               && x->dereference == DEREF_COMMAND_LINE_ARGUMENTS)
           || x->dereference == DEREF_ALWAYS))
      || (x->recursive && S_ISDIR (src_type)))
    {
      earlier_file = remember_copied (dst_path, src_sb.st_ino, src_sb.st_dev);
    }
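
To illustrate where the memory goes, here is a minimal sketch of a table
keyed on the source dev/ino pair that stores a copy of the destination
path.  It is not the actual copy.c code: the names src_to_dest, bucket_of,
remember_copied_sketch and N_BUCKETS are hypothetical, and the fixed-size
chained table is an assumption made to keep the example short.  Each
recorded file costs one entry plus a heap copy of its destination name, so
a tree in which nearly every file has a link count above 1 pays that cost
for nearly every file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical entry: one per remembered source file.  */
struct src_to_dest
{
  ino_t st_ino;              /* source inode number */
  dev_t st_dev;              /* source device number */
  char *name;                /* heap copy of the destination path */
  struct src_to_dest *next;  /* next entry in the same bucket */
};

#define N_BUCKETS 1031       /* arbitrary prime, chosen for the sketch */

static struct src_to_dest *table[N_BUCKETS];

/* Map a dev/ino pair to a bucket index.  */
static size_t
bucket_of (ino_t ino, dev_t dev)
{
  unsigned long long h
    = (unsigned long long) ino ^ ((unsigned long long) dev << 16);
  return (size_t) (h % N_BUCKETS);
}

/* If INO/DEV has been recorded already, return the destination name
   stored at that time.  Otherwise record a copy of NAME and return NULL.  */
static const char *
remember_copied_sketch (const char *name, ino_t ino, dev_t dev)
{
  size_t b = bucket_of (ino, dev);
  struct src_to_dest *e;

  assert (name != NULL);

  for (e = table[b]; e != NULL; e = e->next)
    if (e->st_ino == ino && e->st_dev == dev)
      return e->name;

  e = malloc (sizeof *e);
  if (e == NULL)
    abort ();
  e->name = malloc (strlen (name) + 1);   /* grows with path length */
  if (e->name == NULL)
    abort ();
  strcpy (e->name, name);
  e->st_ino = ino;
  e->st_dev = dev;
  e->next = table[b];
  table[b] = e;
  return NULL;
}
```

In this sketch, a non-NULL return value plays the role of earlier_file
above: the caller would hard-link the new destination to that name instead
of copying the data again.  Storing a destination dev/ino pair in place of
name would make every entry a fixed size, but, as noted above, that only
helps if a hard link can be made from a dev/inode pair alone.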



