bug-coreutils

Re: sort: memory exhausted with 50GB file


From: Jim Meyering
Subject: Re: sort: memory exhausted with 50GB file
Date: Sat, 26 Jan 2008 16:30:00 +0100

Leo Butler <address@hidden> wrote:

> < Paul Eggert <address@hidden> wrote:
> < ...
> < > Hmm, it sounds like your input data has some very long lines, then.
> < > That would explain at least part of your problem, then.  'sort' needs
> < > to keep at least two lines in main memory to compare them: if single
> < > input lines are many gigabytes long, then 'sort' must consume many
> < > gigabytes of memory, regardless of what parameter you specify with '-S'.
> <
> < You can run this to find the maximum line length:
> <
> <   wc --max-line-length your-data
...
> $ /usr/bin/wc -L /data/espace/k_400_a.out
> 107

That would have worked if your data really did have
the form you originally described.

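With binary data, you have to be careful: wc -L reports display
width, and non-printable characters are given zero width.
E.g., translate every byte that is neither printable nor
whitespace to "." before using wc -L: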

  tr -c '[:print:][:space:]' '[.*]' < your-data | wc -L
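
As a minimal sketch of why the translation matters, the pipeline above can be wrapped in a small function and tried on a tiny made-up file. The name max_line_len and the demo data are assumptions for illustration only, and this presumes a wc that supports -L (GNU coreutils does):

```shell
#!/bin/sh
# max_line_len is a hypothetical helper name, not part of coreutils.
# Assumes a wc that supports -L (GNU coreutils does).
max_line_len() {
  # LC_ALL=C makes tr and wc treat the input as single bytes.
  # tr -c maps every byte that is neither printable nor whitespace to ".",
  # so each such byte contributes one column to the width wc -L reports.
  LC_ALL=C tr -c '[:print:][:space:]' '[.*]' < "$1" | LC_ALL=C wc -L | tr -d ' '
}

# Demo on a small made-up file: "abc", then a line of six control bytes.
demo=$(mktemp)
printf 'abc\n\001\002\003\004\005\006\n' > "$demo"
LC_ALL=C wc -L < "$demo"   # raw count: control bytes have zero width
max_line_len "$demo"       # after translation: every control byte counts
rm -f "$demo"
```

Note that this is still a display-width estimate rather than an exact byte count: wc -L advances tabs to the next 8-column stop, so a line containing tabs can report a width larger than its length in bytes.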



