bug-coreutils

Re: sort: memory exhausted with 50GB file


From: Paul Eggert
Subject: Re: sort: memory exhausted with 50GB file
Date: Fri, 25 Jan 2008 16:20:53 -0800
User-agent: Gnus/5.11 (Gnus v5.11) Emacs/22.1 (gnu/linux)

Leo Butler <address@hidden> writes:

> I don't know if this is relevant, but I have extracted the 2nd through 1000th
> character in the 50GB file, and there appears to be garbage (unprintable chars)
> in the first line. The remainder of the extract looks fine. Moreover, I split
> the file into 500MB chunks, sorted these and then merge sorted the pairs. It
> appears that the 500MB chunks produced by split have been stripped of '\n' and
> are garbage, as are the sorted files.

Hmm, it sounds like your input data has some very long lines.
That would explain at least part of your problem.  'sort' needs
to keep at least two lines in main memory to compare them: if single
input lines are many gigabytes long, then 'sort' must consume many
gigabytes of memory, regardless of what size you specify with '-S'.
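One way to check whether very long lines are the culprit is to measure the
longest line without ever holding a whole line in memory.  The following is
only a minimal sketch in Python, not anything 'sort' itself does; the function
name longest_line_length and the 1 MiB chunk size are assumptions chosen for
illustration.

```python
def longest_line_length(path, chunk_size=1 << 20):
    """Return the length in bytes of the longest line in the file at `path`.

    The trailing newline is not counted.  The file is read in fixed-size
    chunks, so memory use stays near `chunk_size` even if a single line
    is many gigabytes long.  (Hypothetical helper, for illustration only.)
    """
    longest = 0
    current = 0  # bytes seen since the most recent newline
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            start = 0
            while True:
                nl = chunk.find(b"\n", start)
                if nl == -1:
                    # No more newlines in this chunk: the line continues.
                    current += len(chunk) - start
                    break
                current += nl - start
                longest = max(longest, current)
                current = 0
                start = nl + 1
    # Account for a final line that has no terminating newline.
    return max(longest, current)
```

Calling longest_line_length on the 50GB file (for example,
print(longest_line_length("bigfile.txt")), where the file name is a
placeholder) gives a lower bound on how much memory 'sort' needs per line:
if the result is in the gigabytes, no '-S' setting will help.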



