From: Pádraig Brady
Subject: Re: Why the memory usage of sort does not seem to increase as the input file size increases?
Date: Mon, 26 May 2014 20:53:35 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130110 Thunderbird/17.0.2
On 05/26/2014 07:45 PM, Peng Yu wrote:
>> Sort takes a divide and conquer approach,
>> by sorting parts of the input to temporary files,
>> and then merging the results with a bounded amount of memory.
>>
>> sort currently defaults to using a large memory buffer
>> to minimize overhead associated with writing and reading
>> temp files, so you may be seeing just this large memory
>> allocation each time.
>>
>> The memory allocation can be controlled with --buffer-size
>
> If I have enough memory, is it always faster to sort without using
> temp files? How can I force sort to always use memory only? Thanks.
Traditionally there were mainly two levels in the memory hierarchy,
and so it was best to use as much RAM as possible. However, given the
relative increase in performance and size of processor caches compared to RAM,
it can often (depending on the operation) be much more performant to work
with sizes that fit within a cache. However, as the following
demonstrates, sort(1) currently seems to access RAM in a cache-efficient manner,
since the smaller working-set sizes that would fit entirely within the L3 cache
on the test machine do not outperform those using larger RAM buffers.
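If you want to repeat the comparison on your own machine, the cache
sizes can be queried like this on Linux with glibc (values are in
bytes; getconf may report 0 for levels the kernel does not expose):

```shell
getconf LEVEL3_CACHE_SIZE      # e.g. 3145728 for a 3MB L3 cache
getconf -a | grep CACHE        # all cache-related values in one go
```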
Let's do a quick test.
$ shuf -i1-5000000 > file.in # generate test data
$ unset MALLOC_PERTURB_ # This has a large overhead for large buffers
First with a single thread as a baseline. Note we put the temp files
in an existing RAM disk to avoid disk latencies.
$ time TMPDIR=/dev/shm sort --parallel=1 <file.in >/dev/null # uses about 200MB
real 0m23.357s
user 0m22.670s
sys 0m0.586s
So let's run again with a size smaller than my 3MB L3 cache.
$ time TMPDIR=/dev/shm sort --parallel=1 -S2M <file.in >/dev/null # uses about 2MB
real 0m24.033s
user 0m23.808s
sys 0m0.128s
So much the same; the slight overhead is probably due to the I/O to the temp files.
For kicks let's run again, allowing it to use as much RAM as it needs,
and also as many threads as appropriate for the current system.
Note at the end of the process the RAM usage spikes to 500MB,
but we see a significant performance increase due to the extra cores.
$ time TMPDIR=/dev/shm sort <file.in >/dev/null # uses about 500MB
real 0m11.671s
user 0m35.567s
sys 0m2.793s
Now, using 500MB can have a significant impact on the system.
sort auto-sizes its memory buffer based on the current
amount of free RAM, though this is not ideal given the
length of time sort can run.
Note that if you limit the RAM used with -S, then you
also effectively limit the number of threads that will be used.
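If you do want to bound both resources explicitly, -S and --parallel
can be combined; the figures below are only illustrative:

```shell
shuf -i1-100000 > file.in
# Cap the memory buffer at 64MB and the worker threads at 4:
time sort -n --parallel=4 -S64M file.in > file.out
rm -f file.in file.out
```

-S also accepts a percentage of physical memory (e.g. -S50%), which is
handy when you want "use a lot, but leave room for everything else".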
cheers,
Pádraig.