sort/merge exponential slow down
From: paul . matthews
Subject: sort/merge exponential slow down
Date: Wed, 21 Nov 2001 11:48:03 +1100
G'Day,
It appears that 'sort -m' slows down exponentially as the number of files
passed to it increases. This also appears to be why 'sort' is slow when
dealing with files that exceed the internal buffer size; that is, it is slow
during the "merging the 1Mb files back together" part of the run.
files   time
 1      0min 08sec
 2      0min 59sec
 3      2min 36sec
 4      4min 58sec
 5      8min 37sec
 :
12      1hr 14min
[All files are the same size, each containing 1 million records.]
Not having read the 'sort' code (and as such speculating), I think I
might know what the problem is.
Each time around the driving loop of the merge, it evaluates each input
file's current record for its fitness to be the next output record. The
fittest record is output, and that file is advanced. The problem is that the
next time around, all current records are evaluated again, even though there
is only one new record.
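For illustration, a minimal C sketch of the kind of merge loop I mean. This
is purely hypothetical (names such as MAXFILES, refill and merge are made up,
and the real sort code may well be structured differently); it only shows why
the cost grows with the number of files: merging n total records from k files
this way takes on the order of n * k comparisons.

/* Hypothetical sketch of a linear-scan k-way merge; not the actual
   sort(1) code.  Every iteration re-examines all k current records.  */
#include <stdio.h>
#include <string.h>

#define MAXFILES 16
#define MAXLINE  1024

static char current[MAXFILES][MAXLINE]; /* current record of each input */
static int  live[MAXFILES];             /* 1 while the file still has data */

static int refill (FILE *fp, int i)
{
  live[i] = fgets (current[i], MAXLINE, fp) != NULL;
  return live[i];
}

void merge (FILE *in[], int nfiles, FILE *out)
{
  for (int i = 0; i < nfiles; i++)
    refill (in[i], i);

  for (;;)
    {
      int best = -1;
      /* Linear scan: every live record is compared again each time,
         even though only one of them changed since the last pass.  */
      for (int i = 0; i < nfiles; i++)
        if (live[i] && (best < 0 || strcmp (current[i], current[best]) < 0))
          best = i;
      if (best < 0)
        break;                      /* all inputs exhausted */
      fputs (current[best], out);
      refill (in[best], best);      /* only this file advances */
    }
}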
The solution would be to remember the fitness of each record in sorted order,
and to insert each new record into its position, ensuring a minimum number of
comparison tests are done (an insertion sort?).
Hopefully, I will be able to provide an example of this in the next few days.
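In the meantime, a rough sketch of what that fix might look like, again
purely hypothetical and not the actual sort code: keep the file indices in an
array sorted by their current record, pop the smallest, and re-insert only
the one file whose record changed. Each output record then costs at most k
data moves (a binary heap would bring that to about log k) instead of a full
comparison scan of every input.

/* Hypothetical sketch of the proposed fix; not the actual sort(1) code. */
#include <stdio.h>
#include <string.h>

#define MAXFILES 16
#define MAXLINE  1024

static char current[MAXFILES][MAXLINE];
static int  order[MAXFILES];     /* file indices, sorted by current record */
static int  nlive;               /* how many files still have data */

/* Insert file index f at its sorted position (one insertion-sort step). */
static void insert_sorted (int f)
{
  int i = nlive;
  while (i > 0 && strcmp (current[f], current[order[i - 1]]) < 0)
    {
      order[i] = order[i - 1];
      i--;
    }
  order[i] = f;
  nlive++;
}

void merge (FILE *in[], int nfiles, FILE *out)
{
  nlive = 0;
  for (int f = 0; f < nfiles; f++)
    if (fgets (current[f], MAXLINE, in[f]))
      insert_sorted (f);

  while (nlive > 0)
    {
      int f = order[0];                 /* smallest current record */
      fputs (current[f], out);
      memmove (order, order + 1, --nlive * sizeof order[0]);
      if (fgets (current[f], MAXLINE, in[f]))
        insert_sorted (f);              /* only the changed record is re-placed */
    }
}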
[If responding to this email send copies to address@hidden]
--
Paul Matthews
ETI Migrations
x446169