emacs-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: sorting in C


From: Andrew Cohen
Subject: Re: sorting in C
Date: Fri, 04 Mar 2022 08:13:33 +0800
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)

So I think I am nearing the end of the sorting saga. Porting the cpython
code for TIMSORT was easier than I expected, and with patient help from
Eli and especially Mattias E. (who explained to me exactly how to manage
the dynamic storage allocation) I have what should be close to the
finished form.

Here are some (final?) benchmarks. The three columns are the time (in
microseconds) for sorting lists of length 10K using the current list
sort, the current vector sort (by first converting the list to a vector)
and the new timsort. (Note that there is very little overhead for the
conversion so these just reflect the sorting algorithms and their
implementation). I have used 10 different lists that are often used for
comparing sorting algorithms: random, reverse sorted, sorted, sorted
with 3 random binary swaps, sorted with 10 random elements at the end,
sorted with a random 1% of the elements replaced with random numbers,
constant, evil (that is (logxor i 1), which swaps every pair), sorted
blocks of 100, sorted blocks of 10. The final column is the percentage
improvement over the current list sort.

|                                     | oldlist | oldvec |  tim |       |
|-------------------------------------+---------+--------+------+------|
| (make-random-list 10000)            |    2842 |   2146 | 2233 |  27% |
| (nreverse (make-sorted-list 10000)) |    1431 |   1004 |  174 | 722% |
| (make-sorted-list 10000)            |    1333 |    924 |  170 | 684% |
| (make-swapped-list 10000 3)         |    1512 |   1055 |  179 | 745% |
| (make-plus-list 10000)              |    1346 |    915 |  174 | 674% |
| (make-onepercent-list 10000)        |    1792 |   1308 |  274 | 554% |
| (make-constant-list 10000)          |    1328 |    917 |  169 | 686% |
| (make-evil-list 10000)              |    1398 |    969 |  609 | 130% |
| (make-block-list 10000 100)         |    2244 |   1651 | 1304 |  72% |
| (make-block-list 10000 10)          |    2641 |   1990 | 2034 |  30% |

As you can see the worst case is purely random for which the new
algorithm is still more than 25% faster than the current one (albeit 4%
slower than for vectors in this case), and short blocks is a close
second. Everything else is clearly much faster, with almost everything
else being factors of 6 to 8 times faster.

(I have been running with timsort as a full replacement for sorting for
a few days now, and while I can't say it has transformed my life, it is
certainly a noticeable improvement in some circumstances).

I'll post again in awhile with final questions and some suggestions for
pushing it to git. Please let me know if I should post the code here
(its about 1K lines including lots of comments.)

-- 
Andrew Cohen




reply via email to

[Prev in Thread] Current Thread [Next in Thread]