[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: how to speed up sort for partially sorted input?
From: |
Mingye Wang |
Subject: |
Re: how to speed up sort for partially sorted input? |
Date: |
Wed, 25 Aug 2021 11:27:32 +0800 |
Hi Peng,
(This following message was mistakenly posted as a "reply", instead of
a "reply all". Apologies for the duplication.)
The current sort algorithm used by coreutils sort(1) is a mergesort,
which unfortunately does not pick up cues on partially-sorted input. A
relatively well-known algorithm that picks up partial sort order is
timsort -- the larger GNU project does make use of it in the form of
GNU Octave, but I doubt any performance benefit will come from running
an Octave script instead of sort. Checking the list of languages
making use of timsort[1] might lead you to a good choice though --
people reimplement Unix utilities in other languages a lot.
[1]: https://en.wikipedia.org/wiki/Timsort
It might be interesting to propose a change to timsort in coreutils,
but I am not sure what the trade-offs are exactly compared to the
current Knuth merge-sort.
Cheers,
Mingye Wang (Artoria2e5)