coreutils

Re: how to speed up sort for partially sorted input?


From: Kaz Kylheku (Coreutils)
Subject: Re: how to speed up sort for partially sorted input?
Date: Tue, 10 Aug 2021 22:06:47 -0700
User-agent: Roundcube Webmail/0.9.2

On 2021-08-07 17:46, Peng Yu wrote:
> Hi,
>
> Suppose that I want to sort an input by column 1 and column 2 (column
> 1 is of a higher priority than column 2). The input is already sorted
> by column1.
>
> Is there a way to speed up the sort (compared with not knowing column
> 1 is already sorted)? Thanks.

Since you know that column 1 is sorted, a sequential scan of the data
will reveal chunks of adjacent records that share the same column 1 value.

You just have to read and separate these chunks, and sort each one
individually by column 2.

GNU Awk has the wherewithal for this sort of thing; it has some facilities
for sorting associative arrays.

You can scan records and aggregate them while column 1 stays the same;
when it changes, sort the accumulated chunk by column 2 and output it.
Do the same once more at the end of the file to flush the last chunk.
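
The chunk-and-sort idea is not tied to Awk. As a rough sketch only, here it
is in Python rather than GNU Awk. The function name sort_within_groups and
its sep parameter are made up for illustration. It assumes every line has at
least two fields, that the input really is ordered by column 1, and that
plain string comparison on column 2 is acceptable (which is not the same as
the locale-aware collation that sort uses).

```python
from itertools import groupby


def sort_within_groups(lines, sep=None):
    """Yield lines sorted by column 2 within each run of equal column 1.

    Hypothetical helper: assumes the input is already ordered by column 1,
    so records with the same column 1 value are adjacent.
    sep=None splits on runs of whitespace, like str.split().
    """
    def field(line, index):
        return line.split(sep)[index]

    # groupby only merges *adjacent* equal keys, which is exactly the
    # "sequential scan reveals chunks" property of pre-sorted input.
    for _, chunk in groupby(lines, key=lambda line: field(line, 0)):
        # Only one chunk is held in memory at a time; sorted() is stable,
        # so ties on column 2 keep their input order.
        yield from sorted(chunk, key=lambda line: field(line, 1))
```

To use it as a filter, something like
sys.stdout.writelines(sort_within_groups(sys.stdin)) should do. Each chunk is
sorted on its own, so the cost depends on the chunk sizes rather than on the
size of the whole input.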

Good luck!



