Re: Sorting on compound keys?
From: Tim Landscheidt
Subject: Re: Sorting on compound keys?
Date: Fri, 10 Jun 2011 00:27:37 +0000
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)
Mark Tilford <ralphmerridew@gmail.com> wrote:
>> sometimes I want to sort unified diffs of CSV files (separated
>> by tabs (here: \t)):
>> | +A 1\t1\tx
>> | +A 1\t2\ty
>> | +B 2\t3\tz
>> | -A 1\t1\tx
>> | -B 2\t2\ty
>> | -B 2\t3\tz
>> by the second column, then the first column, then "+" vs.
>> "-". Unfortunately, it seems that sort-regexp-fields doesn't
>> allow more than one match field as a key. sort-fields
>> doesn't work either as it requires the fields to be surrounded
>> by white space (no "+" vs. "-") and doesn't allow
>> white space inside the fields.
>> Is there any function in vanilla Emacs (23.1.1) that I
>> missed? I looked at pimping sort-regexp-fields, but it seems
>> to me that sort-subr would have to be rewritten from scratch
>> to achieve sorting on compound keys.
> Is there an option to do a stable sort, such as mergesort?
Eureka! Of course! All Emacs sort functions are stable, so
99% of my use cases can be dealt with by multiple calls to
sort-regexp-fields, sorting on the least significant key first
(the only exception being numeric sorting and the like).
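To illustrate the multi-pass idea on the sample data from the
original question, here is a minimal sketch. The function name
my-demo-multipass-sort and the record regexp are my own assumptions
for this example: group 1 is the "+"/"-" sign, group 2 the first
column and group 3 the second column.

```elisp
;; Sketch only: one stable sort pass per key, least significant key first.
;; `my-demo-multipass-sort' is a hypothetical name used for illustration.
(defun my-demo-multipass-sort ()
  "Return the sample lines sorted by column 2, then column 1, then sign."
  (with-temp-buffer
    (insert "+A 1\t1\tx\n"
            "+A 1\t2\ty\n"
            "+B 2\t3\tz\n"
            "-A 1\t1\tx\n"
            "-B 2\t2\ty\n"
            "-B 2\t3\tz\n")
    ;; Assumed record layout: \1 = sign, \2 = first column, \3 = second column.
    (let ((record "^\\([-+]\\)\\([^\t\n]*\\)\t\\([^\t\n]*\\)\t.*$"))
      ;; Sort by sign, then by column 1, then by column 2.  Because each
      ;; pass is stable, the last pass ends up as the primary key.
      (dolist (key '("\\1" "\\2" "\\3"))
        (sort-regexp-fields nil record key (point-min) (point-max))))
    (buffer-string)))
```

Evaluating (my-demo-multipass-sort) should give the lines in the
order "+A 1 / 1", "-A 1 / 1", "+A 1 / 2", "-B 2 / 2", "+B 2 / 3",
"-B 2 / 3", i.e. sorted by the second column, with ties broken by
the first column and then by "+" before "-".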
Unfortunately, those multiple calls can be tedious when
done interactively, so voilà:
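| (defun tl-sort-regexp-fields (reverse record-regexp key-regexp beg end)
|   "Like `sort-regexp-fields', but KEY-REGEXP may list several fields.
| A compound KEY-REGEXP looks like \"\\2-\\3\\1\"; a \"-\" reverses that field."
|   (interactive "P\nsRegexp specifying records to sort: \nsRegexp specifying key within record: \nr")
|   (if (string-match "\\`\\(?:-\\\\[1-9]\\|\\(?:-?\\\\[1-9]\\)\\{2,\\}\\)\\'"
|                     key-regexp)
|       ;; Compound key: walk the fields from right to left, so the least
|       ;; significant field is sorted first and the leftmost field last.
|       (let ((i (length key-regexp)))
|         (while (> i 0)
|           (let ((key-reverse (and (> i 2) (= (aref key-regexp (- i 3)) ?-)))
|                 (key (substring key-regexp (- i 2) i)))
|             ;; Effective direction is REVERSE xor KEY-REVERSE.
|             (sort-regexp-fields (if reverse (not key-reverse) key-reverse)
|                                 record-regexp key beg end)
|             ;; Skip the "-" prefix, if any, then the two-character field.
|             (if key-reverse
|                 (setq i (- i 1)))
|             (setq i (- i 2)))))
|     ;; Anything else is passed through unchanged.
|     (sort-regexp-fields reverse record-regexp key-regexp beg end)))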
A key-regexp of "\2\3\1" will yield the region sorted by the
second field, then the third, then the first. A field can be
prefixed with "-" to reverse the sort order for that field,
e.g. "\2-\3\1" will sort by the second field in ascending
order, then the third in descending order, then the first in
ascending order.
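To make the order of the passes explicit, the following sketch
splits such a key specification into the individual sorts it stands
for. It mirrors the right-to-left loop in the function above but is
not part of it; the helper name my-demo-parse-compound-key is
hypothetical.

```elisp
;; Sketch only: list the passes a compound key spec expands to, in the
;; order they would be run (rightmost, least significant field first).
;; `my-demo-parse-compound-key' is a hypothetical helper for illustration.
(defun my-demo-parse-compound-key (spec)
  "Return SPEC's sort passes as (KEY . DESCENDING) pairs, in run order."
  (let ((i (length spec))
        passes)
    (while (> i 0)
      ;; A field is descending if it is preceded by a "-".
      (let ((descending (and (> i 2) (= (aref spec (- i 3)) ?-))))
        (push (cons (substring spec (- i 2) i) descending) passes)
        ;; Step over the field and its optional "-" prefix.
        (setq i (- i (if descending 3 2)))))
    (nreverse passes)))

;; (my-demo-parse-compound-key "\\2-\\3\\1")
;; => (("\\1") ("\\3" . t) ("\\2"))
```

So for "\2-\3\1" the region is first sorted by the first field
ascending, then by the third field descending, and finally by the
second field ascending, which makes the second field the primary
key.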
With regard to performance, the region is sorted once for
every key, so it may not be suitable for larger datasets,
but up to a few thousand lines it's fast enough for me. If
someone wants to integrate this into Emacs, please go ahead.
Thanks, also to Andreas,
Tim
P. S.: Is there really no xor in elisp?