bug#59275: Unexpected return value of `string-collate-lessp' on Mac


From: Maxim Nikulin
Subject: bug#59275: Unexpected return value of `string-collate-lessp' on Mac
Date: Sun, 27 Nov 2022 21:00:50 +0700
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

On 26/11/2022 16:22, Eli Zaretskii wrote:
> From: Ihor Radchenko
> Date: Sat, 26 Nov 2022 08:47:13 +0000
>
> 'downcase' uses the buffer-local case table if such is defined for the
> buffer that happens to be the current when you invoke 'downcase', and that's
> another cause of inconsistency and user surprises, especially when the
> strings you compare don't really "belong" to the current buffer.

`downcase' is already used in Org for case-insensitive sorting. I am not sure whether that code predates the introduction of `string-collate-lessp'. A buffer-local case table is not a problem when table rows, list items (the text formatting objects, not elisp structures), or tags local to the current file are sorted. However, when an agenda is built from several files, the current buffer should not affect the order of entries.

Concerning Org, my point is that caseless sorting should be uniform. Currently different functions use different approaches, and that is the more severe inconsistency.
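
To make the buffer-independence point concrete, here is a minimal sketch of a caseless predicate that always downcases with the standard case table, whatever buffer happens to be current. The name `my/caseless-lessp' is hypothetical; it is not an existing Org or Emacs function, and this is not a proposal for how Org should implement sorting.

```elisp
;; Hypothetical helper: the name is illustrative only and is not part
;; of Org or Emacs.
(defun my/caseless-lessp (a b)
  "Return non-nil if string A sorts before string B, ignoring case.
Both strings are downcased under the standard case table, so a
buffer-local case table in the current buffer does not affect the
result.  The comparison itself is plain `string-lessp', that is,
by character code, not by locale collation rules."
  (with-case-table (standard-case-table)
    (string-lessp (downcase a) (downcase b))))

;; Usage: the same predicate can be passed to `sort' from any buffer.
(sort (list "beta" "Alpha" "gamma") #'my/caseless-lessp)
```

This only removes the dependency on the current buffer. It does nothing for locale-aware ordering, which is what `string-collate-lessp' is meant to provide.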

>> https://nullprogram.com/blog/2014/06/13/ that mentioned something
>> similar about caveats with composition.
>
> I don't see there anything about sorting or collation.  What did I miss?

Doesn't a composed versus decomposed representation affect the comparison result?

The emacs-devel thread mentioned earlier in this bug contains a link that describes plenty of issues with string comparison:

https://stackoverflow.com/questions/319426/how-do-i-do-a-case-insensitive-string-comparison
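
As an illustration of the composition caveat, the sketch below compares two spellings of the same visible text, one precomposed and one decomposed. It relies on `ucs-normalize-NFC-string' from the bundled ucs-normalize library. Whether `string-collate-lessp' treats the two spellings as equal depends on the libc in use, so that part is deliberately left out.

```elisp
;; Sketch only: compare a precomposed and a decomposed spelling of "é".
(require 'ucs-normalize)

(let ((composed "\u00e9")       ; é as the single code point U+00E9
      (decomposed "e\u0301"))   ; e followed by U+0301 COMBINING ACUTE ACCENT
  (list
   ;; A code-point comparison sees two different strings.
   (string= composed decomposed)
   ;; After normalizing both sides to NFC they compare equal.
   (string= (ucs-normalize-NFC-string composed)
            (ucs-normalize-NFC-string decomposed))))
```

So any comparison that works on raw code points may order or match such strings differently unless both sides are normalized first.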

> And we are talking about a single system where these problems happen, which
> is macOS, right?  Wouldn't it be better for "Someone" who uses macOS to just
> bite the bullet and write a proper collation function, or find a free
> software implementation of one, and include it in Emacs?

My impression was that clang should eventually get better locale support. If so, I have doubts about a macOS-specific implementation. I do not have a macOS machine, so my assumption about the locale implementation there may be wrong. However, Emacs may benefit from its own implementation of collation (based on the built-in Unicode character database) used on (almost) all OSes. It would allow using several locales in parallel without switching the libc locale, which is not thread-safe.

I consider `downcase' a kind of workaround (a poor man's ignore-case) that allows graceful degradation in comparison to `string-lessp'. From my point of view, e.g. the case transformation rule for Turkish I is a minor issue compared to completely disregarding the IGNORE-CASE argument, at least when results are presented to users.

My argument against `downcase' in `string-collate-lessp' is that it may add a noticeable performance penalty.

Interestingly, `compare-strings' uses upcase conversion when the IGNORE-CASE argument is true. I believed that some implementations (unrelated to Emacs) may have problems with e.g. ß, and so considered downcase the safer option.




