bug-gnu-emacs

bug#59275: Unexpected return value of `string-collate-lessp' on Mac


From: Eli Zaretskii
Subject: bug#59275: Unexpected return value of `string-collate-lessp' on Mac
Date: Sat, 26 Nov 2022 10:06:42 +0200

> From: Ihor Radchenko <yantar92@posteo.net>
> Cc: 59275-done@debbugs.gnu.org
> Date: Sat, 26 Nov 2022 02:03:43 +0000
> 
> We concluded that a better fallback when collation is not available
> would be using downcase+string-lessp when `string-collate-lessp' is
> called with non-nil IGNORE-CASE argument.

This has caveats, see below.  I won't argue about your Org-local decision,
since I don't know enough about the intended uses of what you did, but I do
have something to say about this decision in general.  I suggest at least a
FIXME comment where you do this stuff, based on what I say below.

> Would it be acceptable for Emacs to change the fallback behavior of
> `string-collate-lessp' to:
> 
> 1. If string collation is not available and IGNORE-CASE is nil, fall back
>    to `string-lessp';
> 2. If string collation is not available and IGNORE-CASE is non-nil,
>    use `downcase' + `string-lessp'.

'downcase' uses the buffer-local case table if such is defined for the
buffer that happens to be current when you invoke 'downcase', and that's
another cause of inconsistency and user surprises, especially when the
strings you compare don't really "belong" to the current buffer.  Also, in
some (rarely-used) locales, downcasing has unexpected results, even with the
default case-table.  For example, downcasing "I" produces "ı", not "i" as
expected.  Did you think about these cases when making the above decision?
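
To make the case-table caveat concrete, here is a minimal sketch.  The
function names (my/collate-lessp-fallback, my/compare-in-dotless-buffer) are
hypothetical and not part of Emacs or Org; the first only approximates the
proposed fallback, and the second assumes a buffer-local case table in which
"I" downcases to "ı", built with `copy-case-table' and
`set-case-syntax-pair'.

```elisp
;;; -*- lexical-binding: t; coding: utf-8 -*-

;; Hypothetical helper: a sketch of the proposed fallback, not Emacs code.
(defun my/collate-lessp-fallback (s1 s2 &optional ignore-case)
  "Return non-nil if S1 sorts before S2 without using collation.
With non-nil IGNORE-CASE, downcase both strings first.  Note that
`downcase' consults the case table of the current buffer."
  (if ignore-case
      (string-lessp (downcase s1) (downcase s2))
    (string-lessp s1 s2)))

;; Hypothetical helper: runs the same comparison in a buffer whose
;; case table pairs "I" with dotless "ı" (an assumption made here
;; purely for illustration).
(defun my/compare-in-dotless-buffer (s1 s2)
  "Compare S1 and S2, ignoring case, under a modified case table."
  (with-temp-buffer
    (let ((table (copy-case-table (standard-case-table))))
      (set-case-syntax-pair ?I ?ı table)
      ;; Buffer-local: affects only this temporary buffer.
      (set-case-table table)
      (my/collate-lessp-fallback s1 s2 t))))

;; Under the standard case table, "I" downcases to "i" (code 105),
;; which sorts before "j" (code 106):
(my/collate-lessp-fallback "I" "j" t)   ; => t

;; With the modified table, "I" downcases to "ı" (U+0131), which
;; sorts after every ASCII letter:
(my/compare-in-dotless-buffer "I" "j")  ; => nil
```

The same two strings compare differently depending only on which buffer
happens to be current, which is the inconsistency described above.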

> I also do not think that it will be backwards-incompatible. If the call
> to `string-collate-lessp' explicitly requests ignoring case, `downcase'
> is more expected than bare `string-lessp' that _does not_ ignore case.
> 
> WDYT?

See above.  What you suggest is perhaps fine for plain-ASCII text, but not
in general, IMNSHO.

The reason for what Emacs currently does on systems that lack collation
functions is that for such systems collation rules are indeterminate, and so
inventing them by following naïve rules of plain ASCII, in particular the
case-conversion rules, is potentially very wrong.  These are general-purpose
APIs, not something concrete in specific Org contexts, and as such, these
APIs cannot "mostly work", they should work always and for every possible
use case.

And we are talking about a single system where these problems happen, which
is macOS, right?  Wouldn't it be better for "Someone" who uses macOS to just
bite the bullet and write a proper collation function, or find a free
software implementation of one, and include it in Emacs?  This is what I did
for MS-Windows at the time string-collate-lessp was added to Emacs.  Why
cannot macOS users do the same?
