bug-gnu-emacs

bug#59275: Unexpected return value of `string-collate-lessp' on Mac


From: Ihor Radchenko
Subject: bug#59275: Unexpected return value of `string-collate-lessp' on Mac
Date: Sat, 26 Nov 2022 08:47:13 +0000

Eli Zaretskii <eliz@gnu.org> writes:

>> We concluded that a better fallback when collation is not available
>> would be using downcase+string-lessp when `string-collate-lessp' is
>> called with non-nil IGNORE-CASE argument.
>
> This has caveats, see below.  I won't argue about your Org-local decision,
> since I don't know enough about the intended uses of what you did, but I do
> have something to say about this decision in general.  I suggest at least a
> FIXME comment where you do this stuff, based on what I tell below.

Thanks for the information!
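
To make the fallback concrete, here is a minimal sketch of what I mean by
downcase+string-lessp. The name `my/collate-lessp-fallback' is hypothetical
and this is not the actual Org code; it only illustrates the idea, with the
FIXME comment you suggest.

```elisp
;; Hypothetical helper for illustration only; not part of Emacs or Org.
(defun my/collate-lessp-fallback (s1 s2 &optional ignore-case)
  "Return non-nil if S1 sorts before S2 when collation is unavailable.
With non-nil IGNORE-CASE, compare the downcased strings instead."
  (if ignore-case
      ;; FIXME: `downcase' uses the current buffer's case table and may
      ;; give surprising results in some locales, so this is only an
      ;; approximation of case-insensitive collation.
      (string-lessp (downcase s1) (downcase s2))
    (string-lessp s1 s2)))
```

With the default case table, (my/collate-lessp-fallback "B" "a") is non-nil
because plain `string-lessp' compares character codes, while
(my/collate-lessp-fallback "B" "a" t) is nil.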

>> Would it be acceptable for Emacs to change the fallback behavior of
>> `string-collate-lessp' to:
>> 
>> 1. If string collation is not available and IGNORE-CASE is nil, fall back
>>    to `string-lessp';
>> 2. If string collation is not available and IGNORE-CASE is non-nil,
>>    use `downcase' + `string-lessp'.
>
> 'downcase' uses the buffer-local case table if such is defined for the
> buffer that happens to be the current when you invoke 'downcase', and that's
> another cause of inconsistency and user surprises, especially when the
> strings you compare don't really "belong" to the current buffer.

Interesting. Is there any reason why this is not mentioned in the
docstring for `downcase'?

I now see section 4.10, "The Case Table", of the manual, and it looks like
case tables are mostly set automatically (by Emacs?) according to
the language environment. Are the details of this process documented
anywhere? Are these case conversion tables independent of glibc?

> Also, in
> some (rarely-used) locales, downcasing has unexpected results, even with the
> default case-table.  For example, downcasing "I" produces "ı", not "i" as
> expected.  Did you think about these cases when making the above decision?

I did not. However, I recall reading somewhere that it is possible to work
around this kind of issue by calling case conversion several times:
upcase -> downcase -> upcase -> downcase.

Now that you have reminded me about this caveat, I also recall
https://nullprogram.com/blog/2014/06/13/, which mentioned something
similar about caveats with composition. Just mentioning it for your
reference. (I am not sure whether the caveats discussed there have been
raised on emacs-devel.)

>> I also do not think that it will be backwards-incompatible. If the call
>> to `string-collate-lessp' explicitly requests ignoring case, `downcase'
>> is more expected than bare `string-lessp' that _does not_ ignore case.
>> 
>> WDYT?
>
> See above.  What you suggest is perhaps fine for plain-ASCII text, but not
> in general, IMNSHO.
>
> The reason for what Emacs currently does on systems that lack collation
> functions is that for such systems collation rules are indeterminate, and so
> inventing them by following naïve rules of plain ASCII, in particular the
> case-conversion rules, is potentially very wrong.  These are general-purpose
> APIs, not something concrete in specific Org contexts, and as such, these
> APIs cannot "mostly work", they should work always and for every possible
> use case.

I feel that I am missing something. Doesn't Emacs provide Unicode case
conversion tables? Why plain ASCII rules?

> And we are talking about a single system where these problems happen, which
> is macOS, right?  Wouldn't it be better for "Someone" who uses macOS to just
> bite the bullet and write a proper collation function, or find a free
> software implementation of one, and include it in Emacs?  This is what I did
> for MS-Windows at the time string-collate-lessp was added to Emacs.  Why
> cannot macOS users do the same?

It would be. But how can we ask for this? etc/TODO? Or maybe re-open
this bug report?

-- 
Ihor Radchenko // yantar92,
Org mode contributor,
Learn more about Org mode at <https://orgmode.org/>.
Support Org development at <https://liberapay.com/org-mode>,
or support my work at <https://liberapay.com/yantar92>




