bug-gnu-emacs

bug#69968: Case-folding of Mathematical Alphanumeric Symbols


From: Juri Linkov
Subject: bug#69968: Case-folding of Mathematical Alphanumeric Symbols
Date: Sun, 24 Mar 2024 19:09:10 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/30.0.50 (x86_64-pc-linux-gnu)

>> I wonder why case-folding is not supported for letters from
>> the Unicode block "Mathematical Alphanumeric Symbols":
>> https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols
>
> These are not letters, they are symbols.  And letter-case is not
> defined for symbols.

π˜‹π˜° 𝘺𝘰𝘢 𝘳𝘦𝘒𝘭𝘭𝘺 𝘡𝘩π˜ͺ𝘯𝘬 𝘡𝘩π˜ͺ𝘴 𝘡𝘦𝘹𝘡 π˜ͺ𝘴 𝘯𝘰𝘡 𝘸𝘳π˜ͺ𝘡𝘡𝘦𝘯 𝘸π˜ͺ𝘡𝘩 π™‘π™šπ™©π™©π™šπ™§π™¨?

>> Is it because the Unicode standard doesn't provide information
>> about their case-folding?  And indeed they are missing from
>> https://unicode.org/Public/UNIDATA/CaseFolding.txt
>
> Unicode doesn't consider them letters.

OK, if Unicode doesn't consider them letters,
let's stick to the Unicode standard.

>> But OTOH, I can't find the file CaseFolding.txt in admin/unidata.
>> This means Emacs doesn't use this file?
>
> We don't.  We use the case-conversion information in UnicodeData.txt,
> as it tells us everything we need to know.

Thanks, I didn't remember that case-conversion is in UnicodeData.txt.
I checked admin/unidata/UnicodeData.txt and indeed there is
no case-conversion for Mathematical Alphanumeric Symbols.
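
The same check can be sketched outside Emacs.  This is only an
illustration, assuming Python's bundled `unicodedata` database agrees
with admin/unidata/UnicodeData.txt on this point; the variable names
are mine, not anything from Emacs.

```python
import unicodedata

# Assumption: Python's Unicode database stands in for Emacs's copy
# of UnicodeData.txt; the two may be at different Unicode versions.
cap = "\U0001D608"    # MATHEMATICAL SANS-SERIF ITALIC CAPITAL A
small = "\U0001D622"  # MATHEMATICAL SANS-SERIF ITALIC SMALL A

for ch in (cap, small):
    # Print code point, character name, and general category.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

# With no case mapping defined, lower() and upper() return the input.
no_mapping = cap.lower() == cap and small.upper() == small
print("no case mapping:", no_mapping)
```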

>> Then should we add more case-folding information explicitly
>> for this Unicode block?
>
> What is the rationale for doing so?  It's against Unicode, so we need
> to have a good reason, as this will have to be maintained by hand, and
> also because some users might be surprised.

I don't think users would be surprised: those who don't need to
change case simply don't use case-changing functions.  But users
who expect the case to change will indeed be surprised when it
doesn't.

>> Case-folding is already supported for some characters from other
>> Unicode blocks, e.g. FULLWIDTH LATIN CAPITAL LETTERs,
>> CIRCLED LATIN CAPITAL LETTERs, etc.
>
> That's because UnicodeData.txt defines their letter-case conversions.

Ok, then it's very strange that the Unicode standard doesn't define
letter-case conversions for other letters.  But what can we do.
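
To make the difference concrete, here is a small sketch comparing
the three blocks.  It again assumes that Python's `unicodedata`
reflects the same case mappings as UnicodeData.txt; the `samples`
table is just an illustrative choice of one character per block.

```python
import unicodedata

# One capital "A" from each block discussed above (illustrative picks).
samples = {
    "fullwidth": "\uFF21",         # FULLWIDTH LATIN CAPITAL LETTER A
    "circled": "\u24B6",           # CIRCLED LATIN CAPITAL LETTER A
    "mathematical": "\U0001D400",  # MATHEMATICAL BOLD CAPITAL A
}

# A character has a lowercase mapping if lower() changes it.
has_lowercase = {label: ch.lower() != ch for label, ch in samples.items()}

for label, ch in samples.items():
    print(f"{label:13} U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"lowercase mapping = {has_lowercase[label]}")
```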

>> But e.g. PARENTHESIZED LATIN CAPITAL LETTERs are missing too.
>> What is worse is that in Emacs ⒜ doesn't even have word syntax
>> like its counterpart 🄐.
>
> I think the fact that 🄐 has the word syntax might be a mistake.  These
> are both symbols, so why would we want them to have the word syntax?

Because they look like letters with diacritics.




