bug#69968: Case-folding of Mathematical Alphanumeric Symbols
From: Juri Linkov
Subject: bug#69968: Case-folding of Mathematical Alphanumeric Symbols
Date: Sun, 24 Mar 2024 19:09:10 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/30.0.50 (x86_64-pc-linux-gnu)
>> I wonder why case-folding is not supported for letters from
>> the Unicode block "Mathematical Alphanumeric Symbols":
>> https://en.wikipedia.org/wiki/Mathematical_Alphanumeric_Symbols
>
> These are not letters, they are symbols. And letter-case is not
> defined for symbols.
𝘚𝘰 𝘺𝘰𝘶 𝘳𝘦𝘢𝘭𝘭𝘺 𝘵𝘩𝘪𝘯𝘬 𝘵𝘩𝘪𝘴 𝘵𝘦𝘹𝘵 𝘪𝘴 𝘯𝘰𝘵 𝘸𝘳𝘪𝘵𝘵𝘦𝘯 𝘸𝘪𝘵𝘩 𝙡𝙚𝙩𝙩𝙚𝙧𝙨?
>> Is it because the Unicode standard doesn't provide information
>> about their case-folding? And indeed they are missing from
>> https://unicode.org/Public/UNIDATA/CaseFolding.txt
>
> Unicode doesn't consider them letters.
OK, if Unicode doesn't consider them letters,
let's stick to the Unicode standard.
>> But OTOH, I can't find the file CaseFolding.txt in admin/unidata.
>> This means Emacs doesn't use this file?
>
> We don't. We use the case-conversion information in UnicodeData.txt,
> as it tells us everything we need to know.
Thanks, I didn't remember that case-conversion is in UnicodeData.txt.
I checked admin/unidata/UnicodeData.txt and indeed there is
no case-conversion for Mathematical Alphanumeric Symbols.
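The same check can be reproduced outside Emacs. This is only a sketch, assuming Python's built-in Unicode tables, which are generated from the same Unicode Character Database; the helper name `has_case_mapping` is made up for illustration, and Python's bundled Unicode version may differ from the one in admin/unidata.

```python
import unicodedata


def has_case_mapping(ch: str) -> bool:
    """Return True if Python maps CH to a different upper- or lowercase form."""
    return ch.lower() != ch or ch.upper() != ch


# Compare an ordinary letter with two Mathematical Alphanumeric Symbols.
for ch in ("A", "\U0001D400", "\U0001D41A"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"case mapping = {has_case_mapping(ch)}")
```

For the two mathematical characters, the case-changing methods return the input unchanged, which matches the missing case-conversion fields in UnicodeData.txt.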
>> Then should we add more case-folding information explicitly
>> for this Unicode block?
>
> What is the rationale for doing so? It's against Unicode, so we need
> to have a good reason, as this will have to be maintained by hand, and
> also because some users might be surprised.
I don't think users would be surprised: when they don't need
to change case, they simply don't use case-changing functions.
But when they expect the case to change, they will indeed be
surprised that it doesn't.
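To give an idea of what "maintained by hand" would involve, here is a rough sketch in Python rather than Emacs Lisp. It covers only the MATHEMATICAL BOLD letters (capitals at U+1D400..U+1D419, small letters at U+1D41A..U+1D433). The names `build_bold_fold_table` and `fold_math_bold` are hypothetical, and this is not how Emacs builds its case tables. Other alphabets in the block have gaps, so each would need its own checked range.

```python
# Hand-written case-folding table for one alphabet of the block.
# Assumption: only MATHEMATICAL BOLD A-Z is covered; nothing here
# comes from UnicodeData.txt, which defines no such mapping.
BOLD_CAPITAL_A = 0x1D400
BOLD_SMALL_A = 0x1D41A


def build_bold_fold_table() -> dict[int, int]:
    """Map each bold capital code point to its bold small counterpart."""
    return {BOLD_CAPITAL_A + i: BOLD_SMALL_A + i for i in range(26)}


FOLD_TABLE = build_bold_fold_table()


def fold_math_bold(text: str) -> str:
    """Fold bold mathematical capitals to small letters; leave the rest alone."""
    return text.translate(FOLD_TABLE)


print(fold_math_bold("\U0001D400\U0001D401\U0001D402 abc"))
```

Every such range would have to be rechecked by hand whenever Unicode adds characters to the block.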
>> Case-folding is already supported for some characters from other
>> Unicode blocks, e.g. FULLWIDTH LATIN CAPITAL LETTERs,
>> CIRCLED LATIN CAPITAL LETTERs, etc.
>
> That's because UnicodeData.txt defines their letter-case conversions.
OK, then it's very strange that the Unicode standard doesn't define
letter-case conversions for other letters. But what can we do?
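The difference between these blocks is easy to see with a few sample characters. Again this is only a sketch that relies on Python's Unicode tables instead of Emacs; the helper `lower_changes` and the sample selection are made up for illustration.

```python
import unicodedata

# One capital "A" from each of the blocks mentioned in the discussion.
SAMPLES = {
    "fullwidth": "\uFF21",      # FULLWIDTH LATIN CAPITAL LETTER A
    "circled": "\u24B6",        # CIRCLED LATIN CAPITAL LETTER A
    "math bold": "\U0001D400",  # MATHEMATICAL BOLD CAPITAL A
}


def lower_changes(ch: str) -> bool:
    """Return True if lowercasing CH yields a different character."""
    return ch.lower() != ch


for label, ch in SAMPLES.items():
    print(f"{label:10} U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"lowercase differs = {lower_changes(ch)}")
```

The fullwidth and circled capitals have lowercase counterparts defined in UnicodeData.txt, while the mathematical one does not.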
>> But e.g. PARENTHESIZED LATIN CAPITAL LETTERs are missing too.
>> What is worse is that in Emacs ⒜ doesn't have even a word syntax
>> like its counterpart 🄐.
>
> I think the fact that 🄐 has the word syntax might be a mistake. These
> are both symbols, so why would we want them to have the word syntax?
Because they look like letters with diacritics.