help-gnu-emacs

Re: manipulating (capitalize, lower case) unicode bold and italic characters


From: Dan Hitt
Subject: Re: manipulating (capitalize, lower case) unicode bold and italic characters
Date: Mon, 8 Jul 2019 15:32:27 -0700

On Mon, Jul 8, 2019 at 12:30 PM Eli Zaretskii <eliz@gnu.org> wrote:

> > From: Dan Hitt <dan.hitt@gmail.com>
> > Date: Mon, 8 Jul 2019 11:50:07 -0700
> > Cc: help-gnu-emacs@gnu.org
> >
> > If you enter a mathematical italic small w (0x1D464) and a mathematical
> > italic capital w (0x1D44A) and do 'describe-char' for each, the small one
> > has the Lowercase general category, while the capital has the Uppercase
> > general category.  I did not know about the concept of 'case pair' in
> > unicode, so i guess it is possible that even though emacs knows one is
> > Lowercase and one is Uppercase, it is possible that it does not know that
> > they are in a pair.
>
> They are not a letter-case pair because Unicode doesn't say they are.
> The fact that a character has general category lowercase doesn't yet
> imply that it has a defined upper-case variant, these are two separate
> attributes.  Emacs defines its case pairs according to what it finds
> in the Unicode Character Database.
>
> Of course, you can teach Emacs about the letter-case pairs yourself,
> like this:
>
>   (let ((tbl (standard-case-table)))
>     (set-downcase-syntax ?𝑊 ?𝑤 tbl)
>     (set-upcase-syntax ?𝑤 ?𝑊 tbl))
>

It looks like the set-upcase-syntax function takes its arguments in the same
order as set-downcase-syntax (uppercase character first, then lowercase), so
it would be
   (set-upcase-syntax ?𝑊 ?𝑤 tbl)
but otherwise this works perfectly, so thanks very much.
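
For the record, here is a minimal sketch of the same idea in one call.  It
assumes set-case-syntax-pair (from case-table.el), which takes the
uppercase character first and fills in both directions, so the argument
order only has to be right once.  The code points are the two from this
thread; i have not tried it beyond this one pair.

```elisp
;; Sketch: register MATHEMATICAL ITALIC CAPITAL W (U+1D44A) and
;; MATHEMATICAL ITALIC SMALL W (U+1D464) as a case pair in the
;; standard case table.  Argument order is (UC LC TABLE).
;; Side effect: set-case-syntax-pair also gives both characters
;; word syntax in the standard syntax table.
(let ((tbl (standard-case-table)))
  (set-case-syntax-pair #x1D44A #x1D464 tbl))

;; Quick check: upcase the small w, downcase the capital W.
(list (upcase #x1D464) (downcase #x1D44A))
```

Since this modifies the standard case table in place, it affects every
buffer that uses that table.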


>
> > (How would i find out, from emacs?)
>
> By looking at the table returned by standard-case-table, for example:
>
>   (aref (standard-case-table) ?A)
>     => 97
>
> but
>
>   (aref (standard-case-table) ?𝑊)
>     => nil
>
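
To look at both sides at once (what the Unicode Character Database says and
what the case table holds), something like the following sketch might
help.  The helper name my/case-info is made up for illustration; it assumes
get-char-code-property with the 'general-category, 'uppercase and
'lowercase properties, and case-table-get-table for the upcase half of the
table.

```elisp
;; Hypothetical helper (not part of Emacs): collect what Emacs knows
;; about the case of character CH.
(defun my/case-info (ch)
  "Return a plist describing the case data Emacs has for CH."
  (let ((tbl (standard-case-table)))
    (list
     ;; Properties taken from the Unicode Character Database.
     :category   (get-char-code-property ch 'general-category)
     :ucd-upper  (get-char-code-property ch 'uppercase)
     :ucd-lower  (get-char-code-property ch 'lowercase)
     ;; Entries in the standard case table (downcase and upcase halves).
     :table-down (aref tbl ch)
     :table-up   (aref (case-table-get-table tbl 'up) ch))))

;; Example calls: an ordinary letter and the italic capital W.
(my/case-info ?A)
(my/case-info #x1D44A)
```

For the italic capital W the category comes back as Lu while the UCD
lowercase mapping is empty, which is the "two separate attributes" point
above.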

> > The commands downcase-region and upcase-region do not work on them
>
> They won't work on letters that have no case-pairs.
>
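
A sketch of how the region commands could be made to work without touching
the global table: copy the standard case table, pair up the whole
Mathematical Italic alphabet in the copy, and install it in one buffer
only.  The function names are hypothetical, and the ranges assume the
capitals start at U+1D434 and the smalls at U+1D44E, with the slot for
small h (U+1D455) unassigned.

```elisp
;; Hypothetical helpers, not part of Emacs.
(defun my/math-italic-case-table ()
  "Return a copy of the standard case table with italic case pairs added."
  (let ((tbl (copy-case-table (standard-case-table))))
    (dotimes (i 26)
      ;; U+1D455, where italic small h would sit, is unassigned; skip it.
      (unless (= i 7)
        ;; Argument order is (UC LC TABLE).
        (set-case-syntax-pair (+ #x1D434 i) (+ #x1D44E i) tbl)))
    tbl))

(defun my/upcase-with-math-italic (string)
  "Return STRING upcased by `upcase-region' under the extended table."
  (with-temp-buffer
    ;; set-case-table only changes the case table of this buffer.
    (set-case-table (my/math-italic-case-table))
    (insert string)
    (upcase-region (point-min) (point-max))
    (buffer-string)))

;; Example call: upcase a string holding only the italic small w.
(my/upcase-with-math-italic (string #x1D464))
```

The case table itself stays buffer-local this way, though
set-case-syntax-pair still marks the characters as word constituents in the
standard syntax table.
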
> Once again: I do NOT recommend using the characters from the
> Mathematical Alphanumeric Symbols block for writing English text,
> that's not their purpose.
>

Well, i'm sympathetic to that view, and i think i can understand the
motivation.  For example, a piece of software might scan a file and decide
that anything in the Mathematical Alphanumeric Symbols (MAS) block should be
parsed into a 'formula', and maybe build up an index of such formulas.
(Although, even in this case being able to upcase and downcase easily is
useful, as statements like '𝑤 ∈ 𝑊' are common in mathematics.)  If i just
bold-face some English text it would confuse any such software.  So i
understand there's a powerful argument against using the MAS block for
formatting.

Thus i'm very interested in any alternatives that offer comparable
advantages to the MAS block (e.g., no markup lying around anywhere, can be
used in comments and as variables in code, persistent, immune to font-lock,
etc).

Thanks again for your help, code, and explanations!  :)

dan

