help-gnu-emacs

Re: How to get the script name symbols of a specific character?


From: YE Qianchuan
Subject: Re: How to get the script name symbols of a specific character?
Date: Mon, 11 Feb 2013 23:07:19 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130109 Thunderbird/17.0.2

On 02/11/2013 07:34 PM, Jambunathan K wrote:
> Put your cursor on the box and type
>          C-u C-x =

In fact, it's the same as `describe-char'. `C-x =' runs
`what-cursor-position', which, given a prefix argument, eventually calls
`describe-char'.

> It will give more useful pointers.  The codepoint of a particular
> character.  The name of the character, in the example below is prefixed
> by the script it comes from etc.

Cool, I hadn't noticed that the name may be prefixed by the script. That makes a lot of sense.

Sadly, though, not all characters follow this pattern. For example, the name of a CJK character has the prefix CJK, but `cjk' is not a script name (though there is a script called `cjk-misc'); such a character should belong
to `han'.
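
For what it's worth, the script symbol can probably be read directly instead of being guessed from the name prefix. A minimal sketch, assuming the built-in char-table `char-script-table' is what is wanted here; `my-char-script' is a hypothetical helper name, not something Emacs provides:

```elisp
;; Sketch: look up the script symbol Emacs assigns to a character.
;; `my-char-script' is a hypothetical name used only for illustration.
(defun my-char-script (char)
  "Return the script symbol recorded for CHAR in `char-script-table'."
  (aref char-script-table char))

(my-char-script #xb9c)   ; U+0B9C TAMIL LETTER JA
(my-char-script #x4e2d)  ; U+4E2D, a CJK unified ideograph
```

The first call should give `tamil' and the second `han', which is the symbol the CJK name prefix does not reveal. What other characters map to may depend on the Emacs version.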

What's worse, some characters don't show a name at all, even if I assign a font to them.

For example:
             position: 806 of 1031 (78%), column: 1
            character: 😀 (displayed as 😀) (codepoint 128512, #o373000, #x1f600)
    preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x1F600
               syntax: w     which means: word
             category: L:Left-to-right (strong)
          buffer code: #xF0 #x9F #x98 #x80
            file code: #xF0 #x9F #x98 #x80 (encoded by coding system utf-8-unix)
              display: no font available

Character code properties: customize what to show
  general-category: Cn (Other, Not Assigned)
  decomposition: (128512) ('😀')
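
The same properties can be queried from Lisp, which makes it easier to check whether the name is really missing. A minimal sketch using `get-char-code-property'; the helper names `my-char-name' and `my-char-category' are hypothetical. A nil name together with a category of `Cn' would suggest, though this is only a guess, that the Unicode data shipped with this Emacs does not know the character yet:

```elisp
;; Sketch: read the Unicode name and general category of a character.
;; Both helper names are hypothetical, chosen for illustration.
(defun my-char-name (char)
  "Return the Unicode name of CHAR, or nil if Emacs has none."
  (get-char-code-property char 'name))

(defun my-char-category (char)
  "Return the general-category symbol of CHAR."
  (get-char-code-property char 'general-category))

(my-char-name #xb9c)        ; the Tamil letter, which has a name
(my-char-name #x1f600)      ; may be nil on an older Emacs
(my-char-category #x1f600)  ; result depends on the Unicode data in use
```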

> ,----
> |              position: 192 of 196 (97%), column: 0
> |             character: ஜ (displayed as ஜ) (codepoint 2972, #o5634, #xb9c)
> |     preferred charset: unicode (Unicode (ISO10646))
> | code point in charset: 0x0B9C
> |                syntax: w      which means: word
> |              category: .:Base, L:Left-to-right (strong)
> |              to input: type "ja" with tamil-itrans input method
> |           buffer code: #xE0 #xAE #x9C
> |             file code: #xE0 #xAE #x9C (encoded by coding system utf-8)
> |               display: by this font (glyph code)
> |     xft:-unknown-Lohit Tamil-normal-normal-normal-*-24-*-*-*-*-0-iso10646-1 (#x44)
> |
> | Character code properties: customize what to show
> |   name: TAMIL LETTER JA
> |   general-category: Lo (Letter, Other)
> |   decomposition: (2972) ('ஜ')
> |
> | There are text properties here:
> |   fontified            t
> `----

> Also you may want to look at this page:
>          http://en.wikipedia.org/wiki/Unicode_block

How can I achieve this? Am I missing something?
Thanks for your help.






