Re: How to get the script name symbols of a specific character?
From: YE Qianchuan
Subject: Re: How to get the script name symbols of a specific character?
Date: Mon, 11 Feb 2013 23:07:19 +0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:17.0) Gecko/20130109 Thunderbird/17.0.2
On 02/11/2013 07:34 PM, Jambunathan K wrote:
Put your cursor on the box and type
C-u C-x =
In fact, it's the same as `describe-char'. This command invokes
`what-cursor-position', which invokes `describe-char' eventually.
It will give more useful pointers: the codepoint of the character and
its name, which in the example below is prefixed by the script it
comes from.
Cool, I hadn't noticed that the name may be prefixed by the script. It
makes a lot of sense.
Sadly, not all characters follow this pattern. For example, a CJK
character's name has the prefix CJK, but cjk is not a script name
(though there is a script called cjk-misc); the character should
belong to `han'.
Worse, some characters don't show a name at all, even if I assign a
font to them.
For example:
position: 806 of 1031 (78%), column: 1
character: 😀 (displayed as 😀) (codepoint 128512,
#o373000, #x1f600)
preferred charset: unicode (Unicode (ISO10646))
code point in charset: 0x1F600
syntax: w which means: word
category: L:Left-to-right (strong)
buffer code: #xF0 #x9F #x98 #x80
file code: #xF0 #x9F #x98 #x80 (encoded by coding system
utf-8-unix)
display: no font available
Character code properties: customize what to show
general-category: Cn (Other, Not Assigned)
decomposition: (128512) ('😀')
,----
| position: 192 of 196 (97%), column: 0
| character: ஜ (displayed as ஜ) (codepoint 2972, #o5634, #xb9c)
| preferred charset: unicode (Unicode (ISO10646))
| code point in charset: 0x0B9C
| syntax: w which means: word
| category: .:Base, L:Left-to-right (strong)
| to input: type "ja" with tamil-itrans input method
| buffer code: #xE0 #xAE #x9C
| file code: #xE0 #xAE #x9C (encoded by coding system utf-8)
| display: by this font (glyph code)
| xft:-unknown-Lohit Tamil-normal-normal-normal-*-24-*-*-*-*-0-iso10646-1
(#x44)
|
| Character code properties: customize what to show
| name: TAMIL LETTER JA
| general-category: Lo (Letter, Other)
| decomposition: (2972) ('ஜ')
|
| There are text properties here:
| fontified t
`----
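The numeric fields in the two outputs above (codepoint, octal and hex forms, UTF-8 bytes) can be cross-checked outside Emacs. The following is a minimal sketch in Python rather than Emacs Lisp; `describe` is a hypothetical helper name, and the name and category it reports come from Python's own Unicode tables, which are not necessarily the same version as the data a given Emacs ships.

```python
import unicodedata


def describe(ch):
    """Return a few of the fields that describe-char prints for one character."""
    cp = ord(ch)
    return {
        "codepoint": cp,
        "octal": "#o%o" % cp,
        "hex": "#x%x" % cp,
        # UTF-8 encoding, formatted like the "buffer code" line above.
        "utf8": " ".join("#x%02X" % b for b in ch.encode("utf-8")),
        # None if this Python's Unicode tables have no name for the character.
        "name": unicodedata.name(ch, None),
        "category": unicodedata.category(ch),
    }


if __name__ == "__main__":
    # U+0B9C and U+1F600 are the two characters shown in the outputs above.
    for ch in ("\u0B9C", "\U0001F600"):
        print(describe(ch))
```

Note that this only reports the character name and general category; it does not return a script symbol such as `han', so it is a sanity check on the output rather than an answer to the question.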
Also you may want to look at this page:
http://en.wikipedia.org/wiki/Unicode_block
How can I achieve this? Am I missing something?
Thanks for your help.