emacs-devel

Re: composed characters question and suggestions for quail-cyrillic-*


From: Juri Linkov
Subject: Re: composed characters question and suggestions for quail-cyrillic-*
Date: Tue, 08 Jul 2008 00:42:09 +0300
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/23.0.60 (x86_64-pc-linux-gnu)

> JL> 1. It uses the acute accent to put the grave accent above letters,
> JL> e.g. ("'a" ?à) ("'o" ?ò).  A correct way to implement this is to use the
> JL> acute accent to put the acute accent above letters, and to use the grave
> JL> accent to put the grave accent above letters, as all Latin input methods
> JL> do, e.g. ("'a" ?á) ("'o" ?ó) ("`a" ?à) ("`o" ?ò).
>
> You are right.  But please note that AFAIK in Cyrillic it's rare to find
> acute accents, so the idea was "accent the next letter" and the ' key is
> much more convenient on modern keyboards.  For Cyrillic in particular,
> it may make sense to use ' as the accent prefix or accept it in addition
> to `.  If you still think only ` should be used, I'll commit a patch
> immediately.

Instead of the grave accent `, most Cyrillic languages (including Bulgarian,
Russian, Ukrainian) use the acute accent ' to mark the stressed vowel.
Please see http://en.wikipedia.org/wiki/Acute_accent#Stress for more
information.

> JL> 2. It uses accented Latin letters à, ò, which is inappropriate for
> JL> Cyrillic texts.  The only valid way (as I understand according to
> JL> Unicode specifications) is to use combining characters.
>
> I think I mentioned this in an earlier post.  Combining characters look
> inconsistent and sometimes take up two lines of text in Emacs, so I
> thought it would be acceptable to use the accented Latin letters.  If
> not, I'm OK with replacing them with the combining versions.  Please
> note I'm not an expert on this topic, so I greatly appreciate your
> recommendations.

If combining characters take up two lines, that is a bug.  I remember
that combining characters were rendered correctly before the Unicode
merge.  If it was done right before the merge, maybe the current code
can be fixed using the same logic?
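
To make the difference between the two representations concrete, here is a small Python sketch.  Python is used only as a convenient way to inspect code points; it is not part of Emacs or Quail, and the names `describe`, `latin_precomposed` and `cyrillic_stressed` are mine, chosen for illustration.  It compares the precomposed Latin letter with a Cyrillic letter followed by U+0301 COMBINING ACUTE ACCENT, and checks what NFC normalization does to each.

```python
import unicodedata

COMBINING_ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

# Precomposed Latin letter: one code point, but it belongs to the Latin script.
latin_precomposed = "\u00e1"

# Cyrillic а (U+0430) followed by the combining acute: two code points.
cyrillic_stressed = "\u0430" + COMBINING_ACUTE


def describe(text):
    """Return the Unicode character name of every code point in text."""
    return [unicodedata.name(ch) for ch in text]


print(describe(latin_precomposed))
print(describe(cyrillic_stressed))

# NFC composes Latin a + acute into a single code point, but it leaves
# Cyrillic а + acute as two code points, because Unicode defines no
# precomposed "Cyrillic a with acute" character.
latin_nfc = unicodedata.normalize("NFC", "a" + COMBINING_ACUTE)
cyrillic_nfc = unicodedata.normalize("NFC", cyrillic_stressed)
print(len(latin_nfc), len(cyrillic_nfc))
```

So the only way to keep a stressed Cyrillic vowel inside the Cyrillic script is the base letter plus a combining mark, which is why correct rendering of combining characters matters here.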

> JL> 3. It turns "'" into a prefix key, but it is used to input "ь" according
> JL> to the rule ("'" ?ь).
>
> Would it be possible to move ь under the ' prefix?  As I mentioned the '
> key is very convenient and ь is not a frequently-needed letter.  It
> actually works fine for me as it is (unless I need to type something
> like ьо, which is rare), but I see the problem.

In Bulgarian it is rare, but in Russian and Ukrainian it is a very
frequently used letter ;-)
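
The conflict can be sketched with a toy longest-match translator in Python.  This is only a simplified model and not how Quail is actually implemented; the rule table and the function `translate` are hypothetical, and the accent rules are assumed to produce the Cyrillic vowel plus a combining acute.  It shows that once "'" both stands for ь and starts the accent rules, the key sequence for ьо is swallowed by the accent rule.

```python
# Hypothetical rule table: "'" alone gives ь, but "'" also starts accent rules.
RULES = {
    "'": "\u044c",         # ь
    "'a": "\u0430\u0301",  # а + COMBINING ACUTE ACCENT
    "'o": "\u043e\u0301",  # о + COMBINING ACUTE ACCENT
    "a": "\u0430",         # а
    "o": "\u043e",         # о
}


def translate(keys, rules=RULES):
    """Translate a key sequence, always preferring the longest matching rule."""
    longest = max(len(k) for k in rules)
    out = []
    i = 0
    while i < len(keys):
        # Try the longest candidate first, then shorter ones.
        for n in range(min(longest, len(keys) - i), 0, -1):
            chunk = keys[i:i + n]
            if chunk in rules:
                out.append(rules[chunk])
                i += n
                break
        else:
            # No rule matches: pass the key through unchanged.
            out.append(keys[i])
            i += 1
    return "".join(out)


stressed_o = translate("'o")   # accent rule wins, so this is not ьо
soft_sign_only = translate("'")
print(stressed_o, soft_sign_only)
```

In this model there is no key sequence that yields ь immediately followed by о, which is acceptable only if that combination is rare in the language being typed.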

> JL> 4. «»“„‘‚§№ is too limited a set of necessary characters and this set is
> JL> not specific to `cyrillic-translit'.  Different styles of quotation
> JL> marks are required by typographic rules in several other languages and
> JL> scripts besides Cyrillic, and these rules also require using other
> JL> symbols like dashes of different lengths, nbsp, 1/2, 1/4, subscripts,
> JL> copyright, currency signs, and many more.
>
> In the specific cases I know (I only write in Bulgarian frequently), the
> characters I added are most needed.  If you or others want to add more
> characters, go ahead or tell me what needs to be added.

Thanks, the characters you added are much needed.  Other characters
worth adding are at least ”’–—•…

> JL> So instead of copying the same rules to all input methods, a better
> JL> way is to create a separate common input method with all these
> JL> special symbols and to share it with language specific input
> JL> methods.
>
> My suggestion was essentially to build a prefix tree for Slavic
> languages, since they share enough typographic rules, and to insert it
> into every specific input method.  Using a secondary input method works
> better so I hope it can happen (if Kenichi Handa's patch is OK).

And in another message you wrote:

> If this can go into the trunk, I'll be glad to use it (my changes will
> then be unnecessary).  The only caution is that universal sequences are
> not always intuitive; a good example is that I put "/ab" for paragraph
> because that makes sense in Bulgarian ("абзац" means paragraph,
> pronounced "abzatz").  So it would be nice to have a universal input
> method plus custom rules at the intermediate level (e.g. cyrillic-*).

It might be funny, but in Russian § is called the "paragraph sign"
("знак параграфа"), so your mnemonic doesn't work here.  And "абзац"
refers to a different character, namely the pilcrow ¶.  Please compare:

http://ru.wikipedia.org/wiki/%C2%B6
http://ru.wikipedia.org/wiki/%D0%97%D0%BD%D0%B0%D0%BA_%D0%BF%D0%B0%D1%80%D0%B0%D0%B3%D1%80%D0%B0%D1%84%D0%B0
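
The idea of one shared symbol table plus language-specific rules on top of it can be sketched in Python as a simple dictionary merge.  This is an illustration only, not Quail code: the key sequences "<<", ">>", "--" and "---" are hypothetical, as is the Russian entry; only the Bulgarian "/ab" for § comes from this thread.

```python
# Shared typographic symbols (key sequences here are hypothetical).
COMMON_SYMBOLS = {
    "<<": "\u00ab",   # «
    ">>": "\u00bb",   # »
    "--": "\u2013",   # en dash
    "---": "\u2014",  # em dash
}

# Language-specific mnemonics layered on top of the shared table.
LANGUAGE_RULES = {
    "bulgarian": {"/ab": "\u00a7"},  # § ("/ab" as proposed in the thread)
    "russian": {"/ab": "\u00b6"},    # ¶ (hypothetical: "абзац" names the pilcrow)
}


def rules_for(language):
    """Return the shared rules overridden by the rules of one language."""
    merged = dict(COMMON_SYMBOLS)
    merged.update(LANGUAGE_RULES.get(language, {}))
    return merged


bulgarian = rules_for("bulgarian")
russian = rules_for("russian")
print(bulgarian["/ab"], russian["/ab"], russian["<<"])
```

The shared table is written once, and each language only states where its mnemonics differ.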

-- 
Juri Linkov
http://www.jurta.org/emacs/



