freebangfont-devel

[Freebangfont-devel] Ankur's bangla unicode statement (was: Update on Na


From: Deepayan Sarkar
Subject: [Freebangfont-devel] Ankur's bangla unicode statement (was: Update on National ...)
Date: Wed, 29 Oct 2003 12:36:26 -0600
User-agent: KMail/1.5.3

Since there has been no adverse reaction, let me post a (proposed) final 
version of the draft before I get swamped by assignments again. I doubt this 
will have much effect, but Indranil, please forward this (after any suggested 
amendments and a bit of polishing up) to whoever it should be forwarded to. 

(Note: I have dropped one issue based on some discussion on the address@hidden 
list, namely the one regarding the use of the dotted circle.)


<draft>

Problems (or the lack thereof) with the current Unicode model for Bengali
=========================================================================

We (the Ankur Group) feel that the current Unicode encoding model for Bengali 
is reasonably good, with no major changes needed (except perhaps some related 
to khanda ta encoding, see issue 1 below). However, the encoding of some 
features of the language is non-trivial, and the official recommendations of 
the Unicode Consortium are in a few cases outdated or non-existent. We urge 
the Unicode Consortium to update their webpages / documentation so that 
implementers as well as users can be pointed to an 'authentic' source of 
information.


Issue 1: khanda ta
------------------

The current recommendation to encode khanda ta is 'ta + hasanta + ZWJ', which 
is clearly insufficient. For example,  

cons1 + ta + hasanta + ZWJ +  cons2 + ikaar

will be rendered wrongly (the reordered ikaar would go before and not after 
the khanda ta). A possible fix to the problem (already suggested by Paul 
Nelson) is to encode khanda ta as 'ta + hasanta + ZWJ + ZWNJ'. However, we 
believe (at least I do. Comments, anyone?) that a much more elegant and 
natural solution is the one outlined by Andy White in a recent post to the 
address@hidden list, and described in

http://www.exnet.btinternet.co.uk/uniprop/KhandaTaEncode.pdf

I realise that this would need all fonts to be modified slightly (though not 
OpenType rendering implementations), but given that the current recommended 
encoding needs a change anyway, I think the gain in simplicity will be worth 
the little extra trouble.
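
For reference, the sequences discussed under this issue can be spelled out 
as code points. The following is only a minimal Python sketch: the variable 
names are ours, and BENGALI LETTER KA is an arbitrary stand-in (an assumption 
for illustration) for the unspecified cons1 and cons2. It lists code points 
only and says nothing about how a given renderer shapes them.

```python
import unicodedata

TA = "\u09a4"        # BENGALI LETTER TA
HASANTA = "\u09cd"   # BENGALI SIGN VIRAMA (hasanta)
ZWJ = "\u200d"       # ZERO WIDTH JOINER
ZWNJ = "\u200c"      # ZERO WIDTH NON-JOINER
IKAAR = "\u09bf"     # BENGALI VOWEL SIGN I (ikaar)
CONS = "\u0995"      # BENGALI LETTER KA, hypothetical stand-in for cons1/cons2


def describe(seq):
    """Return one 'U+XXXX NAME' string per code point in seq."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in seq]


# Current recommendation for khanda ta: ta + hasanta + ZWJ
khanda_ta_current = TA + HASANTA + ZWJ

# Fix suggested by Paul Nelson: ta + hasanta + ZWJ + ZWNJ
khanda_ta_nelson = khanda_ta_current + ZWNJ

# The problematic context: cons1 + ta + hasanta + ZWJ + cons2 + ikaar
problem_case = CONS + khanda_ta_current + CONS + IKAAR

if __name__ == "__main__":
    for label, seq in [
        ("current", khanda_ta_current),
        ("Nelson", khanda_ta_nelson),
        ("problem case", problem_case),
    ]:
        print(label)
        for line in describe(seq):
            print("   ", line)
```

Pasting problem_case into an application is a quick way to check whether 
its renderer places the ikaar before or after the khanda ta.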


Issue 2: ra + yaphala
---------------------

The proposal by Paul Nelson for handling ra+yaphala/ya-reph also seems to be 
quite reasonable to us. This is available at

http://www.unicode.org/review/pr-9.pdf



Issue 3: ra + rikaar
--------------------

During the discussion, we came across another unusual glyph. In the Bengali 
word Nairrhit (meaning south-west), there's a glyph that looks like 

Vocalic R (098B) + reph

and one might wonder whether this should be encoded as 

ra + hasanta + Vocalic R  [ 09B0 09CD 098B ]

or 

ra + Vowel sign Vocalic R (09C3) [ 09B0 09C3 ] 

The consensus seems to be that the second solution is more natural and 
consistent with the Unicode model --- we would like the Unicode Consortium to 
explicitly mention and endorse this solution. (Note that this wouldn't need 
any change anywhere, except to add a below-base substitution for ra+rikaar in 
fonts.)
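
To make the two candidates concrete, here is a small Python sketch (the 
variable names are ours, not from any specification). It builds both 
sequences and checks that normalisation does not map one onto the other, 
so software comparing strings would treat them as different text. That is 
one practical reason for wanting a single endorsed encoding.

```python
import unicodedata

RA = "\u09b0"             # BENGALI LETTER RA
HASANTA = "\u09cd"        # BENGALI SIGN VIRAMA (hasanta)
VOCALIC_R = "\u098b"      # BENGALI LETTER VOCALIC R (independent vowel)
RIKAAR = "\u09c3"         # BENGALI VOWEL SIGN VOCALIC R (dependent sign)

# Candidate 1: ra + hasanta + Vocalic R  [ 09B0 09CD 098B ]
candidate_1 = RA + HASANTA + VOCALIC_R

# Candidate 2: ra + Vowel sign Vocalic R [ 09B0 09C3 ]
candidate_2 = RA + RIKAAR


def hex_codes(seq):
    """Return the code points of seq as four-digit upper-case hex strings."""
    return [f"{ord(ch):04X}" for ch in seq]


def same_after_normalisation(a, b, form="NFC"):
    """True if a and b are identical once both are normalised to form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


if __name__ == "__main__":
    print(hex_codes(candidate_1))
    print(hex_codes(candidate_2))
    print(same_after_normalisation(candidate_1, candidate_2))
```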


Issue 4: a + yaphala
--------------------

This could be mentioned for the sake of completeness. The current 
recommendation seems adequate.


Other than those outlined above, we are not concerned about any other 
issues at this moment and, in our opinion, a consensus on these would solve 
all current Bengali ambiguities (although if anyone thinks otherwise, we 
would definitely like to hear their reasons).

</draft>








