
Re: [Groff] unicode support - where to compose?


From: Bruno Haible
Subject: Re: [Groff] unicode support - where to compose?
Date: Wed, 22 Feb 2006 13:58:00 +0100
User-agent: KMail/1.5

Hello Werner,

> it's probably something which should be done later.

Well, I'm on it now because I feel more comfortable dealing with the CJK
width and other smaller problems once the logic of decomposition and
combined characters is done right.

> Currently, groff only recognizes a very limited set of
> ligatures (fi, ff, etc.)

Ah, good example! This is already a precedent where groff combines adjacent
input nodes. When the user doesn't want the ligature, he can use \& as
a separator between the two, right? So I imagine that no one will
object if troff combines
             x\[u0302]\[u0301]
into
             \[u0078_0302_0301]
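
As a rough illustration of the troff-side approach (a Python sketch only, not
groff's actual code), the merge of adjacent nodes could look as follows. The
function name merge_combining and the tokenizing regex are hypothetical; the
sketch assumes that only \[uXXXX] escapes and plain characters take part, and
that \& breaks a sequence the same way it breaks a ligature.

```python
import re
import unicodedata

# One token is a \[uXXXX] escape, the \& separator, or a single character.
TOKEN = re.compile(r"\\\[u([0-9A-Fa-f]{4,6})\]|(\\&)|(.)", re.DOTALL)


def is_combining(cp):
    """True if the code point has a nonzero canonical combining class."""
    return cp <= 0x10FFFF and unicodedata.combining(chr(cp)) != 0


def merge_combining(text):
    """Merge a base character with the combining escapes that follow it."""
    out = []       # finished output pieces
    cluster = []   # pending (code point, original text) pairs

    def flush():
        if len(cluster) > 1:
            # Base plus at least one combining mark: emit one composite name.
            out.append("\\[u" + "_".join(f"{cp:04X}" for cp, _ in cluster) + "]")
        elif cluster:
            # A lone character or escape passes through unchanged.
            out.append(cluster[0][1])
        cluster.clear()

    for m in TOKEN.finditer(text):
        hexcp, sep, ch = m.groups()
        if sep:
            # \& ends the current sequence, as it does for ligatures.
            flush()
            out.append(sep)
            continue
        cp = int(hexcp, 16) if hexcp else ord(ch)
        if cluster and is_combining(cp):
            cluster.append((cp, m.group(0)))
        else:
            flush()
            cluster.append((cp, m.group(0)))
    flush()
    return "".join(out)
```

With this sketch, merge_combining(r"x\[u0302]\[u0301]") yields
\[u0078_0302_0301], while x\&\[u0302] is left as it is.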

> > In the first case I would put the composition into troff.
>
> OK.  With other words, it won't be handled yet.
>
> > In the second case into preconv (i.e. preconv would translate
> > <U+0078><U+0302><U+0301> to \[x u0302 u0301] but would leave alone
> > x\[u0302]\[u0301]).
>
> This would be perfect.

Hmm? Why do you qualify the second approach as "perfect", when the
other one is more in line with the mechanics of how ligatures work?

I just wish to know which of the two is preferable.

> If I understand you correctly, your approach
> will be table-driven, this is, a combining character following a base
> character will automatically be converted to the \[xxx yyy ...] form,
> right?

Yes, sure. For the input stream of Unicode characters, the UnicodeData
table already tells us which characters are combining and decorate the
preceding base character.
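
A minimal sketch of such a table-driven conversion on the preconv side, written
in Python for brevity (preconv itself is not written in Python). It relies on
Python's unicodedata module, which is built from the UnicodeData table. The
function name preconv_sketch is hypothetical, and writing a non-ASCII base
character as uXXXX inside the brackets is an assumption; only the
\[x u0302 u0301] form comes from the discussion above.

```python
import unicodedata


def glyph_name(ch):
    """ASCII characters stand for themselves; others are written as uXXXX."""
    return ch if ord(ch) < 0x80 else f"u{ord(ch):04X}"


def preconv_sketch(text):
    """Turn base + combining characters into the \\[xxx yyy ...] form."""
    out = []
    i = 0
    while i < len(text):
        base = text[i]
        # Collect every following character whose combining class is nonzero.
        j = i + 1
        while j < len(text) and unicodedata.combining(text[j]) != 0:
            j += 1
        marks = text[i + 1:j]
        if marks:
            parts = [glyph_name(base)] + [f"u{ord(m):04X}" for m in marks]
            out.append("\\[" + " ".join(parts) + "]")
        elif ord(base) < 0x80:
            # Plain ASCII, including existing troff escapes, is left alone.
            out.append(base)
        else:
            out.append(f"\\[u{ord(base):04X}]")
        i = j
    return "".join(out)
```

Fed the three characters <U+0078><U+0302><U+0301>, this produces
\[x u0302 u0301]; the ASCII input x\[u0302]\[u0301] passes through untouched.
The sketch ignores corner cases such as a combining mark at the very start of
the input or a base character that is itself special to troff.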

Bruno




