help-gnu-emacs

Re: Display of decomposed characters


From: Philipp
Subject: Re: Display of decomposed characters
Date: Thu, 18 Mar 2021 15:16:42 +0100


> On 28.02.2021 at 19:42, Eli Zaretskii <eliz@gnu.org> wrote:
> 
> 
>>>> I guess fonts assume that applications will first try to normalize
>>>> strings to avoid issues like this?
>>> 
>>> Normalizing strings before you know whether the font has the
>>> precomposed glyphs makes no sense.
>> 
>> Why? If the font doesn’t support a precomposed character, wouldn’t
>> the rendering engine automatically fall back to a decomposed
>> representation?
> 
> No.  How can it?
> 
> The fallback is in the composition code, not in the renderer.  The
> latter just lays out the glyphs that it gets from the composition
> code.  (Assuming that when you say "rendering engine" you mean the
> part in the Emacs display code which handles layout.)

What I mean is HarfBuzz (given your comment below, the more correct term is 
apparently "shaping engine").

> 
> IOW, there's no "font doesn't support" in Emacs.  It works like this:
> 
>  . we check whether the current character should compose with the
>    following and/or preceding ones

Is my understanding right that this is the step that comes too late, i.e. after 
font selection?  Otherwise I'd assume that the answer is always "yes" if the 
current character is a combining character.

>    . if it should compose, then:
>      . pass the chunk of text that should compose to the shaping
>        engine (e.g., HarfBuzz)
>      . if the shaping engine succeeds, render the glyphs it returns
>    . otherwise render the original character "normally", i.e. without
>      consulting the shaping engine
> 
> (The above omits some secondary details in the interests of clarity.)
> The "otherwise" part is the fallback you alluded to.  As you see, we
> never ask the font, we only talk to the shaping engine.

Hmm.  If these steps all happen before font selection, then I'm wondering where 
the problem comes from.
Or do they happen after font selection?
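
To make sure I read the quoted steps correctly, here is a minimal Python sketch 
of that control flow. It is not Emacs code: `display_chunk`, `should_compose` 
and `shape` are hypothetical names, glyphs are modelled as plain strings, and 
treating a shaper failure like the "otherwise" branch is my assumption.

```python
import unicodedata
from typing import Callable, List, Optional

# Hypothetical shaper type: returns glyph labels, or None if shaping fails.
Shaper = Callable[[str], Optional[List[str]]]


def display_chunk(chunk: str,
                  should_compose: Callable[[str], bool],
                  shape: Shaper) -> List[str]:
    """Return the glyph labels that would be drawn for a chunk of text."""
    if should_compose(chunk):
        glyphs = shape(chunk)
        if glyphs is not None:
            # The shaping engine succeeded: render what it returned.
            return glyphs
    # Fallback: render each character "normally", without the shaper.
    # (Assumption: a shaper failure also ends up here.)
    return [f"plain:{ch}" for ch in chunk]


def has_combining_mark(chunk: str) -> bool:
    """Toy compose check: true if any character has a non-zero combining class."""
    return any(unicodedata.combining(ch) for ch in chunk)


def toy_shaper(chunk: str) -> Optional[List[str]]:
    """Toy shaper that always succeeds and returns one composed glyph."""
    return [f"shaped:{chunk}"]


def failing_shaper(chunk: str) -> Optional[List[str]]:
    """Toy shaper that always fails."""
    return None
```

With these toy helpers, `display_chunk("e\u0301", has_combining_mark, 
toy_shaper)` takes the shaper path, while the same call with `failing_shaper` 
falls back to one plain glyph per character. Note that no font is consulted 
anywhere in this flow.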

> 
>> IOW, would normalizing strings to NFC before sending them to the rendering 
>> engine ever break anything?
> 
> Yes, it might.  Shaping engines don't usually decompose characters if
> they get codepoints of precomposed ones.
> 
> Moreover, some precomposed glyphs don't even have codepoints, so you
> cannot even ask the shaper to produce them by passing it a precomposed
> character in that case -- such a character doesn't exist.

OK, so I guess we then definitely can't precompose unconditionally.
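
A related point can be checked with Python's standard `unicodedata` module: 
NFC only composes sequences for which Unicode defines a precomposed character. 
This is just an illustration of the normalization side; it says nothing about 
glyphs that exist only inside a font. The helper name `nfc_length` is made up 
for this example.

```python
import unicodedata


def nfc_length(text: str) -> int:
    """Return the number of code points left after NFC normalization."""
    return len(unicodedata.normalize("NFC", text))


# "e" + COMBINING ACUTE ACCENT has a precomposed form (U+00E9).
e_acute = "e\u0301"
# "x" + COMBINING ACUTE ACCENT has no precomposed form in Unicode.
x_acute = "x\u0301"

print(nfc_length(e_acute))  # 1
print(nfc_length(x_acute))  # 2
```

So even after NFC, some text still reaches the display code as a base 
character followed by combining marks, and composition is still needed.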

> 
>>> What the text-shaping folks tell us is that we should pass _all_ the
>>> text through the text shaper, then the shaper will DTRT in every
>>> case.  But this would mean a thorough redesign and reimplementation of
>>> how we do that in Emacs, and that is not easy if we want to keep the
>>> current flexibility and customizability (which is why the character
>>> composition code calls out to Lisp, and that makes sending all the
>>> text that way too expensive to be practical).
>> 
>> Would it be possible to implement a more minimal change to fix the problem 
>> at hand?
> 
> Like what?

What I'd propose would be to perform font selection after the 
"compose/no-compose" decision.
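
As a rough illustration of that idea (not a description of how Emacs selects 
fonts), the sketch below first groups text into base-plus-combining clusters 
and only then picks one font per cluster. Fonts are modelled as a hypothetical 
mapping from font name to the set of characters it covers; `clusters` and 
`pick_font` are made-up names, and using a non-zero canonical combining class 
as the "combining" test is a simplification.

```python
import unicodedata
from typing import Dict, List, Optional, Set


def clusters(text: str) -> List[str]:
    """Group each base character with the combining marks that follow it."""
    out: List[str] = []
    for ch in text:
        if out and unicodedata.combining(ch):
            # Non-zero combining class: attach to the previous cluster.
            out[-1] += ch
        else:
            out.append(ch)
    return out


def pick_font(cluster: str, fonts: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first font (in preference order) covering the whole cluster."""
    for name, coverage in fonts.items():
        if all(ch in coverage for ch in cluster):
            return name
    return None


# Hypothetical coverage data: "Default" lacks the combining acute accent.
FONTS: Dict[str, Set[str]] = {
    "Default": {"a", "e"},
    "Fallback": {"a", "e", "\u0301"},
}

for cluster in clusters("e\u0301a"):
    print(repr(cluster), pick_font(cluster, FONTS))
```

Here the cluster "e" + U+0301 goes to "Fallback" as a unit, while the plain 
"a" stays in "Default", so a base character and its accent never end up in 
different fonts.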

>  (And why are we discussing such an issue on the help
> list?)

I first wanted to check whether this is actually a bug before filing a formal 
report, but I'll do that now.

> 
>>>> Does it ever make sense to pick different fonts for a base character
>>>> and its combining characters?
>>> 
>>> If the default font doesn't support the combining accent, what else
>>> can you do?  Most fonts don't have precomposed glyphs for every
>>> arbitrary sequence of base character followed by several combining
>>> accents.  So sometimes you will have to compose the accents "by hand",
>>> and that is not really possible if they come from different fonts.
>> 
>> Which is why they shouldn’t come from different fonts. What if Emacs ignored 
>> font lookup for combining characters and always picked the font of the 
>> previous base character?
> 
> What would that produce if the font of the previous character didn't
> have a glyph for the accent?  The accent will disappear, or maybe will
> be displayed as "tofu", right?  Does that sound like a good strategy?

Can't the shaping engine produce fake compositions in that case?

> 
>>>> Wouldn't that fundamentally prevent using combining characters? IIUC
>>>> text rendering engines should be able to pick the right glyph if
>>>> that didn't happen (assuming they can perform Unicode
>>>> normalization).
>>> 
>>> Unicode normalization is only tangentially relevant here.
>> 
>> Sure, but in this case it would fix the problem AFAICS.
> 
> Sorry, I no longer understand what this was about (what does "that"
> allude to here?).

'That' refers to "pick different fonts for a base character
and its combining characters".

>  That's bound to happen when a response comes more
> than a month after the original exchange.

Yes, but unfortunately answering these questions takes some time, which I don't 
always have.  I'll try to respond more promptly in the future, but I can't really 
promise that.



