Re: [bug-libunistring] Hangul Jamo vowels and trailing consonants should
From: Bruno Haible
Subject: Re: [bug-libunistring] Hangul Jamo vowels and trailing consonants should probably be 0 width
Date: Thu, 30 Dec 2021 01:26:45 +0100
I wrote:
> - GNOME vte based terminal emulators are probably 50% today,
> - konsole comes second,
So I tested how the attached file renders in gnome-terminal and
konsole.
- In gnome-terminal the precomposed and decomposed lines render
identically.
- In konsole they do not, but in kate they do; therefore konsole
will probably render them correctly as well within a few years.
Luis Javier Merino wrote:
> Yes. wcwidth() interfaces lack context. wcswidth()-style interfaces
> are better in that regard.
But if we start to modify wcswidth(), mbswidth(), and all the various
other functions that evaluate the displayed length of a string so that
they consider context-dependent widths, things are going to get very complex.
> > 2) People argue about the use of these Hangul Jamo characters when
> > they form a complete Hangul syllable, and that in this case the
> > total width should be 2, and therefore (since 2 = 2 + medial + final)
> > the medial and final parts should have width 0.
> >
> > But in this case people would be using a precomposed Hangul syllable.
>
> The Mac OS X filesystem stores filenames as NFD, which would separate
> syllables into component Jamos. See:
>
> https://github.com/neovim/neovim/issues/4476
Indeed, this shows that the problem affects many users.
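To illustrate the NFD point: a minimal Python sketch, using only the
standard unicodedata module, of how one precomposed syllable splits into
three conjoining jamo. The sample syllable U+D55C is chosen purely for
illustration; it is not taken from the attached file.

```python
import unicodedata

# One precomposed Hangul syllable (U+D55C), chosen only as an example.
precomposed = "\ud55c"

# NFD splits it into leading consonant, vowel and trailing consonant.
decomposed = unicodedata.normalize("NFD", precomposed)

# Prints ['U+1112', 'U+1161', 'U+11AB']
print([f"U+{ord(c):04X}" for c in decomposed])

# NFC recombines the three jamo into the original syllable.
recomposed = unicodedata.normalize("NFC", decomposed)
print(recomposed == precomposed)
```

A file name stored in NFD therefore reaches the terminal as three code
points, and any per-character width function has to account for all three.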
> > What I am more concerned about: When you look at the code charts
> > https://www.unicode.org/charts/PDF/U1100.pdf
> > https://www.unicode.org/charts/PDF/UD7B0.pdf
> > you see that there are glyphs.
> > - In which circumstances are these characters used individually?
> > Maybe in a text book for Korean children?
> > - How are they supposed to be rendered in these situations? Surely
> > as glyphs of width 2, no?
>
> To render as separate components, there are several options:
>
> - Use the non-conjoining forms from the Hangul Compatibility Jamo:
> U+3130–U+318F block.
Good point. So, we can assume that texts in which the conjoining
behaviour is undesired will use these characters U+3130–U+318F.
The only remaining argument for having Hangul Jamo vowels and trailing
consonants be marked as having width 2 is Unicode's EastAsianWidth.txt file.
However, the corresponding explanation <https://www.unicode.org/reports/tr11/>
makes it clear that the purpose of this file is to guarantee compatibility
with traditional Japanese rendering. Such rendering did not know about
the Hangul conjoining behaviour; therefore what EastAsianWidth.txt
says about these characters is irrelevant.
I'm therefore making the requested change:
https://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=commitdiff;h=8026587b94e4274f3406a36bc89348a24ea86b6a
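The effect of such a change can be sketched in Python as follows. This is
not the gnulib code: toy_width and the range tables are hypothetical, the
ranges are assumptions read off the two code charts cited above, and
everything outside the listed ranges is simply treated as width 1.

```python
import unicodedata

# Assumed ranges, for illustration only (not copied from gnulib).
# Conjoining jamo vowels and trailing consonants: counted as width 0.
ZERO_WIDTH = [(0x1160, 0x11FF), (0xD7B0, 0xD7FF)]
# Leading consonants and precomposed syllables: counted as width 2.
DOUBLE_WIDTH = [(0x1100, 0x115F), (0xA960, 0xA97F), (0xAC00, 0xD7A3)]


def in_ranges(cp, ranges):
    """Return True if code point cp lies in any inclusive (lo, hi) range."""
    return any(lo <= cp <= hi for lo, hi in ranges)


def toy_width(s):
    """Sum per-character widths; hypothetical helper, context-free."""
    total = 0
    for ch in s:
        cp = ord(ch)
        if in_ranges(cp, ZERO_WIDTH):
            continue  # contributes no columns
        total += 2 if in_ranges(cp, DOUBLE_WIDTH) else 1
    return total


syllable = "\ud55c"                              # precomposed example syllable
jamo = unicodedata.normalize("NFD", syllable)    # leading + vowel + trailing

# Both forms now sum to the same number of columns.
print(toy_width(syllable), toy_width(jamo))
```

If the vowel and the trailing consonant each counted as 2 instead, the
decomposed form would sum to 6 columns rather than 2, while the
precomposed form would stay at 2.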
Bruno
Attachment: Hangul.utf-8
Description: Text document