bug-gnu-emacs

bug#64420: string-width of … is 2 in CJK environments


From: SUNG TAE KIM
Subject: bug#64420: string-width of … is 2 in CJK environments
Date: Fri, 14 Jul 2023 13:45:58 +0900

Hi, I'm the reporter of the company-mode issue (https://github.com/company-mode/company-mode/issues/1388). The maintainer of the company package suggested that I comment here on the matter of the character width table, so here are my thoughts.

There are many characters marked as A (Ambiguous) width in EastAsianWidth.txt (https://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt), which is part of the Unicode 15.0.0 Character Database. The characters in the General Punctuation block (U+2000..U+206F) are marked as either N (Neutral) or A (Ambiguous), and the ellipsis character (U+2026) is marked as A. The Unicode 15.0.0 East Asian Width report, Unicode Standard Annex #11 (http://www.unicode.org/reports/tr11/), also gives a recommendation for how to treat ambiguous-width characters when the context is not East Asian.
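For anyone who wants to check these properties without reading EastAsianWidth.txt by hand, here is a minimal sketch using Python's standard unicodedata module. The helper name classify_block is my own, and the Unicode version bundled with a given Python release may not be 15.0.0, so the counts for the block can differ slightly between releases.

```python
import unicodedata

ELLIPSIS = "\u2026"  # HORIZONTAL ELLIPSIS

def classify_block(start: int, end: int) -> dict:
    """Count East Asian Width classes for code points start..end (inclusive).

    Hypothetical helper for illustration; not part of any library.
    """
    counts = {}
    for cp in range(start, end + 1):
        eaw = unicodedata.east_asian_width(chr(cp))
        counts[eaw] = counts.get(eaw, 0) + 1
    return counts

# Which Unicode version does this Python ship with?
print("Unicode data version:", unicodedata.unidata_version)

# East Asian Width class of the ellipsis: "A" means Ambiguous.
print("U+2026:", unicodedata.east_asian_width(ELLIPSIS))

# Distribution of classes in the General Punctuation block.
print("U+2000..U+206F:", classify_block(0x2000, 0x206F))
```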

Quotes from the TR.

> 5 Recommendations
>
> When processing or displaying data
>
>  • Ambiguous characters behave like wide or narrow characters depending on the context (language tag, script identification, associated font, source of data, or explicit markup; all can provide the context). If the context cannot be established reliably, they should be treated as narrow characters by default.

My understanding of the report is that context is paramount when deciding the width of ambiguous characters, and that the recommended default, when the context cannot be established reliably, is narrow.
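To make that reading concrete, here is a rough sketch of a width function that follows the recommendation: ambiguous characters count as narrow unless the caller says the context is East Asian. The function name display_width and the ambiguous_wide parameter are assumptions of mine for illustration, and this is not how Emacs computes string-width. It also ignores combining and zero-width characters, which a real implementation would have to handle.

```python
import unicodedata

def display_width(text: str, ambiguous_wide: bool = False) -> int:
    """Estimate the column width of text.

    Wide (W) and Fullwidth (F) characters take 2 columns.
    Ambiguous (A) characters take 2 columns only when the caller
    declares an East Asian context; otherwise they default to 1.
    Everything else is counted as 1 column (a simplification:
    combining and zero-width characters are not special-cased).
    """
    width = 0
    for ch in text:
        eaw = unicodedata.east_asian_width(ch)
        if eaw in ("W", "F"):
            width += 2
        elif eaw == "A":
            width += 2 if ambiguous_wide else 1
        else:
            width += 1
    return width

print(display_width("\u2026"))                       # 1: narrow by default
print(display_width("\u2026", ambiguous_wide=True))  # 2: East Asian context
print(display_width("\u65e5\u672c"))                 # 4: two wide characters
```

The point of the explicit parameter is that the context comes from the caller (language environment, font, user option) rather than being guessed from the character alone.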
 
How about in practice? I tested how a few ambiguous-width characters are rendered in terminals on several operating systems.

macOS Mojave - builtin terminal, kitty, iTerm2
  Rendered as narrow characters regardless of locale or font settings.

Windows 11 - old and new terminal
  Rendered as narrow characters regardless of locale or font settings.

Ubuntu 20 - gnome-terminal
  The user can set the width of ambiguous characters to either narrow (the default) or wide through a compatibility option.

I was surprised that gnome-terminal has this option. However, it seems incomplete: when I delete an ambiguous-width character rendered as wide, the terminal messes up its cursor position, whereas deleting a genuinely wide character works fine.

So I think the proper default width for ambiguous-width characters is narrow, and there should be options for setting their width. However, such a change of default might break Emacs packages that rely on the current behavior in CJK language environments.

All in all, I think providing comprehensive options for changing the width of ambiguous-width characters would be desirable.
