bug-gnu-emacs

bug#64420: string-width of … is 2 in CJK environments


From: Yuan Fu
Subject: bug#64420: string-width of … is 2 in CJK environments
Date: Fri, 11 Aug 2023 11:07:26 -0700


> On Aug 10, 2023, at 10:53 PM, Eli Zaretskii <eliz@gnu.org> wrote:
> 
>> From: Yuan Fu <casouri@gmail.com>
>> Date: Thu, 10 Aug 2023 14:58:37 -0700
>> Cc: Dmitry Gutov <dmitry@gutov.dev>,
>> SUNG TAE KIM <itaemu@gmail.com>,
>> 64420@debbugs.gnu.org
>> 
>>> OK, this is now installed on master.  We have a new user option named
>>> cjk-ambiguous-chars-are-wide; its default is t, but if set to nil, the
>>> characters proclaimed by Unicode as "ambiguous" will have char-width
>>> of 1, not 2.  Note that this option should be set either via 'setopt'
>>> or the Customize interface, not via 'setq'.
>>> 
>>> Let me know how well this works for you.
>> 
>> Thanks! I can’t tell you how well it works tho since I don’t use company :-)
> 
> You don't need company to see if this works well for you.  Just use
> string-width or even char-width with some problematic characters (you
> can find the list of them in characters.el, search for "ambiguous"),
> and compare the results when this new variable is nil and non-nil.
> I'm interested to know how many people need the variable to be non-nil
> (its default) to have the width match the fonts they use in Emacs,
> both in GUI and in TTY frames, since there's the claim that no one
> needs those characters be considered full-width nowadays.  If that
> claim is correct, we should consider changing the default value of
> this variable in Emacs 30.

On my machine, all the ambiguous characters have a width of 1, even with the
default value of cjk-ambiguous-chars-are-wide (I use utf8_en locale). That’s
expected.
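
For anyone who wants to repeat the check, here is a minimal sketch of the
comparison suggested in the quoted message. It assumes a build that already
has cjk-ambiguous-chars-are-wide (installed on master per the quote) and
setopt (Emacs 29 or later). The helper name my/ambiguous-width-report and the
four sample characters are my own picks for illustration; the full list is in
characters.el. Outside a CJK language environment both passes may well report
1 for every character, as they do on my machine.

```elisp
;; Sketch only: compare char-width of a few "ambiguous" characters with
;; the option on and off.  `my/ambiguous-width-report' is a hypothetical
;; helper, not part of Emacs.
(defun my/ambiguous-width-report (chars)
  "Return an alist of (CHAR . WIDTH) for each character in CHARS."
  (mapcar (lambda (c) (cons c (char-width c))) chars))

(let ((samples '(?… ?α ?① ?§))              ; illustrative sample only
      (orig cjk-ambiguous-chars-are-wide))  ; remember the current value
  ;; Use `setopt', not `setq', as the quoted message advises.
  (setopt cjk-ambiguous-chars-are-wide t)
  (message "wide:   %S" (my/ambiguous-width-report samples))
  (setopt cjk-ambiguous-chars-are-wide nil)
  (message "narrow: %S" (my/ambiguous-width-report samples))
  ;; Put the option back the way it was.
  (setopt cjk-ambiguous-chars-are-wide orig))
```

Evaluating this in *scratch* prints two alists of character codes and widths
to *Messages*; string-width can be substituted for char-width to test whole
strings.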

I tried printing all the ambiguous characters and attached screenshots of the
result (the first line is a line of CJK characters for reference).
(screenshot-1.png, screenshot-2.png)

In the terminal, I saw an interesting option, “Ambiguous characters are
double-width” (terminal-setting.png), which does the same thing as
cjk-ambiguous-chars-are-wide. If I turn it on, all the ambiguous characters
are indeed displayed in double-width. (terminal-narrow.png, terminal-wide.png)

In the GUI, the latter half of the ambiguous characters are definitely wider
than one char, but they aren’t quite 2 chars wide either. But I guess it
doesn’t matter too much, since one should use pixel sizes in the GUI anyway.

In the terminal, at least iTerm2 displays ambiguous characters as single-width
by default, (I assume) regardless of locale. And it displays a warning when
you try to turn on the “Ambiguous characters are double-width” option [1].

Yuan

[1] "You probably don't want to turn this on. It will confuse interactive 
programs. You might want it if you work mostly with East Asian text combined 
with legacy or mathematical character sets. Are you sure you want this?"

Attachment: amiguous-width.txt
Description: Text document

