Re: is correct to use waddnstr for utf8 strings?


From: Thomas Dickey
Subject: Re: is correct to use waddnstr for utf8 strings?
Date: Sun, 5 May 2019 14:00:48 -0400
User-agent: Mutt/1.5.23 (2014-03-12)

On Sun, May 05, 2019 at 07:12:18PM +0200, Pavel Stehule wrote:
> On Sat, May 4, 2019 at 23:05, Thomas Dickey <address@hidden> wrote:
> 
> > On Thu, May 02, 2019 at 11:48:42AM +0200, Pavel Stehule wrote:
> > > On Thu, May 2, 2019 at 11:27, Thomas Dickey <address@hidden> wrote:
> > >
> > > > On Wed, May 01, 2019 at 04:48:01PM +0200, Pavel Stehule wrote:
> > > > > Hi
> > > > >
> > > > > > On Wed, May 1, 2019 at 14:39, Pavel Stehule <address@hidden> wrote:
> > > > >
> > > > > > Hi
> > > > > >
> > > > > > > I am trying to fix some issues with UTF-8 chars on Solaris.
> > > > ...
> > > > > > I wrote a small test. It looks like Solaris doesn't work
> > > > > > correctly with 3-byte UTF-8 chars. Two-byte chars are OK.
> > > > > >
> > > > > > Is it an ncurses issue or a Solaris issue?
> > > >
> > > > I suspect Solaris (though I might find time on the weekend to
> > > > investigate this).
> > > >
> > > > There's an additional factor that the package is probably rather old.
> > > >
> > >
> > > I see the same behavior with the integrated Solaris ncurses and with
> > > the OpenCSW ncurses.
> >
> > I see (now).  In my 2015-fix, I checked the result of wcwidth in the
> > screen-update functions, but not the functions which add characters
> > to the windows.  For whatever reason, that worked for my test-programs,
> > but this example does not.
> >
> > The fix will be in today's patch...
> >
> > (just for the record, this workaround only applies to the WACS_xxx codes
> > that ncurses knows about -- there are a _lot_ of ambiguous-width characters
> > in Unicode)
> >
> 
> Maybe I have seen a similar error with the code point
> 
> U+25BA
> 
> http://www.codetable.net/hex/25ba
> 
> When I draw this char, the Solaris ncurses eats one char.

Something like it.
In ncurses I'm only special-casing the WACS_xxx characters,
but it's part of a larger problem introduced by Unicode itself.
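
To make the idea of special-casing concrete, here is a minimal sketch of
one possible approach. It is not the actual ncurses code: the helper name
single_column_glyph and the choice of an ASCII fallback are assumptions
made for illustration. The point is only that a cell which must be one
column wide keeps the Unicode character when the locale reports width 1,
and otherwise gets a plain ASCII stand-in.

```c
#include <stddef.h>

/* Hypothetical helper, not the actual ncurses code.
 * `reported` is the value wcwidth() returned for `preferred` in the
 * current locale. The cell must be exactly one column wide, so any
 * other answer (for example 2 where the locale treats the character
 * as double-width, or -1 for a character the locale does not know)
 * selects the ASCII fallback instead. */
static wchar_t single_column_glyph(wchar_t preferred, int reported,
                                   char fallback)
{
    if (reported == 1)
        return preferred;
    return (wchar_t) (unsigned char) fallback;
}
```

A caller would pass the locale's answer as the second argument, e.g.
single_column_glyph(0x2500, wcwidth(0x2500), '-') for a horizontal line.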

The EastAsianWidth.txt file from Unicode has these lines:

25B2..25B3;A     # So     [2] BLACK UP-POINTING TRIANGLE..WHITE UP-POINTING TRIANGLE
25B4..25B5;N     # So     [2] BLACK UP-POINTING SMALL TRIANGLE..WHITE UP-POINTING SMALL TRIANGLE

The "A" is "ambiguous".

The "N" is "neutral", which (oversimplifying) is like "ambiguous".
But different.  In either case, wcwidth does not know about it.
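
For anyone who wants to check which class a given character falls in,
the data lines quoted above are easy to pick apart. The following is a
rough sketch only: struct eaw_entry, parse_eaw_line and is_ambiguous are
hypothetical names, not part of ncurses or libc, and the parser assumes
the "first..last;class" or "codepoint;class" layout shown in the quoted
lines.

```c
#include <stdio.h>
#include <string.h>

/* One parsed data line: an inclusive code point range plus its
 * East Asian Width class ("A", "N", "W", "Na", "H" or "F"). */
struct eaw_entry {
    unsigned long first;
    unsigned long last;
    char width_class[3];
};

/* Hypothetical helper, not part of ncurses or libc.
 * Accepts "25B2..25B3;A  # comment" (range) or "25B6;A" (single).
 * Returns 1 on success, 0 for comment, blank or malformed lines. */
static int parse_eaw_line(const char *line, struct eaw_entry *out)
{
    unsigned long first = 0, last = 0;
    char cls[3] = "";

    if (sscanf(line, "%lx..%lx ; %2[A-Za-z]", &first, &last, cls) != 3) {
        if (sscanf(line, "%lx ; %2[A-Za-z]", &first, cls) != 2)
            return 0;
        last = first;   /* single code point, not a range */
    }
    out->first = first;
    out->last = last;
    strcpy(out->width_class, cls);
    return 1;
}

/* True when the entry carries the "A" (ambiguous) class. */
static int is_ambiguous(const struct eaw_entry *e)
{
    return strcmp(e->width_class, "A") == 0;
}
```

Feeding it the two lines above yields the range 25B2..25B3 with class
"A" and the range 25B4..25B5 with class "N"; only the first is reported
as ambiguous.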

The problem is that Unicode has defined a whole set of characters to
have different behavior, depending on who is using them.

Read and enjoy:

http://www.unicode.org/reports/tr11/#Ambiguous
 
> > >
> > > OpenCSW package is relative fresh
> >
> > relatively :-)
> >
> > > ncurses version: 6.1, patch: 20180127
> > > ncurses with wide char support
> > > ncurses widechar num: 0
> > >
> > > the built-in ncurses is older
> > >
> > > ncurses version: 6.0, patch: 20170708
> > > ncurses with wide char support
> > > ncurses widechar num: 0
> >
> > ... still, better than MacOS :-)
> >
> > > > One problem with Solaris, which I worked-around a few years ago,
> > > > is that all of the line-drawing characters are "ambiguous width",
> > > > and that Solaris' locale assigns all of those to double-width.
> > > >
> > > > See
> > > >
> > > > https://invisible-island.net/ncurses/NEWS.html#t20151219
> > > >
> > > > and subsequent mention of Solaris.
> >
> 
> Thank you
> 
> Pavel
> 
> 
> 
> > --
> > Thomas E. Dickey <address@hidden>
> > https://invisible-island.net
> > ftp://ftp.invisible-island.net
> >

-- 
Thomas E. Dickey <address@hidden>
https://invisible-island.net
ftp://ftp.invisible-island.net
