Re: is correct to use waddnstr for utf8 strings?
From: Thomas Dickey
Subject: Re: is correct to use waddnstr for utf8 strings?
Date: Sun, 5 May 2019 14:00:48 -0400
User-agent: Mutt/1.5.23 (2014-03-12)
On Sun, May 05, 2019 at 07:12:18PM +0200, Pavel Stehule wrote:
> On Sat, May 4, 2019 at 23:05, Thomas Dickey <address@hidden> wrote:
>
> > On Thu, May 02, 2019 at 11:48:42AM +0200, Pavel Stehule wrote:
> > > On Thu, May 2, 2019 at 11:27, Thomas Dickey <address@hidden> wrote:
> > >
> > > > On Wed, May 01, 2019 at 04:48:01PM +0200, Pavel Stehule wrote:
> > > > > Hi
> > > > >
> > > > > On Wed, May 1, 2019 at 14:39, Pavel Stehule <address@hidden> wrote:
> > > > >
> > > > > > Hi
> > > > > >
> > > > > > I try to fix some issues with utf8 chars on Solaris.
> > > > ...
> > > > > I wrote a small test. It looks like Solaris doesn't work correctly
> > > > > with 3-byte UTF-8 characters. Two-byte characters are OK.
> > > > >
> > > > > Is it an ncurses issue or a Solaris issue?
> > > >
> > > > I suspect Solaris (though I might find time on the weekend to
> > > > investigate this).
> > > >
> > > > There's an additional factor that the package is probably rather old.
> > > >
> > >
> > > I see the same behavior with the integrated Solaris ncurses and with
> > > the OpenCSW ncurses.
> >
> > I see (now). In my 2015-fix, I checked the result of wcwidth in the
> > screen-update functions, but not the functions which add characters
> > to the windows. For whatever reason, that worked for my test-programs,
> > but this example does not.
> >
> > The fix will be in today's patch...
> >
> > (just for the record, this workaround only applies to the WACS_xxx codes
> > that ncurses knows about -- there are a _lot_ of ambiguous-width characters
> > in Unicode)
> >
>
> Maybe I have seen a similar error with the code point
>
> x25BA
>
> http://www.codetable.net/hex/25ba
>
> When I draw this character, the Solaris ncurses eats one character.
Something like it.
In ncurses I'm only special-casing the WACS_xxx characters,
but it's part of a larger problem introduced by Unicode itself.
The EastAsianWidth.txt file from Unicode has these lines:
25B2..25B3;A # So [2] BLACK UP-POINTING TRIANGLE..WHITE UP-POINTING TRIANGLE
25B4..25B5;N # So [2] BLACK UP-POINTING SMALL TRIANGLE..WHITE UP-POINTING SMALL TRIANGLE
The "A" is "ambiguous".
The "N" is "neutral", which (oversimplifying) is like "ambiguous".
But different. In either case, wcwidth does not know about it.
The problem is that Unicode has defined a whole set of characters to
have different behavior, depending on who is using them.
Read and enjoy:
http://www.unicode.org/reports/tr11/#Ambiguous
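For anyone who wants to check which class a given code point falls into, here
is a minimal sketch (in Python, not ncurses code) that parses lines in the
EastAsianWidth.txt format quoted above and looks up a code point. The names
SAMPLE, parse_ranges and lookup are made up for this illustration, and SAMPLE
holds only the two lines from this message, so a code point such as x25BA is
simply "not listed" here. For comparison, the script also prints what Python's
own unicodedata.east_asian_width() reports; that value depends on the Unicode
version bundled with the interpreter.

```python
import re
import unicodedata

# The two sample lines quoted in the message, rejoined onto single lines.
SAMPLE = """\
25B2..25B3;A # So [2] BLACK UP-POINTING TRIANGLE..WHITE UP-POINTING TRIANGLE
25B4..25B5;N # So [2] BLACK UP-POINTING SMALL TRIANGLE..WHITE UP-POINTING SMALL TRIANGLE
"""

# "XXXX;C" for a single code point, "XXXX..YYYY;C" for a range.
LINE_RE = re.compile(r"^([0-9A-Fa-f]+)(?:\.\.([0-9A-Fa-f]+))?\s*;\s*(\w+)")


def parse_ranges(text):
    """Return a list of (first, last, width_class) tuples."""
    ranges = []
    for line in text.splitlines():
        # Everything after '#' is a comment in this file format.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        first = int(match.group(1), 16)
        last = int(match.group(2) or match.group(1), 16)
        ranges.append((first, last, match.group(3)))
    return ranges


def lookup(ranges, codepoint):
    """Return the width class for codepoint, or None if it is not listed."""
    for first, last, width_class in ranges:
        if first <= codepoint <= last:
            return width_class
    return None


if __name__ == "__main__":
    ranges = parse_ranges(SAMPLE)
    for cp in (0x25B2, 0x25B4, 0x25BA):
        print(
            f"U+{cp:04X}"
            f" sample={lookup(ranges, cp)}"
            f" unicodedata={unicodedata.east_asian_width(chr(cp))}"
        )
```

Feeding parse_ranges() the complete EastAsianWidth.txt instead of SAMPLE
would cover every listed code point. Note that this only reports the Unicode
class; what wcwidth returns for an "A" character is still up to the
platform's locale data, which is the Solaris problem discussed in this thread.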
> > >
> > > OpenCSW package is relative fresh
> >
> > relatively :-)
> >
> > > ncurses version: 6.1, patch: 20180127
> > > ncurses with wide char support
> > > ncurses widechar num: 0
> > >
> > > the built-in ncurses is older
> > >
> > > ncurses version: 6.0, patch: 20170708
> > > ncurses with wide char support
> > > ncurses widechar num: 0
> >
> > ... still, better than MacOS :-)
> >
> > > > One problem with Solaris, which I worked-around a few years ago,
> > > > is that all of the line-drawing characters are "ambiguous width",
> > > > and that Solaris' locale assigns all of those to double-width.
> > > >
> > > > See
> > > >
> > > > https://invisible-island.net/ncurses/NEWS.html#t20151219
> > > >
> > > > and subsequent mention of Solaris.
> >
>
> Thank you
>
> Pavel
>
>
>
> > --
> > Thomas E. Dickey <address@hidden>
> > https://invisible-island.net
> > ftp://ftp.invisible-island.net
> >
--
Thomas E. Dickey <address@hidden>
https://invisible-island.net
ftp://ftp.invisible-island.net