Re: Bugs in wide character API


From: Thomas Dickey
Subject: Re: Bugs in wide character API
Date: Mon, 23 Aug 2004 08:04:08 -0400 (EDT)

On Mon, 23 Aug 2004, Marcin 'Qrczak' Kowalczyk wrote:

> In a message of Sun, 22-08-2004, 18:56 -0400, Thomas Dickey wrote:
>
> > > 2. get_wch doesn't recode a printing character from the locale encoding
> > >    to Unicode if the locale is different from UTF-8 (or ISO-8859-1 when
> > >    no recoding is needed).
> >
> > I'm not sure why:  ncurses doesn't use any logic specific to UTF-8 for
> > that function.
>
> lib_get_wch.c contains this fragment:
>
>     if ((is8bits(value) && isprint(value))
>       || (SP->_legacy_coding && (value >= 160))) {
>       /*
>        * FIXME: is there a case with multibyte strings where the
>        * bytes after the first could be printable?
>        */
>       if (count != 0) {
>           ungetch(value);
>           code = ERR;
>       }
>       break;
>     }
>
> I don't fully understand this condition, but it is entered in the case
> of encodings like ISO-8859-2, which leaves value encoded in ISO-8859-2,
> later returned as wint_t.

It's concerned with a case where the locale isn't set.  See lib_set_term.c
(where SP->_legacy_coding is set).  The isprint shouldn't be a problem.
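
For reference, here is a minimal sketch of what "recoding from the locale
encoding" amounts to in standard C.  It is not the code in lib_get_wch.c;
decode_locale_bytes is a hypothetical helper, and whether the resulting
wchar_t is a Unicode code point depends on the platform (it is where
__STDC_ISO_10646__ is defined).

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of ncurses): decode the first character
 * of a byte sequence in the current locale's encoding into a wchar_t.
 * Returns the number of bytes consumed (0 for a NUL character), or -1
 * if the bytes are invalid or incomplete in this locale.
 */
static int
decode_locale_bytes(const char *bytes, size_t len, wchar_t *out)
{
    mbstate_t state;
    size_t rc;

    assert(bytes != NULL && out != NULL);

    /* start from the initial shift state for each independent decode */
    memset(&state, 0, sizeof(state));

    rc = mbrtowc(out, bytes, len, &state);
    if (rc == (size_t) -1 || rc == (size_t) -2)
        return -1;              /* invalid or incomplete sequence */
    return (int) rc;
}
```

In an ISO-8859-2 locale (after setlocale(LC_ALL, "")), a single byte above
127 passed through this helper would come back as the locale's wide-character
value rather than as the raw byte.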

> BTW, why macros above prefer HAVE_MBTOWC to HAVE_MBRTOWC? I think it
> should try reentrant functions first.

The other (mbrtowc) dumps core on Solaris, iirc, and that is a situation
which a configure script is unlikely to detect.

>
> > At the moment I can't really see why the get_wstr should have used wint_t*,
> > since (unlike get_wch) the result shouldn't have to represent negative
> > values.  Assuming there's no good reason, there's still portability of
> > applications - and I don't see any good reason to make programs that
> > can't port to Tru64 curses.  If the types are the same size, then an
> > (admittedly annoying) cast is all that's needed.
>
> wint_t is generally never used to represent characters in strings.
> It exists solely for the possibility of encoding WEOF as distinct from
> valid values of wchar_t.

Yes, I know that.  Actually (looking in header files last night), I can
see that for some configurations one is a 16-bit and the other a 32-bit
value.  (So a cast is not necessarily correct; I will have to read more.)
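
To illustrate the size concern, here is a rough sketch of converting a
wint_t buffer to wchar_t one element at a time, which stays correct even
when the two types differ in width (a pointer cast would not).
wint_to_wchar is a hypothetical helper, not an ncurses function, and it
assumes the input is terminated by a zero element.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of ncurses): copy a zero-terminated
 * wint_t buffer into a wchar_t buffer element by element.  Copying
 * stops at a zero element, at WEOF, or when dst is full; dst is always
 * zero-terminated.  Returns the number of characters copied.
 */
static size_t
wint_to_wchar(wchar_t *dst, const wint_t *src, size_t dstlen)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL && dstlen > 0);

    while (n + 1 < dstlen && src[n] != 0 && src[n] != WEOF) {
        dst[n] = (wchar_t) src[n];      /* per-element conversion */
        n++;
    }
    dst[n] = L'\0';
    return n;
}
```

An application could call this on the buffer filled in by get_wstr before
handing the text to functions that expect wchar_t *.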

> Anyway, ncurses performs pointer arithmetic on this value cast to
> wchar_t *. So I cast it to void * to avoid a warning in either case.
>
> --
>    __("<         Marcin Kowalczyk
>    \__/       address@hidden
>     ^^     http://qrnik.knm.org.pl/~qrczak/

-- 
Thomas E. Dickey
http://invisible-island.net
ftp://invisible-island.net



