Re: Bugs in wide character API
From: Marcin 'Qrczak' Kowalczyk
Subject: Re: Bugs in wide character API
Date: Mon, 23 Aug 2004 13:32:07 +0200
On Sun, 22-08-2004, at 18:56 -0400, Thomas Dickey wrote:
> > 2. get_wch doesn't recode a printing character from the locale encoding
> > to Unicode if the locale is different from UTF-8 (or ISO-8859-1 when
> > no recoding is needed).
>
> I'm not sure why: ncurses doesn't use any logic specific to UTF-8 for
> that function.
lib_get_wch.c contains this fragment:
    if ((is8bits(value) && isprint(value))
        || (SP->_legacy_coding && (value >= 160))) {
        /*
         * FIXME: is there a case with multibyte strings where the
         * bytes after the first could be printable?
         */
        if (count != 0) {
            ungetch(value);
            code = ERR;
        }
        break;
    }
I don't fully understand this condition, but the branch is taken for
encodings like ISO-8859-2. That leaves value encoded in ISO-8859-2, and
it is later returned as a wint_t without being recoded.
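As a sketch of the recoding step in question (not ncurses code), a byte in the locale encoding can be converted to a wide character with mbrtowc. The helper name decode_locale_byte is hypothetical. Whether the resulting wchar_t value is a Unicode code point is implementation-defined; it holds where __STDC_ISO_10646__ is defined, as on glibc.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper, not part of ncurses: decode one byte of the
 * current locale's encoding into a wide character.  Returns WEOF if
 * the byte alone is not a complete, valid character.
 */
static wint_t decode_locale_byte(unsigned char byte)
{
    mbstate_t state;
    wchar_t wc;
    char in = (char) byte;
    size_t n;

    memset(&state, 0, sizeof state);    /* start in the initial state */
    n = mbrtowc(&wc, &in, 1, &state);
    if (n == (size_t) -1 || n == (size_t) -2)
        return WEOF;                    /* invalid or incomplete sequence */
    return (wint_t) wc;                 /* n == 0 means wc is L'\0' */
}
```

In a single-byte locale such as one using ISO-8859-2 (after a successful setlocale call), this would be expected to map a byte of 160 or above to its wide-character value instead of passing the raw byte through.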
BTW, why do the macros above prefer HAVE_MBTOWC to HAVE_MBRTOWC? I think
the code should try the reentrant functions first.
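One possible way to order the checks so that the reentrant function wins is sketched here. It assumes HAVE_MBRTOWC and HAVE_MBTOWC come from a configure script; the names conv_state, conv_init, conv_char and first_wide_char are hypothetical and do not appear in ncurses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Assumption: normally set by configure; defaulted here so the
 * sketch compiles on its own. */
#if !defined(HAVE_MBRTOWC) && !defined(HAVE_MBTOWC)
#define HAVE_MBRTOWC 1
#endif

#if defined(HAVE_MBRTOWC)
/* Preferred: reentrant conversion with an explicit state object. */
typedef mbstate_t conv_state;
#define conv_init(st)           memset((st), 0, sizeof *(st))
#define conv_char(wc, s, n, st) mbrtowc((wc), (s), (n), (st))
#elif defined(HAVE_MBTOWC)
/* Fallback: mbtowc keeps its shift state internally. */
typedef int conv_state;
#define conv_init(st)           ((void) mbtowc(NULL, NULL, 0), *(st) = 0)
#define conv_char(wc, s, n, st) ((void) (st), (size_t) mbtowc((wc), (s), (n)))
#else
#error "no multibyte conversion function available"
#endif

/* Convert the first character of s (at most n bytes) into *out.
 * Returns 0 on success, -1 on an invalid or incomplete sequence. */
static int first_wide_char(const char *s, size_t n, wchar_t *out)
{
    conv_state st;
    size_t rc;

    conv_init(&st);
    rc = conv_char(out, s, n, &st);
    return (rc == (size_t) -1 || rc == (size_t) -2) ? -1 : 0;
}
```

With this ordering, mbtowc is used only when mbrtowc is unavailable.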
> At the moment I can't really see why the get_wstr should have used wint_t*,
> since (unlike get_wch) the result shouldn't have to represent negative
> values. Assuming there's no good reason, there's still portability of
> applications - and I don't see any good reason to make programs that
> can't port to Tru64 curses. If the types are the same size, then an
> (admittedly annoying) cast is all that's needed.
wint_t is generally not used to represent characters in strings.
It exists solely for the possibility of encoding WEOF as distinct from
valid values of wchar_t.
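The usual division of labour between the two types can be sketched as follows: wint_t carries a single value that may be WEOF, and once WEOF has been ruled out the value is stored as wchar_t. The function collect_until_weof is a hypothetical illustration that reads from an array instead of a real input source.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical example: copy values from src, which is terminated by
 * WEOF, into the wchar_t buffer dst of capacity cap.  The result is
 * always L'\0'-terminated.  Returns the number of characters stored.
 */
static size_t collect_until_weof(const wint_t *src, wchar_t *dst, size_t cap)
{
    size_t n = 0;

    assert(cap > 0);
    while (n + 1 < cap && src[n] != WEOF) {
        dst[n] = (wchar_t) src[n];  /* known not to be WEOF here */
        n++;
    }
    dst[n] = L'\0';
    return n;
}
```

Only the intermediate value needs the wider range; the string itself holds plain wchar_t elements.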
Anyway, ncurses performs pointer arithmetic on this value cast to
wchar_t *. So I cast it to void * to avoid a warning in either case.
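A minimal sketch of that cast, under the assumption that wint_t and wchar_t have the same size on the platform. fill_wint is a hypothetical stand-in for a curses function declared with a wint_t * parameter; the same call would also compile cleanly against a wchar_t * prototype, because void * converts implicitly to either pointer type.

```c
#include <assert.h>
#include <wchar.h>

/* Hypothetical stand-in for a function declared to take wint_t *. */
static int fill_wint(wint_t *buf)
{
    buf[0] = L'o';
    buf[1] = L'k';
    buf[2] = L'\0';
    return 0;
}

/* Caller that keeps its data in a wchar_t buffer of at least 3 elements. */
static int read_into(wchar_t *buf)
{
    /* Assumption: the layout is only compatible when the sizes match. */
    assert(sizeof(wint_t) == sizeof(wchar_t));
    return fill_wint((void *) buf);  /* no warning for either prototype */
}
```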
--
__("< Marcin Kowalczyk
\__/ address@hidden
^^ http://qrnik.knm.org.pl/~qrczak/