Re: Question: How does ncurses store and handle "wide" characters?
From: Thomas Dickey
Subject: Re: Question: How does ncurses store and handle "wide" characters?
Date: Mon, 31 May 2021 09:45:14 -0400
User-agent: Mutt/1.10.1 (2018-07-13)

On Sun, May 30, 2021 at 02:08:52PM -0400, pjfarley3@earthlink.net wrote:
> This is just a question, not a bug report.
>
> I've been trying to determine from examining the ncurses header how "wide"
> characters are actually stored by ncurses, in pursuit of analyzing
> differences between ncurses and PDCurses, because PDCurses is all that is
> available on Windows systems unless one is using msys2 or cygwin.
>
> As far as I can tell, the cchar_t type uses a 32-bit int in which the
> low-order 8 bits are for the character, the next-higher 8 bits are for the
No, cchar_t stores an array of wchar_t values.
There's an array because some characters are designed to combine with
(e.g., overprint or otherwise modify) the base character (the first
item in the array).
In cchar_t, attributes and colors are stored in separate fields.
chtype, by contrast, stores an 8-bit character combined with attributes
and colors in a single integer (the layout is in the header file).
> color pair number and the high-order 16 bits are for attribute flags. When
> I tracked down the wchar_t type in a linux system (Ubuntu 20.04) it seems to
> be defined as an "int".
Not necessarily: whether wchar_t is signed depends on the implementation.
wint_t is a more or less equivalent type that can also hold WEOF, an
out-of-band value which represents an error.
> I do not understand how the wchar_t and cchar_t types are used by ncurses to
> support wide characters, or the uses of the wint_t type.
wint_t and wchar_t are standard types defined outside ncurses.
ncurses' header file has fallback definitions for those to handle old systems.
> In the PDCursesMod fork of PDCurses in the docs/MANUAL.md text file there is
> a clear description of the internal storage for characters, pasted below.
> The 32-bit version sacrifices 8 attribute flag bits for a 16-bit character
> storage, while the 64-bit version stores true Unicode characters in 21 bits,
> supports all ncurses attribute bits and has space to support much larger
> numbers of color pairs.
>
> Is there any similar documentation for how ncurses currently stores and
> handles wide characters?
I haven't found it necessary to show the bit layout, since the header file
is commented well enough :-)
> Peter
>
>
> Extract from
> https://github.com/Bill-Gray/PDCursesMod/blob/master/docs/MANUAL.md:
>
> Text Attributes
> ===============
>
> If CHTYPE_32 is #defined, PDCurses uses a 32-bit integer for its chtype:
>
> +--------------------------------------------------------------------+
> |31|30|29|28|27|26|25|24|23|22|21|20|19|18|17|16|15|14|13|..| 2| 1| 0|
> +--------------------------------------------------------------------+
> color pair | modifiers | character eg 'a'
>
> There are 256 color pairs (8 bits), 8 bits for modifiers, and 16 bits
> for character data. The modifiers are bold, underline, right-line,
> left-line, italic, reverse and blink, plus the alternate character set
> indicator.
>
> By default, a 64-bit chtype is used:
...which doesn't allow for combining-characters :-)
> -------------------------------------------------------------------------------
> |63|62|61|60|59|..|34|33|32|31|30|29|28|..|22|21|20|19|18|17|16|..| 3| 2| 1| 0|
> -------------------------------------------------------------------------------
> color number | modifiers | character eg 'a'
>
> We take five more bits for the character (thus allowing Unicode values
> past 64K; the full range of Unicode goes up to 0x10ffff, requiring 21 bits
> total), and four more bits for attributes. Three are currently used as
> A_OVERLINE, A_DIM, and A_STRIKEOUT; one more is reserved for future use.
> On some platforms, bits 33-40 are used to select a color pair (can run from
> 0 to 255). Bits 41 and 42 have been added to this to get 1024 color pairs.
> On some platforms (as of 2020 May 17, WinGUI and VT), bits 33-52 are used,
> allowing 2^20 = 1048576 color pairs. That should be enough for anybody, and
> leaves twelve bits for other uses.
Go take a look at the header file, and see where the bits go.
Changing that would require an ABI change, which is far more work
than you might imagine, with little benefit.
--
Thomas E. Dickey <dickey@invisible-island.net>
https://invisible-island.net
ftp://ftp.invisible-island.net