Question: How does ncurses store and handle "wide" characters?


From: pjfarley3
Subject: Question: How does ncurses store and handle "wide" characters?
Date: Sun, 30 May 2021 14:08:52 -0400

This is just a question, not a bug report.

I've been trying to determine from examining the ncurses header how "wide"
characters are actually stored by ncurses, in pursuit of analyzing
differences between ncurses and PDCurses, because PDCurses is all that is
available on Windows systems unless one is using msys2 or cygwin.

As far as I can tell, the cchar_t type uses a 32-bit int in which the
low-order 8 bits are for the character, the next-higher 8 bits are for the
color pair number and the high-order 16 bits are for attribute flags.  When
I tracked down the wchar_t type on a Linux system (Ubuntu 20.04), it appeared
to be defined as an "int".
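
To make the layout I am describing concrete, here is a minimal sketch in C.
It assumes only the bit positions stated above; the SK_* macro names and the
sk_* functions are made up for illustration and are not taken from the
ncurses headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the description in this post. */
#define SK_CHAR_MASK  UINT32_C(0x000000FF)  /* bits 0-7:   character       */
#define SK_PAIR_MASK  UINT32_C(0x0000FF00)  /* bits 8-15:  color pair      */
#define SK_ATTR_MASK  UINT32_C(0xFFFF0000)  /* bits 16-31: attribute flags */
#define SK_PAIR_SHIFT 8

/* Combine a character, a color pair number and attribute flags. */
uint32_t sk_pack(uint32_t ch, uint32_t pair, uint32_t attrs)
{
    assert(ch <= 0xFF && pair <= 0xFF);
    assert((attrs & ~SK_ATTR_MASK) == 0);
    return ch | (pair << SK_PAIR_SHIFT) | attrs;
}

/* Extract each field again. */
uint32_t sk_char(uint32_t cell)  { return cell & SK_CHAR_MASK; }
uint32_t sk_pair(uint32_t cell)  { return (cell & SK_PAIR_MASK) >> SK_PAIR_SHIFT; }
uint32_t sk_attrs(uint32_t cell) { return cell & SK_ATTR_MASK; }
```

With only 8 bits for the character, a value built this way cannot hold any
code point above 255, which is why I am asking where the wide character
actually lives.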

I do not understand how the wchar_t and cchar_t types are used by ncurses to
support wide characters, or the uses of the wint_t type.

In the PDCursesMod fork of PDCurses in the docs/MANUAL.md text file there is
a clear description of the internal storage for characters, pasted below.
The 32-bit version sacrifices 8 attribute flag bits for a 16-bit character
storage, while the 64-bit version stores true Unicode characters in 21 bits,
supports all ncurses attribute bits and has space to support much larger
numbers of color pairs.

Is there any similar documentation for how ncurses currently stores and
handles wide characters?

Peter


Extract from
https://github.com/Bill-Gray/PDCursesMod/blob/master/docs/MANUAL.md:

Text Attributes
===============

If CHTYPE_32 is #defined,  PDCurses uses a 32-bit integer for its chtype:

    +--------------------------------------------------------------------+
    |31|30|29|28|27|26|25|24|23|22|21|20|19|18|17|16|15|14|13|..| 2| 1| 0|
    +--------------------------------------------------------------------+
          color pair        |     modifiers         |   character eg 'a'

There are 256 color pairs (8 bits), 8 bits for modifiers, and 16 bits
for character data. The modifiers are bold, underline, right-line,
left-line, italic, reverse and blink, plus the alternate character set
indicator.
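
As a rough illustration of the 32-bit layout in the diagram above (16 bits
of character, 8 modifier bits, 8 bits of color pair), the packing could be
sketched as follows. The P32_* names and p32_* functions are hypothetical
and are not the macros PDCurses itself defines.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; field widths follow the 32-bit diagram above. */
#define P32_CHAR_BITS  16                       /* bits 0-15  */
#define P32_MOD_BITS   8                        /* bits 16-23 */
#define P32_MOD_SHIFT  P32_CHAR_BITS
#define P32_PAIR_SHIFT (P32_CHAR_BITS + P32_MOD_BITS)   /* bits 24-31 */

/* Build a 32-bit cell from character, modifier bits and color pair. */
uint32_t p32_pack(uint32_t ch, uint32_t mods, uint32_t pair)
{
    assert(ch <= 0xFFFF && mods <= 0xFF && pair <= 0xFF);
    return ch | (mods << P32_MOD_SHIFT) | (pair << P32_PAIR_SHIFT);
}

/* Extract each field again. */
uint32_t p32_char(uint32_t cell) { return cell & 0xFFFF; }
uint32_t p32_mods(uint32_t cell) { return (cell >> P32_MOD_SHIFT) & 0xFF; }
uint32_t p32_pair(uint32_t cell) { return (cell >> P32_PAIR_SHIFT) & 0xFF; }
```

A 16-bit character field covers U+0000 through U+FFFF, so a character such
as U+20AC fits, while anything past 64K does not.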

   By default,  a 64-bit chtype is used :

    -------------------------------------------------------------------------------
    |63|62|61|60|59|..|34|33|32|31|30|29|28|..|22|21|20|19|18|17|16|..| 3| 2| 1| 0|
    -------------------------------------------------------------------------------
             color number   |        modifiers      |         character eg 'a'

   We take five more bits for the character (thus allowing Unicode values
past 64K;  the full range of Unicode goes up to 0x10ffff,  requiring 21 bits
total),  and four more bits for attributes.  Three are currently used as
A_OVERLINE, A_DIM, and A_STRIKEOUT;  one more is reserved for future use.
On some platforms,  bits 33-40 are used to select a color pair (can run from
0 to 255). Bits 41 and 42 have been added to this to get 1024 color pairs.
On some platforms (as of 2020 May 17,  WinGUI and VT),  bits 33-52 are used,
allowing 2^20 = 1048576 color pairs.  That should be enough for anybody, and
leaves twelve bits for other uses.
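
The 64-bit layout described above can be sketched in the same way. This
sketch assumes a 21-bit character field, 12 modifier bits (the original 8
plus the 4 added ones) and the 20-bit color-pair field starting at bit 33
that the text mentions for some platforms. The P64_* names and p64_*
functions are hypothetical and are not the macros PDCursesMod itself
defines.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; widths are read off the description above. */
#define P64_CHAR_BITS  21                       /* bits 0-20  */
#define P64_MOD_BITS   12                       /* bits 21-32 */
#define P64_PAIR_BITS  20                       /* bits 33-52 (assumed) */
#define P64_MOD_SHIFT  P64_CHAR_BITS
#define P64_PAIR_SHIFT (P64_CHAR_BITS + P64_MOD_BITS)

#define P64_CHAR_MAX ((UINT64_C(1) << P64_CHAR_BITS) - 1)
#define P64_MOD_MAX  ((UINT64_C(1) << P64_MOD_BITS) - 1)
#define P64_PAIR_MAX ((UINT64_C(1) << P64_PAIR_BITS) - 1)

/* Build a 64-bit cell from code point, modifier bits and color pair. */
uint64_t p64_pack(uint32_t ch, uint32_t mods, uint32_t pair)
{
    assert(ch <= P64_CHAR_MAX && mods <= P64_MOD_MAX && pair <= P64_PAIR_MAX);
    return (uint64_t)ch
         | ((uint64_t)mods << P64_MOD_SHIFT)
         | ((uint64_t)pair << P64_PAIR_SHIFT);
}

/* Extract each field again. */
uint32_t p64_char(uint64_t cell) { return (uint32_t)(cell & P64_CHAR_MAX); }
uint32_t p64_mods(uint64_t cell) { return (uint32_t)((cell >> P64_MOD_SHIFT) & P64_MOD_MAX); }
uint32_t p64_pair(uint64_t cell) { return (uint32_t)((cell >> P64_PAIR_SHIFT) & P64_PAIR_MAX); }
```

The largest Unicode code point, 0x10ffff, is below 2^21, so it fits in the
character field without touching the modifier bits.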



