bug-ncurses

Re: Handling ACS_DARROW and friends


From: Bryan Christ
Subject: Re: Handling ACS_DARROW and friends
Date: Mon, 1 Jan 2018 20:15:14 -0600

Since the terminfo entry for rxvt defines smacs and rmacs, libvterm tries to handle these by toggling a flag in its state machine.  This flag determines whether a character c is stored in the chtype as-is or as NCURSES_ACS(c).  When I look at a hexdump from Finch (the program I'm debugging with), the sequence is clear: 0F E28693 {some other stuff} 0E.  The relatively primitive code inside a switch statement that handles this UTF-8 sequence is dead simple:

case 0x00E28693:    { *utf8_char = ACS_DARROW;          break;}
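
For reference, the three bytes E2 86 93 in that case label are the UTF-8 encoding of U+2193 (downwards arrow).  A minimal sketch of the decoding, assuming a well-formed 3-byte sequence; decode_utf8_3 is a hypothetical helper and not part of libvterm or ncurses:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one 3-byte UTF-8 sequence (U+0800..U+FFFF) into a code point.
 * Returns 0 when the bytes are not a well-formed 3-byte sequence.
 * Hypothetical helper for illustration only. */
static uint32_t decode_utf8_3(const unsigned char *b)
{
    assert(b != NULL);

    if ((b[0] & 0xF0) != 0xE0)      /* lead byte must be 1110xxxx */
        return 0;
    if ((b[1] & 0xC0) != 0x80)      /* continuation bytes must be 10xxxxxx */
        return 0;
    if ((b[2] & 0xC0) != 0x80)
        return 0;

    uint32_t cp = ((uint32_t)(b[0] & 0x0F) << 12)
                | ((uint32_t)(b[1] & 0x3F) << 6)
                |  (uint32_t)(b[2] & 0x3F);

    if (cp < 0x800)                     /* reject overlong encodings */
        return 0;
    if (cp >= 0xD800 && cp <= 0xDFFF)   /* reject UTF-16 surrogates */
        return 0;
    return cp;
}
```

Decoding to a code point first would let the switch compare against 0x2193 instead of the packed byte value 0x00E28693.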

So with your explanation, I'm thinking that what's happening is that I expect chtype to be getting stored as NCURSES_ACS('.') but instead it's getting stored as NCURSES_ACS('v').  In other words, when running on the Linux console, I'm getting column 1, but running in other terms, I'm getting column 2.  So, would I be playing a game of whack-a-mole to just code in NCURSES_ACS('.') in this situation?

On Mon, Jan 1, 2018 at 5:57 PM, Thomas Dickey <address@hidden> wrote:
On Mon, Jan 01, 2018 at 03:20:35PM -0600, Bryan Christ wrote:
> I am the author of libvterm (formerly libRote) which is a terminal emulator
> designed specifically to output to a ncurses WINDOW.  The library attempts
> to mimic RXVT and sets $TERM accordingly.  I recently added some minimalist
> UTF-8 code which parses common UTF-8 encodings and writes ACS chars where
> it makes sense.
>
> Depending on the underlying terminal, sometimes ACS_DARROW gets rendered as
> ACS_BTEE.  I think this is somehow related to the notes on
> NCURSES_NO_UTF8_ACS since the problem doesn't occur when the terminal is
> Linux and the behavior is slightly different under Screen.  On all other terms I
> tested, the output for ACS_DARROW is a bottom-tee (ACS_BTEE).  Also, I find
> it coincidental that "v" is the non-acs equivalent for ACS_DARROW but also
> happens to be the ACS char for ACS_BTEE.
>
> I have a thought or two on solving this, but I would prefer some expert
> advice as to how best to handle this problem.  The output under Linux term
> is great and I would like to have consistency on other terms.
>
> Using ncurses 6 as packaged with Ubuntu 16.04.3 LTS

hmm - there's several issues

down-arrow is part of this set of (non-VT100) characters:

/* Teletype 5410v1 symbols begin here */
#define ACS_LARROW      NCURSES_ACS(',') /* arrow pointing left */
#define ACS_RARROW      NCURSES_ACS('+') /* arrow pointing right */
#define ACS_DARROW      NCURSES_ACS('.') /* arrow pointing down */
#define ACS_UARROW      NCURSES_ACS('-') /* arrow pointing up */
#define ACS_BOARD       NCURSES_ACS('h') /* board of squares */
#define ACS_LANTERN     NCURSES_ACS('i') /* lantern symbol */
#define ACS_BLOCK       NCURSES_ACS('0') /* solid square block */

Some terminals (Linux console is one) have a mapping using smacs/rmacs/enacs
for those characters.  But others do not.  In ncurses, I check the locale
encoding (preferring smacs/rmacs because it's more efficient), and keep in
mind terminals (such as Linux console...) which ignore smacs/rmacs in UTF-8
mode.

GNU screen ignores it also.  I added NCURSES_NO_UTF8_ACS when I got to
PuTTY, and then the "u8" feature so that (for a properly written terminal
description), ncurses would just work.  Works for me, but of course
PuTTY's developer insists on setting TERM=xterm, making for interesting
bug reports.

Back to smacs/rmacs: early on, Linux console didn't have those, but simply
mapped to codes in 128-255.  In UTF-8 mode, of course, those didn't work.
Later someone added a VT100-style smacs/rmacs, which initially was buggy,
and took several years to gain adoption.  With Ubuntu 16.x, you've got the
result of that process.

When ncurses decides that the terminal doesn't support the escape sequences
for smacs/rmacs, and it can use UTF-8 (according to the locale setting),
it uses Unicode:

        /* Teletype 5410v1 symbols */
        { ',',  { '<',  0x2190 }},      /* arrow pointing left */
        { '+',  { '>',  0x2192 }},      /* arrow pointing right */
        { '.',  { 'v',  0x2193 }},      /* arrow pointing down */
        { '-',  { '^',  0x2191 }},      /* arrow pointing up */
        { 'h',  { '#',  0x2592 }},      /* board of squares */
        { 'i',  { '#',  0x2603 }},      /* lantern symbol */
        { '0',  { '#',  0x25ae }},      /* solid square block */

The second column of that table is the closest ASCII equivalent to
the glyph, which is the head of the arrow, and happens to correspond
to the "ncurses" library table (which is from the "ncursesw" wide-character
table).

If you're using "ncurses" rather than "ncursesw", you'll get the
second column anyway, and if you're using a locale encoding which
isn't UTF-8, you'll get that as well.
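
A minimal sketch of that column selection, written against the table quoted above.  This is not the ncurses source: the struct layout, the function name acs_fallback_code and the two boolean parameters are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One row of the quoted table: ACS key, ASCII fallback, Unicode value. */
struct acs_fallback {
    char key;
    char ascii;
    unsigned unicode;
};

/* Teletype 5410v1 symbols, values copied from the table above. */
static const struct acs_fallback teletype_5410v1[] = {
    { ',', '<', 0x2190 },   /* arrow pointing left */
    { '+', '>', 0x2192 },   /* arrow pointing right */
    { '.', 'v', 0x2193 },   /* arrow pointing down */
    { '-', '^', 0x2191 },   /* arrow pointing up */
    { 'h', '#', 0x2592 },   /* board of squares */
    { 'i', '#', 0x2603 },   /* lantern symbol */
    { '0', '#', 0x25ae },   /* solid square block */
};

static_assert(sizeof teletype_5410v1 / sizeof teletype_5410v1[0] == 7,
              "seven Teletype 5410v1 symbols");

/* Pick the Unicode column only when both the wide-character library and a
 * UTF-8 locale are in use; otherwise fall back to the ASCII column.
 * Returns 0 when the key is not in this table. */
static unsigned acs_fallback_code(char key, bool wide_library, bool utf8_locale)
{
    size_t n = sizeof teletype_5410v1 / sizeof teletype_5410v1[0];

    for (size_t i = 0; i < n; ++i) {
        if (teletype_5410v1[i].key == key) {
            return (wide_library && utf8_locale)
                 ? teletype_5410v1[i].unicode
                 : (unsigned)(unsigned char)teletype_5410v1[i].ascii;
        }
    }
    return 0;
}
```

With this model, the key '.' yields 0x2193 only when both flags are true, and 'v' in every other combination.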

Now... ncurses doesn't store the Unicode value in the cchar_t (or chtype).
It stores the A_ALTCHARSET flag and the input to the mapping ('.' for
down-arrow), and constructs the Unicode as needed when writing to the
screen, in PutAttrChar().
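
A simplified model of that scheme, to show why deferring the translation is safe to repeat.  The struct cell type and the helpers below are hypothetical and are not how cchar_t or PutAttrChar() are actually implemented; the sketch assumes a UTF-8 terminal, ASCII keys, and handles only the four arrows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stored cell: the ACS key plus a flag, never the mapped glyph. */
struct cell {
    bool altcharset;    /* stands in for A_ALTCHARSET */
    char ch;            /* mapping input, e.g. '.' for down-arrow */
};

/* Unicode value for an arrow key, or 0 when the key is not an arrow. */
static unsigned arrow_code_point(char key)
{
    switch (key) {
    case ',': return 0x2190;    /* left */
    case '+': return 0x2192;    /* right */
    case '.': return 0x2193;    /* down */
    case '-': return 0x2191;    /* up */
    default:  return 0;
    }
}

/* Encode a code point below U+10000 as UTF-8; returns the byte count. */
static size_t encode_utf8_bmp(unsigned cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    out[0] = (unsigned char)(0xE0 | (cp >> 12));
    out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (cp & 0x3F));
    return 3;
}

/* Translate at output time only.  The stored cell is passed by value and
 * never modified, so rendering it again gives the same bytes. */
static size_t render_cell(struct cell c, unsigned char out[3])
{
    assert(out != NULL);

    unsigned cp = c.altcharset ? arrow_code_point(c.ch) : 0;
    if (cp == 0)
        cp = (unsigned char)c.ch;   /* plain character or unknown key */
    return encode_utf8_bmp(cp, out);
}
```

Rendering a cell of { true, '.' } produces the bytes E2 86 93 each time it is drawn, while { false, '.' } produces a literal period.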

Quite a while ago (2004...), ncurses stored the mapped character,
but this proved to be a problem with repeated translation.  I changed
that to the current scheme to eliminate the problem:

20040320
        + modify PutAttrChar() and PUTC() macro to improve use of
          A_ALTCHARSET attribute to prevent line-drawing characters from
          being lost in situations where the locale would otherwise treat the
          raw data as nonprintable (Debian #227879).

So... perhaps you're doing the map-lookup which is feeding the non-Unicode
'v' to ncurses and it's doing its own acs_map lookup to produce the effect
you described.
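
A small self-contained model of that double lookup, assuming the first lookup yields the ASCII fallback and the second treats its result as an ACS key.  The function names are hypothetical; the only facts relied on are that '.' falls back to 'v', and that 'v' is itself the ACS key for the bottom tee (U+2534).

```c
#include <assert.h>

/* First lookup: ACS key to its closest ASCII equivalent (arrows only). */
static char acs_ascii_fallback(char key)
{
    switch (key) {
    case ',': return '<';
    case '+': return '>';
    case '.': return 'v';
    case '-': return '^';
    default:  return key;
    }
}

/* Second lookup: ACS key to the Unicode glyph drawn for it.
 * Only the two keys needed for the demonstration are listed. */
static unsigned acs_unicode(char key)
{
    switch (key) {
    case '.': return 0x2193;    /* down-arrow */
    case 'v': return 0x2534;    /* bottom tee (ACS_BTEE) */
    default:  return 0;
    }
}

/* Correct path: the key goes through a single lookup. */
static unsigned translate_once(char key)
{
    assert(key > 0);    /* ACS keys are 7-bit ASCII */
    return acs_unicode(key);
}

/* Suspected bug: the fallback 'v' is fed back in as if it were an ACS key. */
static unsigned translate_twice(char key)
{
    assert(key > 0);
    return acs_unicode(acs_ascii_fallback(key));
}
```

Here translate_once('.') gives 0x2193, while translate_twice('.') gives 0x2534, which matches the down-arrow turning into a bottom tee.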

--
Thomas E. Dickey <address@hidden>
https://invisible-island.net
ftp://ftp.invisible-island.net



--
Bryan
<><
