Re: Handling ACS_DARROW and friends

On Mon, Jan 1, 2018 at 5:57 PM, Thomas Dickey <address@hidden> wrote:

On Mon, Jan 01, 2018 at 03:20:35PM -0600, Bryan Christ wrote:
> I am the author of libvterm (formerly libRote) which is a terminal emulator
> designed specifically to output to a ncurses WINDOW. The library attempts
> to mimic RXVT and sets $TERM accordingly. I recently added some minimalist
> UTF-8 code which parses common UTF-8 encodings and writes ACS chars where
> it makes sense.
>
> Depending on the underlying terminal, sometimes ACS_DARROW gets rendered as
> ACS_BTEE. I think this is somehow related to the notes on
> NCURSES_NO_UTF8_ACS since the problem doesn't occur when the terminal is
> Linux, and the behavior is slightly different under Screen. On all other
> terms I tested, the output for ACS_DARROW is a bottom-tee (ACS_BTEE). Also,
> I find it coincidental that "v" is the non-ACS equivalent for ACS_DARROW
> but also happens to be the ACS char for ACS_BTEE.
>
> I have a thought or two on solving this, but I would prefer some expert
> advice as to how best to handle this problem. The output under Linux term
> is great and I would like to have consistency on other terms.
>
> Using ncurses 6 as packaged with Ubuntu 16.04.3 LTS

hmm - there are several issues

down-arrow is part of this set of (non-VT100) characters:

/* Teletype 5410v1 symbols begin here */
#define ACS_LARROW NCURSES_ACS(',') /* arrow pointing left */
#define ACS_RARROW NCURSES_ACS('+') /* arrow pointing right */
#define ACS_DARROW NCURSES_ACS('.') /* arrow pointing down */
#define ACS_UARROW NCURSES_ACS('-') /* arrow pointing up */
#define ACS_BOARD NCURSES_ACS('h') /* board of squares */
#define ACS_LANTERN NCURSES_ACS('i') /* lantern symbol */
#define ACS_BLOCK NCURSES_ACS('0') /* solid square block */

Some terminals (Linux console is one) have a mapping using smacs/rmacs/enacs
for those characters, but others do not. In ncurses, I check the locale
encoding, prefer smacs/rmacs because it's more efficient, and keep in
mind terminals (such as Linux console...) which ignore smacs/rmacs in UTF-8
mode.

GNU screen ignores it also. I added NCURSES_NO_UTF8_ACS when I got to
PuTTY, and then the "u8" feature so that (for a properly written terminal
description), ncurses would just work. Works for me, but of course
PuTTY's developer insists on setting TERM=xterm, making for interesting
bug reports.
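
As a rough illustration of the two inputs mentioned above (the locale
encoding and the NCURSES_NO_UTF8_ACS variable), an application could
inspect them itself. This is only a sketch: the helper names are made up,
the codeset spellings accepted are an assumption, and treating a nonzero
value as "enabled" is an assumption about how the variable is read. It
does not reproduce what ncurses does internally.

```c
#include <langinfo.h>
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: does a codeset name, as returned by
 * nl_langinfo(CODESET), denote UTF-8?  The two spellings checked
 * here are an assumption; other systems may spell it differently. */
int sketch_codeset_is_utf8(const char *codeset)
{
    return codeset != NULL
        && (strcmp(codeset, "UTF-8") == 0 || strcmp(codeset, "utf8") == 0);
}

/* Hypothetical helper: treat a nonzero NCURSES_NO_UTF8_ACS value as
 * "use Unicode for line-drawing" (assumed interpretation). */
int sketch_no_utf8_acs_requested(const char *value)
{
    return value != NULL && atoi(value) != 0;
}

/* Collect both facts for the current process environment. */
void sketch_report_acs_environment(int *utf8_locale, int *no_utf8_acs)
{
    setlocale(LC_ALL, "");
    *utf8_locale = sketch_codeset_is_utf8(nl_langinfo(CODESET));
    *no_utf8_acs =
        sketch_no_utf8_acs_requested(getenv("NCURSES_NO_UTF8_ACS"));
}
```

Printing the two flags from a small test program on each terminal where
the glyphs differ may help narrow down which case applies.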

Back to smacs/rmacs: early on, Linux console didn't have those, but simply
mapped to codes in 128-255. In UTF-8 mode, of course, those didn't work.
Later someone added a VT100-style smacs/rmacs, which initially was buggy,
and took several years to gain adoption. With Ubuntu 16.x, you've got the
result of that process.

When ncurses decides that the terminal doesn't support the escape sequences
for smacs/rmacs, and it can use UTF-8 (according to the locale setting),
it uses Unicode:

/* Teletype 5410v1 symbols */
{ ',', { '<', 0x2190 }}, /* arrow pointing left */
{ '+', { '>', 0x2192 }}, /* arrow pointing right */
{ '.', { 'v', 0x2193 }}, /* arrow pointing down */
{ '-', { '^', 0x2191 }}, /* arrow pointing up */
{ 'h', { '#', 0x2592 }}, /* board of squares */
{ 'i', { '#', 0x2603 }}, /* lantern symbol */
{ '0', { '#', 0x25ae }}, /* solid square block */

The second column of that table is the closest ASCII equivalent to
the glyph (for the arrows, the head of the arrow). It happens to match
what the "ncurses" library uses; the table itself is from the "ncursesw"
wide-character library.

If you're using "ncurses" rather than "ncursesw", you'll get the
second column anyway, and if you're using a locale encoding which
isn't UTF-8, you'll get that as well.
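
The fallback rule in the last two paragraphs can be sketched as a small
lookup. The table rows are copied from the excerpt above; the struct, the
function name and its two flag parameters are hypothetical, and the
smacs/rmacs path is left out entirely. It is an illustration of the rule,
not the library's code.

```c
#include <stddef.h>

/* One row of the table: ACS key, ASCII fallback, Unicode code point. */
struct sketch_acs_row {
    char key;
    char ascii;
    unsigned unicode;
};

/* Rows copied from the Teletype 5410v1 excerpt above. */
static const struct sketch_acs_row sketch_teletype_rows[] = {
    { ',', '<', 0x2190 },   /* arrow pointing left */
    { '+', '>', 0x2192 },   /* arrow pointing right */
    { '.', 'v', 0x2193 },   /* arrow pointing down */
    { '-', '^', 0x2191 },   /* arrow pointing up */
    { 'h', '#', 0x2592 },   /* board of squares */
    { 'i', '#', 0x2603 },   /* lantern symbol */
    { '0', '#', 0x25ae },   /* solid square block */
};

/* Hypothetical helper: return the Unicode value only when both the
 * wide-character library and a UTF-8 locale are in use; otherwise
 * return the ASCII column.  Returns 0 for a key not in the table. */
unsigned sketch_pick_glyph(char key, int wide_library, int utf8_locale)
{
    size_t n = sizeof(sketch_teletype_rows) / sizeof(sketch_teletype_rows[0]);
    size_t i;

    for (i = 0; i < n; i++) {
        if (sketch_teletype_rows[i].key == key) {
            if (wide_library && utf8_locale)
                return sketch_teletype_rows[i].unicode;
            return (unsigned char) sketch_teletype_rows[i].ascii;
        }
    }
    return 0;
}
```

With this sketch, the key '.' yields 0x2193 only when both flags are set,
and 'v' in every other combination.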

Now... ncurses doesn't store the Unicode value in the cchar_t (or chtype).
It stores the A_ALTCHARSET flag and the input to the mapping ('.' for
down-arrow), and constructs the Unicode as needed when writing to the
screen, in PutAttrChar().
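
A minimal model of that storage scheme, for illustration only: the cell
type, the flag value and the function names below are invented and are
not ncurses' cchar_t, A_ALTCHARSET or PutAttrChar(). The point is that
the stored cell keeps the mapping input, so rendering it any number of
times gives the same result.

```c
/* Stand-in for A_ALTCHARSET; the value is illustrative only. */
#define SKETCH_ALTCHARSET 0x1u

/* Hypothetical screen cell: the stored character plus attribute bits. */
struct sketch_cell {
    unsigned char ch;
    unsigned attrs;
};

/* Store the mapping input (e.g. '.' for down-arrow), not the glyph. */
struct sketch_cell sketch_store_acs(char key)
{
    struct sketch_cell cell;
    cell.ch = (unsigned char) key;
    cell.attrs = SKETCH_ALTCHARSET;
    return cell;
}

/* Translate only at output time; the cell itself is never modified. */
unsigned sketch_render(struct sketch_cell cell, int utf8_output)
{
    if ((cell.attrs & SKETCH_ALTCHARSET) == 0)
        return cell.ch;             /* ordinary character */

    switch (cell.ch) {
    case ',': return utf8_output ? 0x2190u : '<';
    case '+': return utf8_output ? 0x2192u : '>';
    case '.': return utf8_output ? 0x2193u : 'v';
    case '-': return utf8_output ? 0x2191u : '^';
    default:  return cell.ch;       /* keys outside this sketch */
    }
}
```

Because the cell still holds '.', redrawing it after a refresh cannot
drift to a different glyph, which is the repeated-translation problem
the stored-glyph approach would have.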

Quite a while ago (2004...), ncurses stored the mapped character,
but this proved to be a problem with repeated translation. I changed
that to the current scheme to eliminate the problem:

20040320
+ modify PutAttrChar() and PUTC() macro to improve use of
A_ALTCHARSET attribute to prevent line-drawing characters from
being lost in situations where the locale would otherwise treat the
raw data as nonprintable (Debian #227879).

So... perhaps you're doing the map lookup yourself, which feeds the
non-Unicode 'v' to ncurses, and ncurses then does its own acs_map lookup
on that 'v' (the key for ACS_BTEE) to produce the effect you described.
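
If that guess is right, the effect can be reproduced with a toy model of
the two lookups. Both functions are hypothetical stand-ins: the first
plays the role of an application-side translation to the ASCII fallback,
the second the role of a later alternate-character-set lookup. Only the
two keys relevant here are modelled.

```c
/* Hypothetical first lookup: ACS key to its ASCII fallback. */
char sketch_ascii_fallback(char key)
{
    switch (key) {
    case ',': return '<';
    case '+': return '>';
    case '.': return 'v';
    case '-': return '^';
    default:  return key;
    }
}

/* Hypothetical second lookup: what an alternate-character-set key
 * ends up displaying, expressed as a Unicode code point. */
unsigned sketch_acs_lookup(char key)
{
    switch (key) {
    case '.': return 0x2193u;   /* down arrow */
    case 'v': return 0x2534u;   /* bottom tee */
    default:  return (unsigned char) key;
    }
}
```

Looking up '.' once gives the down arrow (0x2193). Looking up the result
of sketch_ascii_fallback('.'), which is 'v', gives the bottom tee
(0x2534) instead. In real code the analogous mistake would be passing
'v' together with A_ALTCHARSET rather than ACS_DARROW, though whether
libvterm does this is only a guess.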

--
Thomas E. Dickey <address@hidden>
https://invisible-island.net
ftp://ftp.invisible-island.net

Bryan
<><

From:	Bryan Christ
Subject:	Re: Handling ACS_DARROW and friends
Date:	Mon, 1 Jan 2018 20:15:14 -0600