Re: Surrogate pairs for addwstr?
From: Tim Allen
Subject: Re: Surrogate pairs for addwstr?
Date: Sun, 10 Oct 2021 10:04:31 +1100
On Sat, Oct 09, 2021 at 01:41:57PM -0400, Bill Gray wrote:
> I tried feeding Unicode surrogate pairs to ncurses, and
> nothing shows up (two blank characters are shown in xterm where
> one combined SMP character ought to.) Test code is shown below.
> When run with ncurses, the second treble shows up; the first one
> doesn't.
Surrogate pairs only combine to create a single character in UTF-16
encoded data, or on platforms (Windows, Java, JavaScript, macOS Cocoa)
that use UTF-16 as an internal representation. Code points in the
surrogate range (U+D800 to U+DFFF) are not allowed to appear in
un-encoded Unicode data, so if they show up, at best they'll be ignored,
but they might show up as blanks or as U+FFFD � REPLACEMENT CHARACTER.
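If the data starts out as UTF-16, one option is to combine each surrogate pair into a single code point before handing it to a wide-character API. The following is only a sketch: the helper name utf16_to_code_points is made up for illustration and is not part of ncurses or any other library. It also assumes the caller supplies an output buffer with room for at least as many entries as there are input units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define REPLACEMENT_CHAR 0xFFFDu  /* U+FFFD REPLACEMENT CHARACTER */

/* Hypothetical helper, not a library function: decode n UTF-16 code
 * units from `in` into Unicode code points in `out`.
 * `out` must have room for n entries. Returns the number of code
 * points written. Unpaired surrogates are replaced with U+FFFD. */
size_t utf16_to_code_points(const uint16_t *in, size_t n, uint32_t *out)
{
    size_t written = 0;

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        uint32_t unit = in[i];

        if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < n
                && in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            /* High surrogate followed by low surrogate: each carries
             * 10 bits of (code point - 0x10000). */
            uint32_t low = in[++i];
            out[written++] = 0x10000u
                           + ((unit - 0xD800u) << 10)
                           + (low - 0xDC00u);
        } else if (unit >= 0xD800 && unit <= 0xDFFF) {
            /* Lone surrogate: not a valid code point on its own. */
            out[written++] = REPLACEMENT_CHAR;
        } else {
            /* Ordinary BMP code unit maps to itself. */
            out[written++] = unit;
        }
    }
    return written;
}
```

For example, assuming the treble in question is U+1D11E MUSICAL SYMBOL G CLEF, the pair 0xD834 0xDD1E decodes to the single value 0x1D11E. On a platform where wchar_t is 32 bits wide (glibc, for instance), that value can be stored directly in a wchar_t string.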
ncurses' wide mode might use the locale's encoding (UTF-8, almost
universally) or might just hard-code UTF-8 as the internal
representation, since it's generally the best choice for the kind of
data ncurses handles. The behaviour you describe is within the range of
behaviour I'd expect.
Tim.