UTF-8 multi-byte characters are not displayed properly on Windows consoles

bug-ncurses


From:	LIU Hao
Subject:	UTF-8 multi-byte characters are not displayed properly on Windows consoles
Date:	Thu, 12 Jan 2023 15:30:20 +0800
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2

Hello folks,

I'm a mingw-w64 developer and MSYS2 contributor, and I maintain a GNU nano port to Windows [1]. First of all, thank you for the great work!

Since Windows 10, the Windows console has gained UTF-8 support, which, however, has to be enabled explicitly in the system control panel. After UTF-8 support has been enabled and the UTF-8 code page has been set up with the `chcp 65001` command, all standard C ctype functions can work on UTF-8 strings.

However, when GNU nano attempts to display a UTF-8 string, it is taken bytewise and becomes gibberish. I have created this testcase, for example:


   ```
   #include <ncursesw/ncurses.h>

   int
   main(void)
     {
       initscr();
       addstr("»·");  // hex: C2 BB C2 B7
       refresh();
       getch();
     }
   ```

The commented string literal contains two characters as four bytes. On Linux it is displayed properly, but on a Windows UTF-8 console I get `Â»Â·`. How should I fix it?
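For illustration, the output `Â»Â·` is what one would expect if each of the four bytes were rendered as a separate single-byte character. The following is a minimal sketch of that effect, independent of ncurses. It assumes a Latin-1-like interpretation of each byte; which code page the console actually applied is not established here. The helper names `utf8_char_count` and `bytewise_to_utf8` are hypothetical and not part of any library.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count UTF-8 characters by skipping
   continuation bytes, which have the bit pattern 10xxxxxx. */
static size_t utf8_char_count(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s)
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    return n;
}

/* Hypothetical helper: treat every input byte as its own Latin-1
   character (an assumption) and re-encode it as UTF-8, mimicking a
   bytewise display. Returns the number of bytes written, excluding
   the terminator; returns 0 if the buffer is too small or the input
   is empty. */
static size_t bytewise_to_utf8(const char *in, char *out, size_t outsz)
{
    size_t n = 0;
    assert(in != NULL && out != NULL);
    if (outsz == 0)
        return 0;
    for (; *in != '\0'; ++in) {
        unsigned char b = (unsigned char)*in;
        if (n + 3 > outsz)      /* room for up to 2 bytes plus '\0' */
            return 0;
        if (b < 0x80) {
            out[n++] = (char)b;                    /* ASCII stays as-is */
        } else {
            out[n++] = (char)(0xC0 | (b >> 6));    /* lead byte */
            out[n++] = (char)(0x80 | (b & 0x3F));  /* continuation byte */
        }
    }
    out[n] = '\0';
    return n;
}
```

Passing the testcase bytes `"\xC2\xBB\xC2\xB7"` to `utf8_char_count` gives 2, while `bytewise_to_utf8` turns them into the eight bytes C3 82 C2 BB C3 82 C2 B7, which a UTF-8 terminal shows as `Â»Â·`.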



[1] https://github.com/lhmouse/nano-win


--
Best regards,
LIU Hao

