octave-maintainers

Unicode support in io Forge package


From: Andrew Janke
Subject: Unicode support in io Forge package
Date: Fri, 18 Oct 2019 22:04:46 -0700
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:60.0) Gecko/20100101 Thunderbird/60.9.0

Hi, Octave and io maintainers,

I'm confused by the Unicode support in the io package, in particular the functions unicode2utf8 and utf82unicode and the "encode_utf" options in some of the ods/xls read/write functions.

What is the encoding that utf82unicode/unicode2utf8 are calling "unicode" here? It looks like it's doing a single-byte encoding, treating each byte as an unsigned int 0-255, and treating those 0-255 values directly as Unicode code point values. That's not any of the standard Unicode encodings. (But I think it is exactly the same as Latin-1/ISO 8859-1.)
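To make the behavior described above concrete, here is a minimal Python sketch of a conversion that treats each byte value 0-255 directly as a Unicode code point. The function names are hypothetical and this is not the io package's code; it only models the behavior as it is described in this message, and shows that such a mapping coincides with Latin-1/ISO 8859-1.

```python
def single_byte_to_utf8(raw: bytes) -> bytes:
    """Hypothetical model of the conversion as described in this post:
    treat each byte value 0-255 as a Unicode code point, then encode
    the resulting text as UTF-8."""
    text = "".join(chr(b) for b in raw)
    return text.encode("utf-8")


def utf8_to_single_byte(utf8: bytes) -> bytes:
    """Hypothetical inverse: decode UTF-8, then store each code point
    as one byte. Raises ValueError for code points above 255, which
    have no single-byte representation in this scheme."""
    text = utf8.decode("utf-8")
    return bytes(ord(ch) for ch in text)


# "café" as single bytes: 0xE9 is U+00E9 (e with acute accent).
raw = bytes([0x63, 0x61, 0x66, 0xE9])

# The byte-as-code-point mapping gives the same result as Latin-1 decoding.
assert single_byte_to_utf8(raw) == raw.decode("latin-1").encode("utf-8")

# The round trip restores the original bytes.
assert utf8_to_single_byte(single_byte_to_utf8(raw)) == raw
```

Under this model, any character outside U+0000 to U+00FF cannot be represented on the single-byte side at all, which is why the scheme is not equivalent to any of the standard Unicode encodings.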

As I understand it, since about Octave 4.4, Octave's internal encoding (that is, how it interprets Octave char arrays) is either UTF-8 or an opaque array of bytes; it's never in the "system code page" or some other locale-specific encoding.

Is this UTF-8 support in io still relevant/correct? Maybe it should be deprecated or renamed/removed? Since Octave now supports UTF-8, I think you'd want to just leave UTF-8 text as is in all cases.
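One possible reason to leave UTF-8 text as is can be sketched in Python. The helper name below is hypothetical, and the sketch assumes the conversion behaves like a Latin-1 to UTF-8 transcoding, as this message suspects; it does not call or reproduce the io package's functions. If text that is already UTF-8 is run through such a conversion, every non-ASCII character gets encoded a second time.

```python
def reencode_as_if_latin1(data: bytes) -> bytes:
    """Hypothetical helper: interpret every byte as a Latin-1 character
    and encode the result as UTF-8, regardless of what the bytes
    actually contain."""
    return data.decode("latin-1").encode("utf-8")


# Text that is already UTF-8, as in a modern Octave char array.
already_utf8 = "caf\u00e9".encode("utf-8")    # bytes 63 61 66 C3 A9

# Converting again turns the two bytes of the accented letter into
# four bytes, which display as two unrelated characters.
double_encoded = reencode_as_if_latin1(already_utf8)
assert double_encoded == b"caf\xc3\x83\xc2\xa9"
assert double_encoded.decode("utf-8") == "caf\u00c3\u00a9"

# Pure ASCII input passes through unchanged, which can hide the problem.
assert reencode_as_if_latin1(b"plain ascii") == b"plain ascii"
```

Because ASCII-only data is unaffected, a double conversion of this kind would only show up once a spreadsheet contains accented or other non-ASCII characters.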

Cheers,
Andrew


