octave-maintainers

Re: char type in Octave


From: mmuetzel
Subject: Re: char type in Octave
Date: Sun, 27 May 2018 10:17:26 -0700 (MST)

> As a programmer I would be very surprised--even upset--if I called a
> function like toupper with a 5-byte string, and it came back as a 10-byte
> string.

Unfortunately, for the "toupper" and "tolower" functions we don't have much
choice but to use what Unicode has defined. Consider e.g. the uppercase
character U+0130 "İ", which is represented in UTF-8 by two bytes (C4 B0). Its
lowercase version is U+0069 "i", which is only one byte long in UTF-8. (The
same holds in reverse for lowercase U+0131 "ı" and its uppercase U+0049 "I".)
If such a case occurs (the size of the result wouldn't match the size of the
input), I chose to emit a warning and fall back to the non-Unicode-aware
standard library functions (see bug #53873).
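
To illustrate the size mismatch and the fallback idea outside of Octave, here
is a rough Python sketch. It is not the Octave implementation: the function
name "toupper_same_size" is made up for this example, the input is assumed to
be valid UTF-8, and Python's bytes.upper() (which only touches ASCII letters)
stands in for the non-Unicode-aware standard library functions.

```python
import warnings


def toupper_same_size(data: bytes) -> bytes:
    """Uppercase UTF-8 encoded bytes without changing the byte count.

    Hypothetical helper for illustration only; assumes valid UTF-8 input.
    """
    # Unicode-aware conversion: decode, convert case, encode again.
    converted = data.decode("utf-8").upper().encode("utf-8")
    if len(converted) != len(data):
        warnings.warn(
            "case conversion would change the byte count; "
            "falling back to ASCII-only conversion"
        )
        # bytes.upper() only maps ASCII a-z, so the length is preserved.
        return data.upper()
    return converted


# U+0130 takes two bytes in UTF-8, U+0069 takes one.
print("\u0130".encode("utf-8").hex(), "\u0069".encode("utf-8").hex())

# Lowercase U+0131 (two bytes) has the one-byte uppercase U+0049.
dotless_i = "\u0131"
print(dotless_i.encode("utf-8").hex(), dotless_i.upper().encode("utf-8").hex())

# Same size before and after: the Unicode-aware result is used.
print(toupper_same_size("\u00e4b".encode("utf-8")).hex())

# Size would shrink from 2 bytes to 1: warn and leave the bytes alone.
print(toupper_same_size(dotless_i.encode("utf-8")).hex())
```

With this approach the caller always gets back exactly as many bytes as were
passed in, at the cost of leaving a few characters unconverted.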

For "islower" and "isupper" (and other similar functions), I'll try to stick
to that Principle of Least Surprise.

Markus






