Re: char type in Octave

octave-maintainers

From:	mmuetzel
Subject:	Re: char type in Octave
Date:	Sun, 27 May 2018 10:17:26 -0700 (MST)

> As a programmer I would be very surprised--even upset--if I called a
> function like toupper with a 5-byte string, and it came back as a 10-byte
> string.

Unfortunately, for the "toupper" and "tolower" functions we don't have much
choice but to use what Unicode has defined. Consider, e.g., the uppercase
character U+0130 "İ", which is represented in UTF-8 by two bytes (C4 B0). Its
lowercase version is U+0069 "i", which is only one byte long in UTF-8. (The
same holds in reverse for lowercase U+0131 "ı" and its uppercase U+0049 "I".)
If such a case occurs (the size of the result wouldn't match the size of the
input), I chose to emit a warning and fall back to the non-Unicode-aware
standard library functions (see bug #53873).
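
To illustrate the size mismatch and the fallback described above, here is a
minimal Python sketch. It is not Octave's actual implementation: the function
name toupper_same_size is hypothetical, Python's str.upper stands in for the
Unicode-aware conversion, and bytes.upper (which only touches ASCII a-z)
stands in for the non-Unicode-aware standard library function.

```python
import warnings


def toupper_same_size(data: bytes) -> bytes:
    """Uppercase UTF-8 bytes, but never change the byte count.

    Hypothetical helper for illustration only.
    """
    # Unicode-aware conversion: decode, map case, encode again.
    converted = data.decode("utf-8").upper().encode("utf-8")
    if len(converted) != len(data):
        # The result would not fit the original size, so warn and fall
        # back to an ASCII-only conversion that leaves other bytes alone.
        warnings.warn("case conversion changed the byte length; "
                      "falling back to ASCII-only conversion")
        return data.upper()
    return converted


if __name__ == "__main__":
    dotless_i = "\u0131"  # lowercase dotless i, two bytes in UTF-8
    print(len(dotless_i.encode("utf-8")))          # 2
    print(len(dotless_i.upper().encode("utf-8")))  # 1, uppercase is "I"
    print(toupper_same_size("abc".encode("utf-8")))
    print(toupper_same_size((dotless_i + "x").encode("utf-8")))
```

In the last call the Unicode-aware result would be one byte shorter than the
input, so the sketch warns and only the ASCII "x" is converted.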

For "islower" and "isupper" (and other similar functions), I'll try to stick
to that Principle of Least Surprise.

Markus




