Re: Re: Putting international characters into output files

On Mon, 6 Aug 2018, 23:36 "Markus Mützel", <address@hidden> wrote:

Despite what the name might suggest, char can hold any byte array, not only strings.
Octave uses UTF-8 as its internal encoding for strings. A single byte with a value of 128 or above is not a valid UTF-8 sequence on its own. You would have to use the corresponding two-byte sequence to see what you might expect. That's why you are seeing the replacement character.
Your editor probably uses a different codepage to display the file. That's why you are seeing the £ sign among others.
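
As a rough illustration of the point about single bytes versus two-byte sequences, here is a minimal sketch. The pound sign is only an assumed example character: it is byte 163 in Windows-1252 and the byte pair 194 163 in UTF-8. What actually appears on screen depends on your terminal or editor, so the sketch only inspects the stored bytes.

```octave
% Byte 163 is the pound sign in Windows-1252, but a lone byte 163
% is not a valid UTF-8 sequence.
single_byte = char (163);

% In UTF-8 the pound sign (U+00A3) is the two-byte sequence 194 163.
two_bytes = char ([194 163]);

% char stores bytes unchanged, so the numeric values survive as-is.
disp (double (single_byte))
disp (double (two_bytes))

% A char array counts bytes, not characters, so this holds 2 elements.
disp (numel (two_bytes))
```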

If necessary, you could use native2unicode to convert from any codepage (e.g. Windows-1252) to UTF-8, and unicode2native to convert back.
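
A minimal sketch of that round trip, assuming the data came from a Windows-1252 file and that your Octave build recognises the codepage name "windows-1252" (the byte values below are made up for illustration):

```octave
% Three bytes as they might be read from a Windows-1252 file:
% 163 is the pound sign, 49 and 48 are the ASCII digits "1" and "0".
cp1252_bytes = uint8 ([163 49 48]);

% Convert the codepage bytes to Octave's internal UTF-8 char array.
utf8_str = native2unicode (cp1252_bytes, "windows-1252");

% Inspect the UTF-8 bytes that Octave now holds.
disp (double (utf8_str))

% Convert back to the original codepage before writing the file.
round_trip = unicode2native (utf8_str, "windows-1252");
disp (isequal (round_trip, cp1252_bytes))
```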

But if you don't care about what Octave is displaying, it might be safe not to mess with the encoding at all, as long as all of your operations are on ASCII characters only (or you know how to treat bytes with values above 127).

I don't think Octave is doing anything wrong here.

Hi Markus,

Thanks for the explanation - really helpful. Based on that, I agree Octave is acting correctly when executing my function. I'm less convinced it is acting correctly when it ignores non-ASCII characters typed into the console, though, as I feel it should be possible to deduce the correct UTF-8 code even if it then displays as unknown. However, it is not a problem for me now that I understand what is going on.

Cheers... Ian

From:	Ian McCallion
Subject:	Re: Re: Putting international characters into output files
Date:	Tue, 7 Aug 2018 19:55:43 +0200