Aw: Re: Putting international characters into output files


From: Markus Mützel
Subject: Aw: Re: Putting international characters into output files
Date: Mon, 6 Aug 2018 23:36:07 +0200

> Sent: Monday, 06 August 2018 at 17:07
> From: "Ian McCallion" <address@hidden>
> To: mmuetzel <address@hidden>
> Cc: address@hidden
> Subject: Re: Putting international characters into output files
>
> On 6 August 2018 at 08:42, mmuetzel <address@hidden> wrote:
> > No worries.
> > The Pound symbol should display correctly in the command window (even on
> > Windows).
> > If I write the following in a script, select the whole line and execute
> > using F9, it shows up correctly. (It will be less tedious once Octave 5.0
> > will be available.):
> > t = "£"
> It does not work for me. If I copy your text and paste it into the
> Octave GUI Command window
> it shows as t = "". If I copy and paste it into the editor it shows correctly.
> 

Typing or pasting non-ASCII characters in the command window doesn't currently 
work on Windows. But execution from the editor (with F9) should make it display 
correctly in the results.

> > Which steps are you doing that provoke the character not being shown? Are
> > you referring to the Command Window or to the Workspace panel? Or both?
> > I am interested in this because there currently are ongoing efforts to make
> > Octave more Unicode aware across all supported platforms.
> 
> My original code, which edited an incoming file and produced an output
> file worked apart from the screen output where the £ sign was
> displayed as a black diamond containing a '?'.
> 
> I then wrote this function:
> function [] = codes ()
>   fid=fopen('codes.txt','w');
>   for i=1:4:512
>       fprintf(fid,'%d => %s %s %s %s\n',i, char(i), char(i+1), char(i+2), char(i+3))
>       fprintf(    '%d => %s %s %s %s\n',i, char(i), char(i+1), char(i+2), char(i+3))
>   end
>   fclose(fid);
> end
> 
> which has a £ sign in position 163 in the file but all characters
> above code 127 show as the black diamond in the command window
> 

Despite what the name might suggest, char can hold any byte array, not only 
strings.
Octave uses UTF-8 as its internal encoding for strings. A single byte with a 
value of 128 or above is not a valid UTF-8 sequence on its own. You would have 
to use the corresponding two-byte sequence to see what you might expect. That's 
why you are seeing the replacement character.
Your editor probably uses a different codepage to display the file. That's why 
you are seeing the £ sign among others.
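
A minimal sketch of the difference, using the £ sign (code point U+00A3, 
encoded in UTF-8 as the two bytes 194 and 163). The variable names are only for 
illustration, and whether the second line displays as £ still depends on the 
terminal being able to show UTF-8:

```octave
% char(163) is a single byte. On its own it is not a valid UTF-8 sequence,
% so the command window shows the replacement character for it.
single_byte = char(163);

% The UTF-8 encoding of U+00A3 (pound sign) is the byte pair 0xC2 0xA3.
two_bytes = char([194 163]);

% Both are ordinary char arrays; they differ only in the bytes they hold.
printf("%d byte(s): %s\n", numel(single_byte), single_byte);
printf("%d byte(s): %s\n", numel(two_bytes), two_bytes);
```

Writing single_byte to a file produces exactly the byte 163, which an editor 
set to a Windows codepage will happily render as £.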

If necessary, you could use native2unicode to convert from any codepage (e.g. 
Windows-1252) to UTF-8, and unicode2native to convert back.
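
A short sketch of that round trip, assuming an Octave version that ships 
native2unicode and unicode2native and an iconv installation that knows the 
codepage name "Windows-1252". The variable names are made up for the example:

```octave
% Byte 163 is the pound sign in Windows-1252.
bytes_cp1252 = uint8(163);

% Convert the codepage bytes to Octave's internal UTF-8 representation.
% The result is a char array holding the UTF-8 bytes of the character.
utf8_str = native2unicode(bytes_cp1252, "Windows-1252");

% Convert back to the original codepage; the result is a uint8 vector.
bytes_back = unicode2native(utf8_str, "Windows-1252");

% Inspect the raw byte values rather than relying on the display.
disp(double(utf8_str));
disp(double(bytes_back));
```

Looking at the numeric values with double() avoids any confusion caused by how 
the command window or an editor chooses to render the bytes.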

But if you don't care about what Octave is displaying, it might be safest not 
to touch the encoding at all, as long as all of your operations are on ASCII 
characters only (or you know how to treat characters at code points >127).

I don't think Octave is doing anything wrong here.

Hope this helps anyway.

Cheers
Markus


