help-octave

[RFC] Octave and internationalization


From: Alan W. Irwin
Subject: [RFC] Octave and internationalization
Date: Wed, 2 Apr 2014 09:32:16 -0700 (PDT)
User-agent: Alpine 2.02 (DEB 1266 2009-07-14)

On 2014-04-02 00:24-0700 CdeMills wrote:

Hello,

a recurrent topic on this list and #octave is how to use accented letters
and special symbols in Octave. I started a specific page on the wiki [1],
describing the evolution of encodings, the main standards, how things work
as of today, as well as plans for future development.

IMHO UTF-8 support should be easy to achieve; it is a matter of ensuring
that every char manipulation is 8-bit clean. Locales and complete support
(isalpha, strlen, manipulating two strings with different encodings) are
another story.

Could interested people please complete it? We should define the needs,
their priority, and have an idea about the complexity of the implementations.

Regards

Pascal
[1] http://wiki.octave.org/International_Characters_Support

Hi Pascal:

I am not an Octave developer so I don't want to get involved with
that wiki, but I do have some comments concerning what it says now.

<quote>Octave "by accident" supports UTF-8, meaning that the vast majority of
functions for text display and graph manipulations are using 8-bits
chars, passing them unmodified to the underlying layers in charge of
rendering.</quote>

I am sure you are already aware of this since "by accident" was in
quotes, but I want to emphasize for others here that these good
results for UTF-8 are _not_ an accident.  Instead, UTF-8 was designed
to be a logical extension to 7-bit ASCII, which is why there are a lot
of automatic benefits to using this encoding of Unicode. Furthermore,
gaining additional benefits is normally simply a matter of making the
relevant Octave code 8-bit clean. For example, the recent solution of
one such 8-bit clean issue <http://savannah.gnu.org/bugs/?41965> now
allows UTF-8 to be used in function help strings without any need for
wide characters.  This has big positive implications for help string
translation efforts as I have emphasized in another thread concerning
that Octave bug fix.
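
To illustrate the two points above, here is a minimal Python sketch (not
Octave code; the helper names pass_through and strip_high_bit are
hypothetical and exist only for this example). It checks that ASCII text
has identical bytes in UTF-8, that every byte of a non-ASCII character has
its high bit set, and that a routine which is not 8-bit clean damages
UTF-8 text while an 8-bit clean one leaves it intact.

```python
def pass_through(data: bytes) -> bytes:
    """8-bit clean: every byte is copied unchanged."""
    return bytes(data)


def strip_high_bit(data: bytes) -> bytes:
    """NOT 8-bit clean: clears bit 7 of each byte, as 7-bit-only code might."""
    return bytes(b & 0x7F for b in data)


ascii_text = "help plot"
accented = "caf\u00e9 r\u00e9sum\u00e9"  # "café résumé" with precomposed é

# Pure ASCII text has identical bytes in ASCII and in UTF-8.
assert ascii_text.encode("utf-8") == ascii_text.encode("ascii")

utf8 = accented.encode("utf-8")

# Every byte belonging to a non-ASCII character has its high bit set, so
# no such byte can be mistaken for an ASCII character such as a space.
non_ascii_bytes = [
    b for ch in accented if ord(ch) > 127 for b in ch.encode("utf-8")
]
assert all(b >= 0x80 for b in non_ascii_bytes)

# Byte-oriented splitting on an ASCII delimiter therefore still works.
words = [w.decode("utf-8") for w in utf8.split(b" ")]

# An 8-bit clean routine round-trips the text; a 7-bit one damages it.
clean = pass_through(utf8).decode("utf-8")
damaged = strip_high_bit(utf8)
```

Note that this only covers passing bytes through unchanged. Character-aware
operations (counting characters, isalpha and the like) need more than
8-bit cleanliness, as Pascal points out above.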

In fact, the superiority of UTF-8 to all other Unicode encodings (such
as UTF-16 and UTF-32) is so clear that UTF-8 is really the only widely
supported Unicode encoding on Unix, and UTF-8 is also well-supported
on Windows. So for PLplot we have explicitly decided to never support
any Unicode encoding other than UTF-8 in our public API.  This
approach has been successful on all platforms and has greatly
simplified our development life.  Therefore, I would advise the Octave
developers to take that same approach as well.  If you do make that
decision, it would immediately follow that your future Unicode plans as
outlined in the above URL would be considerably simplified.
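
As a rough sketch of what a "UTF-8 only" public API can look like (this is
Python written for illustration, not PLplot's or Octave's actual API; the
names set_label and from_legacy are hypothetical), the entry point accepts
UTF-8 bytes and rejects everything else, and any conversion from another
encoding is left to the caller:

```python
def set_label(raw: bytes) -> str:
    """Hypothetical API entry point that accepts UTF-8 bytes only."""
    try:
        # bytes.decode is strict by default, so invalid UTF-8 raises.
        return raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("label must be valid UTF-8") from exc


def from_legacy(raw: bytes, encoding: str) -> bytes:
    """Caller-side helper: convert bytes in another encoding to UTF-8."""
    return raw.decode(encoding).encode("utf-8")


# UTF-8 input is accepted as-is.
label = set_label("caf\u00e9".encode("utf-8"))

# Latin-1 input is rejected at the boundary...
try:
    set_label(b"caf\xe9")
    rejected = False
except ValueError:
    rejected = True

# ...unless the caller converts it to UTF-8 first.
converted = set_label(from_legacy(b"caf\xe9", "latin-1"))
```

The point of the design is that the library itself carries a single code
path for text, and knowledge of other encodings stays outside the API.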

Alan
__________________________
Alan W. Irwin

Astronomical research affiliation with Department of Physics and Astronomy,
University of Victoria (astrowww.phys.uvic.ca).

Programming affiliations with the FreeEOS equation-of-state
implementation for stellar interiors (freeeos.sf.net); the Time
Ephemerides project (timeephem.sf.net); PLplot scientific plotting
software package (plplot.sf.net); the libLASi project
(unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
and the Linux Brochure Project (lbproject.sf.net).
__________________________

Linux-powered Science
__________________________


