Re: RFE: Please allow unicode ID chars in identifiers
From: Chet Ramey
Subject: Re: RFE: Please allow unicode ID chars in identifiers
Date: Tue, 13 Jun 2017 19:59:22 -0400
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:52.0) Gecko/20100101 Thunderbird/52.1.1
On 6/13/17 4:44 PM, tetsujin@scope-eye.net wrote:
>
>
> Please excuse the top-posting, this mail client isn't very good...
>
> To some extent, tying the shell script language to the locale is
> unavoidable. However, one of the points I was trying to make is that, in
> principle, at least, this shouldn't be the case. If a script is written in
> a particular character encoding (and uses characters from that encoding in
> its function names or parameter names, for instance) it should still run
> correctly even if it's run in a different locale, just as a compiled
> program should be able to run in a locale other than the one in which its
> source code was authored.
This isn't a good comparison. Even a compiled program that calls one of
the ctype.h functions is dependent on the locale in which it's run. A
script, since it's text and interpreted, has the same dependency, to an
even greater extent. If C source code contains character strings that are
encoded in the author's locale, you're going to get indeterminate results
if you try to display them in an environment using a different locale.
You can mitigate this somewhat by using the mechanisms available to
control the locale: for a C program it's setlocale(), and for a script
it's the LC_* and LANG variables.
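To illustrate the script side of this, here is a minimal sketch, assuming bash and a string whose bytes are the UTF-8 encoding of "é". The function name len_in_locale is made up for the example and is not part of bash. It shows the same bytes being measured differently depending on the value of LC_ALL.

```shell
#!/bin/bash
# Hypothetical helper (not a bash builtin): print the length of a string
# as the shell counts it under the locale named in $1.
# ${#var} counts characters according to the current locale, so the
# same bytes can give different answers in different locales.
len_in_locale() {
    local LC_ALL=$1   # applies only inside this function
    local s=$2
    printf '%s\n' "${#s}"
}

# The two bytes 0xC3 0xA9, which encode "é" in UTF-8.
s=$(printf '\303\251')

# In the C locale every byte is one character.
len_in_locale C "$s"            # prints 2

# Assumption: a locale named C.UTF-8 may or may not be installed, so
# check first. Where it exists, the same bytes count as one character.
if locale -a 2>/dev/null | grep -qiE '^C\.utf-?8$'; then
    len_in_locale C.UTF-8 "$s"
fi
```

Setting the variable on the command line, as in `LC_ALL=C ./script`, forces the locale for the whole script in the same way.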
> For that to work, basically the character encoding used to interpret the
> script should be (potentially) distinct from the one used to interact with
> the rest of the system.
What "rest of the system"? What "matters of I/O"?
>
> ...But that gets complicated: the shell would need to interpret the script
> in its locale of origin, but still respect the locale for other matters of
> I/O. But since data in the shell intermingles with programming constructs
> in the shell (Variables get passed by name, command and function names get
> stored in (and invoked from) shell variables, variable values and "here"
> docs come from the script, etc.) it gets into questions like, do we have to
> track character encoding for each variable in the script? When do we
> transcode between encodings? And what happens when a transcoding isn't
> possible?
>
> So maybe the whole thing is just reaching too far... But that's how I'd
> want to approach it: I'd want people to be able to use their character set
> in their scripts, but I'd want it to work in a way that a script, once
> written, can work regardless of the active locale.
I assume that by this you mean the user's locale. You can still force a
different one.
--
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, UTech, CWRU chet@case.edu http://cnswww.cns.cwru.edu/~chet/