bug-gnu-utils

Re: gawk: Wrong behavior in binary mode


From: Eli Zaretskii
Subject: Re: gawk: Wrong behavior in binary mode
Date: Mon, 15 Dec 2008 23:48:17 +0200

> Date: Mon, 15 Dec 2008 23:26:33 +0200
> From: Aharon Robbins <address@hidden>
> Cc: address@hidden, address@hidden
> 
>       export LC_ALL=C
> 
> A little-known fact is that it's possible to change the value of single
> environment variables for just a single command run.  This is done by
> 
>       LC_ALL=C command arg ....
> 
> To demonstrate in our case:
> 
>       $ export LC_ALL=en_US.utf8
>       $ gawk --version
>       GNU Awk 3.1.6
>       Copyright (C) 1989, 1991-2007 Free Software Foundation.
>       ...............
>       $ gawk 'BEGIN { print length("\x81\x82\x83\x84") }'
>       0
>       $ LC_ALL=C gawk 'BEGIN { print length("\x81\x82\x83\x84") }'
>       4
>       $ echo $LC_ALL
>       en_US.utf8
> 
> LC_ALL=C could perhaps be documented more fully as a way to provide the
> "hands off my bytes" feature, but otherwise I don't see a big need for
> Yet Another Magic Variable or for a new command line option.
> 
> If I'm missing something really big and obvious, please let me know.

I think you are missing the fact that LC_ALL=C has broad effects other
than just disabling multibyte characters.  For example, it also causes
Gawk to speak US English when displaying messages, and use US format
for dates and currency.  What do I do if I want my error messages in
Hebrew, but need to work with raw binary data that is not a character
string?
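
The distinction can be sketched in a POSIX shell. This is only an illustration, not Gawk's behavior: LC_ALL overrides every locale category, whereas LC_CTYPE (character handling) and LC_MESSAGES (message language) can differ once LC_ALL is empty. The helper name effective_locale and the locale name he_IL.UTF-8 are assumptions introduced for the example.

```shell
# Hypothetical helper (not part of Gawk or any standard tool): print the
# locale name that applies to one category, using the POSIX precedence
# LC_ALL, then the category's own variable, then LANG, then "C".
effective_locale() {
    category_value=$1
    if [ -n "${LC_ALL:-}" ]; then
        printf '%s\n' "$LC_ALL"
    elif [ -n "$category_value" ]; then
        printf '%s\n' "$category_value"
    else
        printf '%s\n' "${LANG:-C}"
    fi
}

# Case 1: LC_ALL=C. Both character handling and message language are
# forced to C, even though LC_MESSAGES asks for Hebrew.
all_c=$(
    LC_ALL=C LC_MESSAGES=he_IL.UTF-8
    printf 'ctype=%s messages=%s' \
        "$(effective_locale "${LC_CTYPE:-}")" \
        "$(effective_locale "${LC_MESSAGES:-}")"
)

# Case 2: LC_ALL empty. The two categories are now set independently.
split=$(
    LC_ALL= LC_CTYPE=C LC_MESSAGES=he_IL.UTF-8
    printf 'ctype=%s messages=%s' \
        "$(effective_locale "${LC_CTYPE:-}")" \
        "$(effective_locale "${LC_MESSAGES:-}")"
)

printf 'LC_ALL=C:       %s\n' "$all_c"
printf 'per-category:   %s\n' "$split"
```

In the thread's terms, the second case would correspond to an invocation such as "LC_ALL= LC_CTYPE=C LC_MESSAGES=he_IL.UTF-8 gawk ...". Whether that yields readable Hebrew messages is an assumption, since translated message text is normally converted to the character set implied by LC_CTYPE. Some shells may also print a warning on standard error if the named locale is not installed.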



