Re: gawk: Wrong behavior in binary mode

bug-gnu-utils

From:	Manuel Collado
Subject:	Re: gawk: Wrong behavior in binary mode
Date:	Tue, 16 Dec 2008 10:27:27 +0100
User-agent:	Thunderbird 2.0.0.14 (Windows/20080421)

John Cowan wrote:

Eli Zaretskii wrote:

I think you are missing the fact that LC_ALL=C has broad effects other
than just disabling multibyte characters.  For example, it also causes
Gawk to speak US English when displaying messages, and use US format
for dates and currency.  What do I do if I want my error messages in
Hebrew, but need to work with raw binary data that is not a character
string?


Quite so.  There should be some way to specify the encoding of Gawk's
input and output files independent of the locale (IMHO, encoding the
character encoding into the locale identifier was just a botch).

I strongly agree! In today's worldwide environment, text-processing utilities should be able to cope with files from different sources with different encodings, and combine them in a single run. This implies having independent encodings for:


- each source file
- each input data file
- each output data file

SGML/XML utilities already do that. For AWK a possible approach could be:

- Use a fixed, implementation-chosen encoding for internal processing (covering Unicode).
- Convert each source or input data file on the fly to the internal encoding before processing.
- Convert output data on the fly to the external encoding before printing.
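
To make the three steps concrete, here is a minimal sketch in Python rather than AWK, since gawk has no such facility today. Python's str type plays the role of the internal Unicode representation. The function names read_records and write_records and the sample data are made up for illustration only.

```python
from typing import List


def read_records(data: bytes, encoding: str) -> List[str]:
    """Decode raw input bytes into internal Unicode strings, one per line."""
    return data.decode(encoding).splitlines()


def write_records(records: List[str], encoding: str) -> bytes:
    """Encode internal Unicode strings into the requested external encoding."""
    return "".join(record + "\n" for record in records).encode(encoding)


# Two inputs from different sources, each with its own encoding.
latin1_input = "café\nniño\n".encode("latin-1")
utf8_input = "שלום\n".encode("utf-8")

# Both are converted to the internal form and combined in a single run.
records = read_records(latin1_input, "latin-1") + read_records(utf8_input, "utf-8")

# The output gets its own encoding, independent of either input.
output = write_records(records, "utf-16-le")
```

The point of the sketch is that the processing step in the middle never sees bytes, so it does not care which encodings the inputs and the output use.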

This approach requires a method for specifying file encodings. Examples:

- source files: an explicit @encoding directive, with the locale used only as the default.
- input and output data: the value of a predefined ENCODING variable at open time, or the locale as the default.
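
The lookup order suggested above could be sketched as follows, again in Python for illustration. Neither the @encoding directive nor the ENCODING variable exists in gawk; the directive syntax assumed here (a quoted name on the first line of the source) and the function names are hypothetical.

```python
import locale
import re

# Assumed directive syntax: @encoding "name" on the first line of the source.
DIRECTIVE = re.compile(rb'^\s*@encoding\s+"([A-Za-z0-9._-]+)"')


def source_encoding(source: bytes) -> str:
    """Use the explicit directive if present, else fall back to the locale."""
    first_line = source.split(b"\n", 1)[0]
    match = DIRECTIVE.match(first_line)
    if match:
        return match.group(1).decode("ascii")
    return locale.getpreferredencoding(False)


def data_encoding(encoding_var: str) -> str:
    """Use the ENCODING value in effect at open time, else the locale."""
    return encoding_var or locale.getpreferredencoding(False)
```

For example, source_encoding(b'@encoding "iso-8859-1"\nBEGIN { print }\n') yields "iso-8859-1", while a source file without the directive gets whatever encoding the current locale implies.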


Is it OK to discuss this topic in this forum?
--
Manuel Collado - http://lml.ls.fi.upm.es/~mcollado
