Re: gawk: Wrong behavior in binary mode

bug-gnu-utils

From:	Eli Zaretskii
Subject:	Re: gawk: Wrong behavior in binary mode
Date:	Thu, 11 Dec 2008 18:06:36 -0500

> Date: Thu, 11 Dec 2008 05:39:23 +0200
> From: Aharon Robbins <address@hidden>
> Cc: address@hidden
> 
> Greetings. Re this:
> 
> > Date: Mon, 8 Dec 2008 23:27:51 -0200
> > From: "Carlos G." <address@hidden>
> > To: address@hidden
> > Subject: gawk: Wrong behavior in binary mode
> >
> > Hi... I think this is a bug.
> > When working with gawk in binary mode, the length() and index() built-ins
> > fail with character codes greater than 127 (0x7f). For example:
> >
> > ....
> 
> First, thank you very much for the bug report.
> 
> Second, it's not a BINMODE problem; rather it is a problem with locales;
> the same behavior shows up under Linux which ignores BINMODE.

I actually think that Carlos is right: if the user says she wants the
bytes treated as bytes, Gawk should not try to treat them as multibyte
character strings.

I think the patch you posted in a followup is only partially correct:
it will only work if the stream of bytes is not a valid multibyte
string.  But what if by chance it is a valid string?  Solving this as
you did gives unpredictable results, from the point of view of a user
who does not necessarily know everything about valid and invalid
multibyte strings.
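
To make the ambiguity concrete, here is a minimal sketch. It uses Python rather than gawk only to keep the example self-contained, and the function names are hypothetical. It assumes a fallback that counts characters when the input decodes cleanly and counts bytes otherwise, which is roughly the behavior described above, not the actual patch. Two inputs of the same byte length then report different lengths, depending only on whether the bytes happen to form a valid sequence.

```python
# Illustration only: Python stands in for gawk, and these helper names
# are hypothetical. UTF-8 is assumed as the locale's multibyte encoding.

def length_as_bytes(data: bytes) -> int:
    """Count raw bytes: the "hands off my bytes" interpretation."""
    return len(data)

def length_with_fallback(data: bytes, encoding: str = "utf-8") -> int:
    """Count characters if the data decodes cleanly, else count bytes."""
    try:
        return len(data.decode(encoding))
    except UnicodeDecodeError:
        return len(data)

valid_by_chance = b"\xc3\xa9"  # two bytes that happen to be valid UTF-8
invalid = b"\xc3\x28"          # two bytes that are not valid UTF-8

for sample in (valid_by_chance, invalid):
    print(sample, length_as_bytes(sample), length_with_fallback(sample))
```

With the byte interpretation both inputs have length 2. With the fallback, the first reports 1 and the second reports 2, which is the kind of data-dependent result a user working with binary input would not expect.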

So I think there should be a way to tell Gawk "hands off my bytes!"
BINMODE could be just that way (in which case Linux should not ignore
it), or you can introduce a new variable.

Btw, we've been through these issues in Emacs when Emacs 20 introduced
multi-lingual support, and Emacs now has a special way of treating raw
bytes that don't represent multibyte (or otherwise encoded) text.
