gawk: Wrong behavior in binary mode
From: Carlos G.
Subject: gawk: Wrong behavior in binary mode
Date: Mon, 8 Dec 2008 23:27:51 -0200
Hi, I think this is a bug.
When gawk runs in binary mode, the length() and index() built-ins
fail on bytes with values greater than 127 (0x7f). For example:
$ printf "\x80\x81\x82\x83" | gawk 'BEGIN { BINMODE=1 } { print length }'
0
But it should print 4.
I've found that this happens because gawk still takes the multibyte code path
(gawk_mb_cur_max > 1) in those cases, even though BINMODE is set.
The following patch worked fine for me:
--- builtin.old 2008-12-08 23:10:29.000000000 -0200
+++ gawk-3.1.6/builtin.c 2008-12-08 23:07:28.000000000 -0200
@@ -331,7 +331,7 @@
if (l2 > l1)
break;
#ifdef MBS_SUPPORT
- if (gawk_mb_cur_max > 1) {
+ if (!BINMODE && gawk_mb_cur_max > 1) {
const wchar_t *pos;
s1 = force_wstring(s1);
@@ -370,7 +370,7 @@
break;
}
#ifdef MBS_SUPPORT
- if (gawk_mb_cur_max > 1) {
+ if (!BINMODE && gawk_mb_cur_max > 1) {
const wchar_t *pos;
s1 = force_wstring(s1);
@@ -457,8 +457,9 @@
if (do_lint && (tmp->flags & (STRING|STRCUR)) == 0)
lintwarn(_("length: received non-string argument"));
tmp = force_string(tmp);
+
#ifdef MBS_SUPPORT
- if (gawk_mb_cur_max > 1) {
+ if (!BINMODE && gawk_mb_cur_max > 1) {
tmp = force_wstring(tmp);
len = tmp->wstlen;
} else
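
For readers who want to see the guard in isolation, here is a minimal standalone sketch of the test the patch adds to length(). It is not gawk's code: sketch_length() and its binmode parameter are hypothetical names, mbrlen() stands in for force_wstring(), and returning 0 on an invalid byte sequence is an assumption chosen to match the symptom reported above, not a statement about how gawk handles such input internally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical stand-in for the patched check in length():
 * count raw bytes when binmode is set or the locale is single-byte,
 * otherwise count multibyte characters.
 */
size_t sketch_length(const char *s, size_t nbytes, int binmode)
{
    assert(s != NULL);

    /* Same shape as the patch: skip the multibyte path in binary mode. */
    if (binmode || MB_CUR_MAX <= 1)
        return nbytes;

    mbstate_t st;
    memset(&st, 0, sizeof st);      /* start in the initial shift state */

    size_t nchars = 0;
    while (nbytes > 0) {
        size_t r = mbrlen(s, nbytes, &st);
        if (r == (size_t)-1 || r == (size_t)-2)
            return 0;   /* assumption: invalid or incomplete input counts as 0 */
        if (r == 0)
            r = 1;      /* an embedded NUL still occupies one byte */
        s += r;
        nbytes -= r;
        nchars++;
    }
    return nchars;
}
```

With binmode set to 1, the four bytes 0x80 to 0x83 give 4 in any locale. With binmode set to 0, the result depends on the locale the program selected with setlocale(); if setlocale() was never called, the "C" locale is in effect and the byte count is returned.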