bug-gnu-utils

gawk: length return incorrect value when MB_CUR_MAX > 1


From: KIMURA Koichi
Subject: gawk: length return incorrect value when MB_CUR_MAX > 1
Date: Wed, 30 Nov 2005 09:29:56 +0900

Hi,

A user found a bug in the length function of gawk 3.1.5.

$ LANG=ja_JP.utf8 gawk 'BEGIN {print length("abc\0def")}'

This script prints '3' instead of '7'. I have reproduced this on Windows
and GNU/Linux (Fedora Core 3).

As far as I can tell, mbrtowc() does not advance past a '\0' character:
it returns 0 there, and the loop appears to treat that as the end of the
string.
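
The return value in question can be checked in isolation. The following is
only a minimal sketch, not code from gawk: decode_one() is a hypothetical
helper that decodes a single character from a fresh conversion state and
hands back whatever mbrtowc() reports.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not part of gawk): decode one character from the
 * first n bytes of s, starting from an initial conversion state, and
 * return the raw result of mbrtowc().
 */
size_t decode_one(const char *s, size_t n, wchar_t *out)
{
    mbstate_t mbs;

    assert(s != NULL && out != NULL);
    memset(&mbs, 0, sizeof(mbs));      /* start in the initial shift state */
    return mbrtowc(out, s, n, &mbs);   /* 0 means "decoded the null wide character" */
}
```

For an ordinary byte such as "a" the helper returns 1, while for a buffer
that starts with '\0' it returns 0, so a caller that adds the return value
to its pointer never moves past the NUL.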

Here is a patch.

--- node.c.1~   2005-07-27 03:07:43.000000000 +0900
+++ node.c      2005-11-27 04:18:49.000000000 +0900
@@ -745,7 +745,13 @@
        src_count = n->stlen;
        memset(& mbs, 0, sizeof(mbs));
        for (i = 0; src_count > 0; i++) {
-               count = mbrtowc(& wc, sp, src_count, & mbs);
+               if (*sp != '\0') {
+                       count = mbrtowc(& wc, sp, src_count, & mbs);
+               }
+               else { /* NUL character at middle of string */
+                       wc = L'\0';
+                       count = 1;
+               }
                switch (count) {
                case (size_t) -2:
                case (size_t) -1:
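
For illustration, the same idea as a standalone function. This is a sketch
under my own assumptions, not gawk's code: count_chars() is a hypothetical
name, and stopping at an invalid or incomplete sequence is a simplification
chosen for the example.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/*
 * Hypothetical helper (not gawk's code): count the characters in
 * buf[0..len), treating an embedded NUL as an ordinary one-byte character.
 */
size_t count_chars(const char *buf, size_t len)
{
    mbstate_t mbs;
    wchar_t wc;
    size_t n = 0;

    assert(buf != NULL);
    memset(&mbs, 0, sizeof(mbs));
    while (len > 0) {
        size_t count;

        if (*buf != '\0')
            count = mbrtowc(&wc, buf, len, &mbs);
        else
            count = 1;              /* NUL in the middle of the string */

        if (count == (size_t) -1 || count == (size_t) -2)
            break;                  /* invalid or incomplete sequence: stop here */
        if (count == 0)
            count = 1;              /* defensive: never loop without progress */

        buf += count;
        len -= count;
        n++;
    }
    return n;
}
```

Called as count_chars("abc\0def", 7), this counts all seven bytes as
characters rather than stopping after "abc".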

Thank you,

-- 
KIMURA Koichi




