bug-gnu-utils

Re: Gawk match() strange behaviour


From: Alain Ketterlin
Subject: Re: Gawk match() strange behaviour
Date: Thu, 06 Sep 2007 23:07:01 +0200
User-agent: Internet Messaging Program (IMP) H3 (4.1.4) / FreeBSD-6.2


Hi, thanks for your help.

The following program:

{
     r = match($0,/^ */,t);
     print "R=" r " S=" RSTART " L=" RLENGTH;
}

produces this ("<" marks input lines, ">" marks output lines):
<
> R=-1208966831 S=-1208966831 L=1208966850
< random
> R=1 S=1 L=34
<  random
> R=1 S=1 L=2

I could not reproduce this using either stock gawk 3.1.5 or the current CVS
sources.  I suggest that you try building from scratch from the CVS archive
on savannah.gnu.org.

For the empty line I get

        R=1 S=1 L=0

Things are getting strange (for me, I mean :). I just noticed that
the locale has an impact.

With gawk-3.1.5 (compiled from the tarball), under en_US.utf-8 I get:
- from an empty line: R=1 S=1 L=18
- from a line containing "random" (no space at beginning): R=1 S=1 L=34
- from "  random" (two spaces at beginning): R=1 S=1 L=2 (correct)
Under en_US.iso-8859-1, everything is OK. Since the test input is plain
ASCII in every case, the problem seems to be specific to the UTF-8 locale.

With gawk-stable checked out from savannah, everything is correct,
under both locales.

Thanks for your help.

-- Alain.






