bug-gnu-utils

Re: Gawk match() strange behaviour


From: Aharon Robbins
Subject: Re: Gawk match() strange behaviour
Date: Sat, 08 Sep 2007 23:18:16 +0300

Yes, locales definitely complicate issues. I am glad that things work
with gawk-stable from CVS; that means I have done my job correctly.

Arnold

> Date: Thu, 06 Sep 2007 23:07:01 +0200
> From: Alain Ketterlin <address@hidden>
> Subject: Re: Gawk match() strange behaviour
> To: Aharon Robbins <address@hidden>
> Cc: address@hidden
>
> Hi, thanks for your help.
>
> >> The following program:
> >>
> >> {
> >>      r = match($0,/^ */,t);
> >>      print "R=" r " S=" RSTART " L=" RLENGTH;
> >> }
> >>
> >> produces this (< signals input, > signals output):
> >> <
> >> > R=-1208966831 S=-1208966831 L=1208966850
> >> < random
> >> > R=1 S=1 L=34
> >> <  random
> >> > R=1 S=1 L=2
>
> > I could not reproduce this using either stock gawk 3.1.5 or the current CVS
> > sources.  I suggest that you try building from scratch from the CVS archive
> > on savannah.gnu.org.
> >
> > For the empty line I get
> >
> >     R=1 S=1 L=0
>
> Things are getting strange (for me, I mean :). I just noticed that
> the locale has an impact.
>
> With gawk-3.1.5 (compiled from the tarball), under en_US.utf-8 I get:
> -from an empty line: R=1 S=1 L=18
> -from a line containing "random" (no space at beginning): R=1 S=1 L=34
> -from "  random" (two spaces at beginning): R=1 S=1 L=2 (correct)
> Under en_US.iso-8859-1, everything is ok. So it seems that utf-8
> input is the problem.
>
> With gawk-stable checked out from savannah, everything is correct,
> under both locales.
>
> Thanks for your help.
>
> -- Alain.
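
The locale comparison described above can be scripted. What follows is only a
rough sketch, not something posted in the thread: the helper name run_match is
made up for illustration, the locale names C and en_US.UTF-8 are assumptions
about what is installed on the test machine, and the third (array) argument to
match() is dropped so that the program runs in any POSIX awk. To exercise the
original gawk call, restore the array argument and invoke gawk explicitly.

```shell
#!/bin/sh
# run_match LOCALE (hypothetical helper): read lines on stdin and print the
# match() return value, RSTART and RLENGTH for the pattern /^ */ while awk
# runs under the given locale.
run_match() {
    LC_ALL="$1" awk '{
        r = match($0, /^ */)
        print "R=" r " S=" RSTART " L=" RLENGTH
    }'
}

# Feed the three inputs from the report (empty line, "random", "  random")
# through each locale. The locale names are assumptions; adjust as needed.
for loc in C en_US.UTF-8; do
    echo "== $loc =="
    printf '\nrandom\n  random\n' | run_match "$loc"
done
```

With a correctly working awk, the pattern matches zero or more leading spaces,
so the three lines should report L=0, L=0 and L=2 under both locales; any other
length points at the kind of locale-dependent problem discussed here.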



