bug-gawk
Re: [bug-gawk] Help ourself


From: Andrew J. Schorr
Subject: Re: [bug-gawk] Help ourself
Date: Fri, 26 Apr 2019 09:18:07 -0400
User-agent: Mutt/1.5.21 (2010-09-15)

On Fri, Apr 26, 2019 at 11:23:55AM +0700, Budi wrote:
> FS=" *"
> How to get automatically FS to be a space, or 3 spaces, or 6 spaces
> according the regex result at that very point in time ???
> 
> just like RT feature for record !

I'm guessing that you want to know the actual text that matched the field
separator at each position when the record was parsed. I don't believe there's
a way to see those values for $0 directly, but gawk's split function takes an
optional 4th argument that receives the separator strings. From the man page:

       split(s, a [, r [, seps] ])
                               Split  the  string  s  into the array a and the
                               separators array seps on the regular expression
                               r,  and  return  the number of fields.  If r is
                               omitted, FS is used instead.  The arrays a  and
                               seps  are  cleared first.  seps[i] is the field
                               separator matched by r between a[i] and a[i+1].
                               If r is a single space, then leading whitespace
                               in s goes into the extra array element  seps[0]
                               and  trailing  whitespace  goes  into the extra
                               array element seps[n], where n  is  the  return
                               value  of  split(s,  a,  r,  seps).   Splitting
                               behaves   identically   to   field   splitting,
                               described above.  In particular, if r is a
                               single-character string, that string acts as the
                               separator, even if it happens to be a regular
                               expression metacharacter.

So you could call:

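{
        # seps[i] holds the text matched by FS between fields i and i+1;
        # seps[0] is filled only when FS is a single space (leading whitespace).
        split($0, f, FS, seps)
        printf "Leading discarded junk: [%s]\n", seps[0]
        for (i = 1; i <= NF; i++)
                printf "Field %d [%s] terminated by [%s]\n", i, $i, seps[i]
}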

There's more extensive documentation here:
   https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html

Of course, this is inefficient because $0 is parsed twice: once by gawk's
normal field splitting and once by split().

Regards,
Andy


