Re: [bug-gawk] Help ourself
From: Andrew J. Schorr
Subject: Re: [bug-gawk] Help ourself
Date: Fri, 26 Apr 2019 09:18:07 -0400
User-agent: Mutt/1.5.21 (2010-09-15)
On Fri, Apr 26, 2019 at 11:23:55AM +0700, Budi wrote:
> FS=" *"
> How to get automatically FS to be a space, or 3 spaces, or 6 spaces
> according the regex result at that very point in time ???
>
> just like RT feature for record !
I'm guessing that you want to know the actual text that the field separator
matched at each position when the record was parsed. I don't believe there's
a way to see these values for $0, but you can use the 4th argument of gawk's
split function to see the separator strings. From the man page:
    split(s, a [, r [, seps] ])
            Split the string s into the array a and the
            separators array seps on the regular expression
            r, and return the number of fields. If r is
            omitted, FS is used instead. The arrays a and
            seps are cleared first. seps[i] is the field
            separator matched by r between a[i] and a[i+1].
            If r is a single space, then leading whitespace
            in s goes into the extra array element seps[0]
            and trailing whitespace goes into the extra
            array element seps[n], where n is the return
            value of split(s, a, r, seps). Splitting
            behaves identically to field splitting,
            described above. In particular, if r is a
            single-character string, that string acts as the
            separator, even if it happens to be a regular
            expression metacharacter.
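So you could call:
    {
        # Re-split $0 on FS so the matched separator strings land in seps
        split($0, f, FS, seps)
        # seps[0] is only filled in when FS is a single space
        printf "Leading discarded junk: [%s]\n", seps[0]
        for (i = 1; i <= NF; i++)
            printf "Field %d [%s] terminated by [%s]\n", i, $i, seps[i]
    }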
There's more extensive documentation here:
https://www.gnu.org/software/gawk/manual/html_node/String-Functions.html
Of course, this is inefficient because $0 is parsed twice: once by gawk's
normal field splitting and again by the split call.
Regards,
Andy