Re: CSV extension status
From: Manuel Collado
Subject: Re: CSV extension status
Date: Tue, 25 May 2021 10:09:10 +0200
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0
On 25/05/2021 at 4:14, Ed Morton wrote:
> I see the conversation has continued at bug-gawk and Arnold had
> suggested spinning it off into an email chain which, if it happened, I'm
> not on. I see a lot of complexity being discussed in the thread that
> just doesn't seem to be necessary. Is there any reason why the simple
> "buildRec()" function I posted at
> https://stackoverflow.com/a/45420607/1745001 (and which could be written
> more concisely if I used gawk extensions) isn't all we'd need to parse
> CSVs? No modes, no extra/ambiguous terminology - just reading a CSV into
> fields by calling 1 function each time a record is read.
The goal is to allow beginner awk users to process CSV data as if it
consisted of regular awk records. There is no need to tamper with
predefined variables like FS, OFS, NR, FPAT etc. Just put -i csvmode on
the command line or add @include "csvmode" to the script.
By using the CSVMODE library your example becomes:
$ cat decsv2.awk
{
    printf "Record %d:\n", NR
    for (i=1;i<=NF;i++) {
        # To replace newlines with blanks add gsub(/\n/," ",$i) here
        printf " $%d=<%s>\n", i, $i
    }
    print "----"
}
$ gawk -icsvmode-1 -f decsv2 file.csv
Record 1:
 $1=<rec1, fld1>
 $2=<>
 $3=<rec1","fld3.1
",
fld3.2>
 $4=<rec1
fld4>
----
Record 2:
 $1=<rec2, fld1.1
fld1.2>
 $2=<rec2 fld2.1"fld2.2"fld2.3>
 $3=<>
 $4=<rec2 fld4>
----
Please note that the modified decsv2.awk script is not CSV specific. It
can be used unmodified to process regular awk records.
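
To illustrate that last point, here is a minimal sketch that runs the same script body (comment omitted) over ordinary whitespace-separated records with plain awk and no csvmode loaded. The two input lines are made up for the illustration.

```shell
# Same body as decsv2.awk, held in a shell variable for convenience.
prog='{
    printf "Record %d:\n", NR
    for (i = 1; i <= NF; i++) {
        printf " $%d=<%s>\n", i, $i
    }
    print "----"
}'

# Hypothetical sample input: two regular awk records, default FS.
out=$(printf 'alpha beta\ngamma\n' | awk "$prog")
printf '%s\n' "$out"
```

Each input line is reported as one record, with its fields split on whitespace as usual.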
Of course, different users have different needs and tastes. This is why
the library in question attempts to satisfy as many users as possible
by offering a rich set of configuration options.
What is more, the library allows fields and records to be modified in
the usual way. For instance, to add a new field "val" at position "pos":
if (pos>NF) {$pos = val} else {$pos = val OFS $pos}; $0 = $0;
And this code works transparently for both CSV data and regular text data.
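
As a rough check of the regular-text case only, the idiom can be tried with plain awk on simple comma-separated lines without quoting. This sketch does not use CSVMODE; the helper name insert_field and the sample data are hypothetical, and FS and OFS are both set to a comma so the result is easy to read.

```shell
# Hypothetical helper: insert val as a new field at position pos.
# Usage: insert_field POS VAL   (reads records from stdin)
insert_field() {
    awk -F, -v OFS=, -v pos="$1" -v val="$2" '{
        if (pos > NF) { $pos = val } else { $pos = val OFS $pos }
        $0 = $0    # re-split the record so NF counts the new field
        print NF ": " $0
    }'
}

# Insert in the middle of the record, then beyond its current end.
mid=$(printf 'a,b,c\n' | insert_field 2 X)
end=$(printf 'a,b,c\n' | insert_field 5 X)
printf '%s\n%s\n' "$mid" "$end"
```

Inserting at position 2 gives four fields (a,X,b,c). Inserting at position 5 pads the gap with an empty field, giving five fields (a,b,c,,X).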
Regards.
--
Manuel Collado - http://mcollado.z15.es