bug-gawk

Re: CSV extension status


From: Manuel Collado
Subject: Re: CSV extension status
Date: Tue, 25 May 2021 10:09:10 +0200

On 25/05/2021 at 4:14, Ed Morton wrote:
I see the conversation has continued at bug-gawk and Arnold had
suggested spinning it off into an email chain which, if it happened, I'm
not on. I see a lot of complexity being discussed in the thread that
just doesn't seem to be necessary. Is there any reason why the simple
"buildRec()" function I posted at
https://stackoverflow.com/a/45420607/1745001 (and which could be written
more concisely if I used gawk extensions) isn't all we'd need to parse
CSVs? No modes, no extra/ambiguous terminology - just reading a CSV into
fields by calling 1 function each time a record is read.

The goal is to allow beginner awk users to process CSV data as if they were regular awk records. There is no need to tamper with predefined variables like FS, OFS, NR, FPAT, etc. Just put -i csvmode on the command line or add @include "csvmode" to the script.

With the CSVMODE library, your example becomes:

$ cat decsv2.awk
{
    printf "Record %d:\n", NR
    for (i=1;i<=NF;i++) {
        # To replace newlines with blanks add gsub(/\n/," ",$i) here
        printf "    $%d=<%s>\n", i, $i
    }
    print "----"
}

$ gawk -icsvmode-1 -f decsv2 file.csv
Record 1:
    $1=<rec1, fld1>
    $2=<>
    $3=<rec1","fld3.1
",
fld3.2>
    $4=<rec1
fld4>
----
Record 2:
    $1=<rec2, fld1.1

fld1.2>
    $2=<rec2 fld2.1"fld2.2"fld2.3>
    $3=<>
    $4=<rec2 fld4>
----

Please note that the modified decsv2.awk script is not CSV-specific. It can be used unmodified to process regular awk records.
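
To illustrate that last point, here is a minimal sketch that runs the same loop on ordinary whitespace-separated records with plain awk, without loading CSVMODE. The wrapper name dump_fields and the sample input are assumptions made up for this illustration.

```shell
# Hypothetical helper: dump every field of every record, using the same loop
# as decsv2.awk. Plain POSIX awk with default field splitting; no CSVMODE.
dump_fields() {
    awk '
    {
        printf "Record %d:\n", NR
        for (i = 1; i <= NF; i++)
            printf "    $%d=<%s>\n", i, $i
        print "----"
    }'
}

# Two regular text records, split on whitespace.
printf 'alpha beta\ngamma\n' | dump_fields
# prints:
# Record 1:
#     $1=<alpha>
#     $2=<beta>
# ----
# Record 2:
#     $1=<gamma>
# ----
```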

Of course, different users have different needs and tastes. This is why the library in question attempts to satisfy as many users as possible by offering a rich set of configuration options.

Moreover, the library allows fields and records to be modified in the usual way. For instance, to add a new field "val" at position "pos":

  if (pos>NF) {$pos = val} else {$pos = val OFS $pos}; $0 = $0;

And this code works transparently for both CSV data and regular text data.
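
As a minimal sketch of the insertion idiom above on regular text data only (it does not exercise the CSVMODE library), the shell function below wraps plain awk. The name insert_field, the NF prefix in the output and the sample input are assumptions made up for this illustration.

```shell
# Hypothetical helper: insert VAL as a new field at position POS in each record.
# Plain POSIX awk with default field splitting; CSVMODE is not involved here.
insert_field() {
    awk -v pos="$1" -v val="$2" '
    {
        if (pos > NF) {
            $pos = val              # past the last field: just assign it
        } else {
            $pos = val OFS $pos     # prepend val and a separator to field pos
        }
        $0 = $0                     # re-split so NF and the fields see the new one
        print NF ": " $0
    }'
}

printf 'a b c\n' | insert_field 2 X
# prints: 4: a X b c
```

The trailing $0 = $0 is what makes the new value count as a field of its own: without it, "X b" would remain a single field $2 and NF would stay at 3.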

Regards.

--
Manuel Collado - http://mcollado.z15.es


