Re: CSV extension status
From: Manuel Collado
Subject: Re: CSV extension status
Date: Tue, 18 May 2021 16:41:24 +0200
User-agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.5.0
On 18/05/2021 at 14:56, Andrew J. Schorr wrote:
...
> I think I understand the conceptual problem, but I feel as if maybe we're
> letting the perfect be the enemy of the good.
Agreed.
> In 99.9% of the cases where I use CSV files, I simply want to have
> read-only access to the fields. Actually, if I'm being honest, it's 100%.
> In other words, I want to be able to say something like:
>
>   gawk -lcsv '
>   NR == 1 {
>       for (i = 1; i <= NF; i++)
>           m[$i] = i
>       next
>   }
>   $m["age"] > 30 {
>       sum += $m["weight"]
>       n++
>   }
>   END {
>       printf "found %d people over 30 with an average weight of %.3f\n",
>              n, (n? sum/n : 0)
>   }'
>
> Can this be done without a library?
Do you mean without an API-based extension? Yes.
> I thought that the possibility of embedded
> newlines meant that we needed a library for this rather than a simple FPAT
> solution. Maybe I'm confused.
A pure gawk library is enough to effectively process CSV data. By using
my CSVMODE library from http://mcollado.z15.es/xgawk/ your example can
be coded almost verbatim:
  gawk -i csvmode-1 '
  NR==1 {next}
  csvfield("age") > 30 {
      sum += csvfield("weight")
      n++
  }
  END {
      printf "found %d people over 30 with an average weight of %.3f\n",
             n, (n? sum/n : 0)
  }'
This code works whether the fields are quoted, unquoted, or contain
embedded newlines. That is why I am unsure whether an API-based gawk-csv
extension is really needed.
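To illustrate why plain awk code can cope with embedded newlines, here is a minimal sketch. It is not the CSVMODE implementation, just one possible approach under stated assumptions: fields are comma-separated, quotes are doubled inside quoted fields, and a physical line with an odd number of double quotes is taken to continue on the next line. The names csv_over_30, csvsplit, rec, nrec, f and m are hypothetical and do not come from any library. The program deliberately avoids gawk-only features, so it should also run under a POSIX awk.

```shell
# Prefer gawk; the program sticks to POSIX awk features, so plain awk also works.
AWK=$(command -v gawk || command -v awk)

# Hypothetical helper: reads CSV on stdin and prints the summary line.
csv_over_30() {
    "$AWK" '
    # Split one logical CSV record into f[1..n] and return n.
    function csvsplit(rec, f,    n, i, c, inq, cur) {
        split("", f)                          # clear the output array
        n = 0; cur = ""; inq = 0
        for (i = 1; i <= length(rec); i++) {
            c = substr(rec, i, 1)
            if (inq) {
                if (c != "\"") {
                    cur = cur c               # ordinary character, newline included
                } else if (substr(rec, i + 1, 1) == "\"") {
                    cur = cur "\""; i++       # doubled quote stands for one quote
                } else {
                    inq = 0                   # closing quote
                }
            } else if (c == "\"") {
                inq = 1                       # opening quote
            } else if (c == ",") {
                f[++n] = cur; cur = ""        # field separator outside quotes
            } else {
                cur = cur c
            }
        }
        f[++n] = cur
        return n
    }
    {
        rec = $0
        # An odd quote count means a quoted field is still open: append the next line.
        while (gsub(/"/, "&", rec) % 2 == 1 && (getline line) > 0)
            rec = rec "\n" line
        nf = csvsplit(rec, f)
        if (++nrec == 1) {                    # header record: map column name to index
            for (i = 1; i <= nf; i++)
                m[f[i]] = i
            next
        }
        # The parsed fields are plain strings, so force a numeric comparison with + 0.
        if (f[m["age"]] + 0 > 30) {
            sum += f[m["weight"]]
            n++
        }
    }
    END {
        printf "found %d people over 30 with an average weight of %.3f\n",
               n, (n ? sum / n : 0)
    }'
}

# Sample input: one field with an embedded comma, one with an embedded newline.
printf '%s\n' 'name,age,weight' '"Smith, Ann",41,60.5' '"Bob' 'Jones",35,80' 'Carl,25,70' |
    csv_over_30
```

Note that NR counts physical lines here, which is why the sketch keeps its own record counter. Input with unbalanced quotes or with CRLF line endings is not handled.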
How about also hosting pure gawk libraries, like CSVMODE, on the
gawkextlib site? Arnold suggested this some time ago.
Regards.
--
Manuel Collado - http://mcollado.z15.es