bug-gawk

Re: Quotes being stripped by "--csv"


From: J Naman
Subject: Re: Quotes being stripped by "--csv"
Date: Mon, 27 Nov 2023 12:54:17 -0500

I face many of the same input problems as Neil & Ed, but I am not forced
to preserve the defects in my output.
My strategy now is to write very short customized scripts that preprocess
and "standardize" the CSV data from many sources, then pipe the result to
scripts that handle the standardized data without formatting anomalies.
Examples of the junk I see: form feed control characters (0x0C) in the
middle of quoted text (pagination after x chars in a buffer); = signs
before CSV fields, e.g. ="123ABC",="$12.34"; high-ASCII characters, esp. an
en-dash or em-dash where a minus sign should be; minus signs before and
after $ signs, "-$123","$-123,"; daggers; and left and right double quotes.
Much more common now are tab-separated (TSV) downloads from US government
sites, mostly without quoted fields. I have never seen LFs in the middle of
any CSV record.
Just my personal way of dealing with complexity, John
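
A minimal sketch of the kind of short preprocessing script described
above, written in Python purely for illustration (the message does not say
what language the actual scripts use). The names clean_field, standardize
and REPLACEMENTS are hypothetical, and the cleanup rules are assumptions
drawn from the examples listed: form feeds and daggers are dropped, dashes
become a plain minus sign, curly double quotes become straight ones, the
="..." wrapper is removed, and "$-123" is rewritten as "-$123".

```python
import csv
import io
import re

# Assumed replacement table; extend it for whatever each source emits.
REPLACEMENTS = {
    "\x0c": "",      # form feed left behind by pagination
    "\u2013": "-",   # en dash used where a minus sign should be
    "\u2014": "-",   # em dash used where a minus sign should be
    "\u201c": '"',   # left double quote
    "\u201d": '"',   # right double quote
    "\u2020": "",    # dagger
    "\u2021": "",    # double dagger
}
TRANSLATION = str.maketrans(REPLACEMENTS)

# Matches a whole field of the form ="..." (spreadsheet-style text wrapper).
FORMULA_WRAPPER = re.compile(r'="(.*)"', re.DOTALL)


def clean_field(field):
    """Apply the assumed cleanup rules to one already-parsed field."""
    field = field.translate(TRANSLATION)
    match = FORMULA_WRAPPER.fullmatch(field)
    if match:
        field = match.group(1)
    # Put the minus sign in front of the currency symbol.
    if field.startswith("$-"):
        field = "-$" + field[2:]
    return field


def standardize(text, delimiter=","):
    """Parse delimited text, clean every field, and return plain CSV."""
    reader = csv.reader(io.StringIO(text, newline=""), delimiter=delimiter)
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow([clean_field(f) for f in row])
    return out.getvalue()
```

Passing delimiter="\t" reads a TSV download and still writes
comma-separated output, so downstream scripts see one format. Fields are
cleaned after parsing rather than before, so a curly quote inside a quoted
field cannot be mistaken for a CSV quote.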

