bug-gawk

Re: The best way to convert space separated text to TSV?


From: Peng Yu
Subject: Re: The best way to convert space separated text to TSV?
Date: Tue, 11 Feb 2020 03:20:15 -0600

In wc output, the last column (the file name) can contain spaces. The total
number of columns must therefore be specified to know where the last column
begins.
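
A minimal sketch of that idea, not a definitive answer: pass the number of leading columns and split only that many times, so whatever remains becomes the final column with its spaces intact. The function name to_tsv and the sample line are hypothetical; plain awk is used here, and gawk should behave the same.

```shell
# to_tsv N: read space-separated lines on stdin and emit TSV.
# N is the number of leading columns; the rest of each line is kept
# verbatim as the last column. (to_tsv is a hypothetical helper name.)
to_tsv() {
    awk -v n="$1" '
    {
        line = $0
        sub(/^ +/, "", line)              # drop leading padding
        out = ""
        for (i = 1; i <= n; i++) {
            if (!match(line, / +/))       # fewer columns than expected
                break
            out = out substr(line, 1, RSTART - 1) "\t"
            line = substr(line, RSTART + RLENGTH)
        }
        print out line                    # remainder is the last column
    }'
}

# Usage: three numeric columns, then a file name that contains a space.
printf '  3  12  70 my file.txt\n' | to_tsv 3
```

Only the first n runs of spaces are turned into tabs, so spaces inside the last column are never touched.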

On Tue, Feb 11, 2020 at 3:11 AM Peter Brooks <address@hidden>
wrote:

> Unfortunately, that doesn't work for embedded spaces:
>
> echo '  a    "b b"  cc    dd' | gawk -v OFS='\t' '{ $1 = $1 ; print }' | od -c
> 0000000   a  \t   "   b  \t   b   "  \t   c   c  \t   d   d  \n
> 0000016
>
>
> On Tue, 11 Feb 2020 at 08:26, <address@hidden> wrote:
>
>> $ echo '  a    bb  cc    dd' | gawk -v OFS='\t' '{ $1 = $1 ; print }' | od -c
>> 0000000   a  \t   b   b  \t   c   c  \t   d   d  \n
>> 0000013
>>
>> Peng Yu <address@hidden> wrote:
>>
>> > Hi,
>> >
>> > Many programs (such as wc and ps) print results in tables with one or
>> > more spaces as separators, but the last column may itself contain
>> > spaces. To process the output of wc, I came up with the following code
>> > (sometimes I need to manually change the display name, such as
>> > "file1"). But it is too verbose.
>> >
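>> > BEGIN {
>> >       OFS = "\t"
>> >       # Arguments are display names, not input files: save them.
>> >       for(i=1;i<ARGC;++i) {
>> >               fnames[i] = ARGV[i]
>> >       }
>> >       nfiles = ARGC - 1
>> >       # Clear ARGV so that awk reads standard input.
>> >       delete ARGV
>> > }
>> > {
>> >       # Strip leading spaces.
>> >       match($0, /^[ ]*/)
>> >       line = substr($0, RSTART+RLENGTH)
>> >       NF = 1
>> >       # Peel off the first n columns (n is set with -v n=...).
>> >       for(i=1; i<=n; ++i) {
>> >               if(match(line, /[ ]+/)) {
>> >                       $i = substr(line, 1, RSTART-1)
>> >                       line = substr(line, RSTART+RLENGTH)
>> >               }
>> >       }
>> >       # Last column: a saved display name for the first nfiles lines,
>> >       # otherwise the rest of the line (line "" forces a string test).
>> >       if(NR <= nfiles) {
>> >               $i = fnames[NR]
>> >       } else {
>> >               if(line "") $i = line
>> >       }
>> >       print
>> > }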
>> >
>> > $ awk -v n=2 -f ./wc.awk file1 <<EOF
>> >  a bb c
>> > aa  b c
>> > EOF
>> >
>> > $ awk -v n=3 -f ./wc.awk <<EOF
>> >  a bb c
>> > EOF
>> >
>> > What is the most succinct way to convert this kind of input to TSV
>> > format with gawk? Thanks.
>> >
>> > --
>> > Regards,
>> > Peng
>>
>>
>
> --
> Peter Brooks
>
> Skype:  Fustbariclation
> Twitter: Fustbariclation
> Author Page: amazon.com/author/peter_brooks
>
-- 
Regards,
Peng

