Re: Insertion of extra OFS character into output string
From: H
Subject: Re: Insertion of extra OFS character into output string
Date: Tue, 14 Mar 2023 15:06:26 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.9.1
On 03/14/2023 02:58 AM, Neil R. Ormos wrote:
> H wrote:
>> "Neil R. Ormos" wrote:
>>> H wrote:
>>>> I am a newcomer to awk and have run into an
>>>> issue I have not figured out yet... My
>>>> platform is CentOS 7 running awk 4.0.2, the
>>>> default version. [...]
>>> I don't have 4.0.2 available to test, but I
>>> tested with older and newer versions.
>>> When I test, I get the result I think I expect
>>> from the code you posted. [...]
>>> It would be easier to help if you would please provide:
>>> the simplest input line that reproduces the problem;
>>> the output you expect; and
>>> the output you are getting.
>> I am not on my computer but typing this on my
>> phone. With that caveat, a /minimal/ example
>> would be:
>> echo "Alpha,Beta,Charlie,Delta" | awk 'BEGIN{FS=",";
>> FPAT="([^,]*)|(\"[^\"]+\")"; OFS="\t"} {$1=$1; gsub(/"/, ""); print}'
>> I would expect to see:
>> Alpha<TAB>Beta<TAB>Charlie<TAB>Delta
>> but instead see
>> Alpha<TAB><TAB>Beta<TAB>Charlie<TAB>Delta
>> If you change $1=$1 to $2=$2 you will find that the extra tab character then
>> moves to the next field.
>> Can anyone try this with the most recent version of awk?
> I tested with four versions of Gawk:
> GNU Awk 3.1.7
> GNU Awk 4.1.1
> GNU Awk 4.1.4
> GNU Awk 5.2.0
>
> and among those versions was able to reproduce the behavior that is vexing
> you only in version 4.1.1.
>
> It appears that issue was fixed no later than version 4.1.4.
>
> Version 5.2.0 is fairly recent but not the latest, and, in any case, does not
> exhibit the problem you have experienced.
>
>> I believe I had also tried without the
>> definition of FS with the same result. Finally,
>> note that the FPAT expression comes from the awk
>> documentation and is thus expected to work.
> I wasn't saying that setting FS was causing the problem. Just that setting
> FS would be overridden by the subsequent setting of FPAT.
>
> ========================================
>
> gawk --version | head -1
> GNU Awk 3.1.7
>
> echo "Alpha,Beta,Charlie,Delta" | gawk 'BEGIN{FS=",";
> FPAT="([^,]*)|(\"[^\"]+\")"; OFS="\t"} {$1=$1; gsub(/"/, ""); print}' |
> hexdump $hexdumparg:q
> 0 0 | 41 6c 70 68 61 09 42 65 | 065 108 112 104 097 009 066 101 |
> A l p h a \t B e
> 8 8 | 74 61 09 43 68 61 72 6c | 116 097 009 067 104 097 114 108 |
> t a \t C h a r l
> 10 16 | 69 65 09 44 65 6c 74 61 | 105 101 009 068 101 108 116 097 |
> i e \t D e l t a
> 18 24 | 0a | 010 |
> \n
>
> ========================================
>
> gawk --version | head -1
> GNU Awk 4.1.1, API: 1.1 (GNU MPFR 3.1.2-p3, GNU MP 6.0.0)
>
> echo "Alpha,Beta,Charlie,Delta" | gawk 'BEGIN{FS=",";
> FPAT="([^,]*)|(\"[^\"]+\")"; OFS="\t"} {$1=$1; gsub(/"/, ""); print}' |
> hexdump $hexdumparg:q
> 0 0 | 41 6c 70 68 61 09 09 42 | 065 108 112 104 097 009 009 066 |
> A l p h a \t \t B
> 8 8 | 65 74 61 09 43 68 61 72 | 101 116 097 009 067 104 097 114 |
> e t a \t C h a r
> 10 16 | 6c 69 65 09 44 65 6c 74 | 108 105 101 009 068 101 108 116 |
> l i e \t D e l t
> 18 24 | 61 0a | 097 010 |
> a \n
>
> ========================================
>
> gawk --version | head -1
> GNU Awk 4.1.4, API: 1.1 (GNU MPFR 3.1.5, GNU MP 6.1.2)
>
> echo "Alpha,Beta,Charlie,Delta" | gawk 'BEGIN{FS=",";
> FPAT="([^,]*)|(\"[^\"]+\")"; OFS="\t"} {$1=$1; gsub(/"/, ""); print}' |
> hexdump $hexdumparg:q
> 0 0 | 41 6c 70 68 61 09 42 65 | 065 108 112 104 097 009 066 101 |
> A l p h a \t B e
> 8 8 | 74 61 09 43 68 61 72 6c | 116 097 009 067 104 097 114 108 |
> t a \t C h a r l
> 10 16 | 69 65 09 44 65 6c 74 61 | 105 101 009 068 101 108 116 097 |
> i e \t D e l t a
> 18 24 | 0a | 010 |
> \n
>
> ========================================
>
> gawk --version | head -1
> GNU Awk 5.2.0, API 3.2, PMA Avon 7, (GNU MPFR 3.1.5, GNU MP 6.1.2)
>
> echo "Alpha,Beta,Charlie,Delta" | gawk 'BEGIN{FS=",";
> FPAT="([^,]*)|(\"[^\"]+\")"; OFS="\t"} {$1=$1; gsub(/"/, ""); print}' |
> hexdump $hexdumparg:q
> 0 0 | 41 6c 70 68 61 09 42 65 | 065 108 112 104 097 009 066 101 |
> A l p h a \t B e
> 8 8 | 74 61 09 43 68 61 72 6c | 116 097 009 067 104 097 114 108 |
> t a \t C h a r l
> 10 16 | 69 65 09 44 65 6c 74 61 | 105 101 009 068 101 108 116 097 |
> i e \t D e l t a
> 18 24 | 0a | 010 |
> \n
>
> ========================================
>
OK, thank you for looking into this.
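
For anyone who wants to check their own installation without reading a hex dump, here is a minimal sketch. It runs the same one-liner (without the FS assignment, which FPAT overrides anyway) and counts the tab characters in the output: four fields should be joined by exactly three tabs, and an affected gawk would produce four. The helper name count_tabs is my own invention for illustration and does not come from the thread. The script also prints PROCINFO["FS"], which gawk sets to the name of the field-splitting mode in effect, to show that assigning FPAT after FS leaves FPAT in charge. It assumes gawk is on the PATH and skips the checks otherwise.

```shell
#!/bin/sh
# count_tabs: hypothetical helper (not from the thread).
# Prints the number of TAB characters read from standard input.
count_tabs() {
    # Delete everything except tabs, count what is left, and strip
    # the padding that some wc implementations add.
    tr -cd '\t' | wc -c | tr -d ' '
}

# Run the gawk checks only if gawk is installed.
if command -v gawk >/dev/null 2>&1; then
    # Which splitting mode is active when FPAT is assigned after FS?
    gawk 'BEGIN{FS=","; FPAT="([^,]*)|(\"[^\"]+\")"; print "splitting by: " PROCINFO["FS"]}'

    # The one-liner from the thread, with its output piped into the counter.
    n=$(echo "Alpha,Beta,Charlie,Delta" |
        gawk 'BEGIN{FPAT="([^,]*)|(\"[^\"]+\")"; OFS="\t"} {$1=$1; gsub(/"/, ""); print}' |
        count_tabs)

    if [ "$n" -eq 3 ]; then
        echo "ok: 3 tabs between 4 fields"
    else
        echo "affected: $n tabs between 4 fields"
    fi
else
    echo "gawk not found; skipping check"
fi
```

Going by the results quoted above, 4.1.1 should report "affected" and 3.1.7, 4.1.4 and 5.2.0 should report "ok"; I have not run this against 4.0.2 myself.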