Re: Changing $0 after gsub() breaks output
From: Yasuhiro Yamada
Subject: Re: Changing $0 after gsub() breaks output
Date: Sun, 29 Jan 2023 14:18:10 +0000
Thanks for identifying and fixing the problem.
> + * (Test case was gawk 'gusb(/./, "@") && $0=$1'). So we save
I noticed a minor typo by the way.
s/gusb/gsub/
Thanks,
On Sun, Jan 29, 2023 at 7:53 AM <arnold@skeeve.com> wrote:
>
> Hi.
>
> Thank you for the bug report and easy reproducer. The patch
> below fixes the problem. It will be in Git in the next
> day or two.
>
> Arnold
>
> Yasuhiro Yamada <yamada@gr3.ie> wrote:
>
> > Hi.
> > Executing gsub() without an action and making changes to $0 will
> > corrupt the output.
> > This behavior seems like a bug.
> >
> > $ ./gawk --version
> > GNU Awk 5.2.1, API 3.2, PMA Avon 8-g1
> > ...
> > $ echo abc | ./gawk 'gsub(".","@") && $0=$1'
> > q <===== broken output
> >
> > Interestingly, the result is different for each run.
> >
> > $ echo abc | ./gawk 'gsub(".","@") && $0=$1' | od -tx1c
> > 0000000 b0 61 99 0a
> > 260 a 231 \n
> > 0000004
> > $ echo abc | ./gawk 'gsub(".","@") && $0=$1' | od -tx1c
> > 0000000 c0 e7 f7 0a
> > 300 347 367 \n
> > 0000004
> > $ echo abc | ./gawk 'gsub(".","@") && $0=$1' | od -tx1c
> > 0000000 c0 07 14 0a
> > 300 \a 024 \n
> > 0000004
> > $ echo abc | ./gawk 'gsub(".","@") && $0=$1' | od -tx1c
> > 0000000 c0 57 76 0a
> > 300 W v \n
> > 0000004
> >
> > Older versions do NOT reproduce this issue.
> > Also, the outputs are as expected and intuitive.
> >
> > $ ./gawk --version
> > GNU Awk 4.1.4, API: 1.1
> > ...
> > $ echo abc | ./gawk 'gsub(".","@") && $0=$1'
> > @@@
> > $ echo abc | ./gawk 'gsub(".","@") && $0=$1' | od -tx1c
> > 0000000 40 40 40 0a
> > @ @ @ \n
> > 0000004
> >
> > This issue seems to occur in v4.2.0 and later.
> >
> > $ ./gawk --version
> > GNU Awk 4.2.0, API: 2.0
> > ...
> > $ echo abc | ./gawk 'gsub(".","@") && $0=$1' | od -tx1c
> > 0000000 e0 4f fc 0a
> > 340 O 374 \n
> > 0000004
> >
> > My environment is
> >
> > $ uname -a
> > Linux ip-172-31-9-222 5.4.0-1093-aws #102~18.04.2-Ubuntu SMP Wed Dec 7 00:31:59 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
> > $ gcc --version
> > gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
>
> ---------------------------------------------
> diff --git a/interpret.h b/interpret.h
> index 4540d302..5adb229b 100644
> --- a/interpret.h
> +++ b/interpret.h
> @@ -872,16 +874,30 @@ mod:
> break;
>
> case Op_assign:
> + {
> + NODE *save_lhs;
> +
> lhs = POP_ADDRESS();
> r = TOP_SCALAR();
> - unref(*lhs);
> + /*
> + * 1/2023:
> + * The old NODE pointed to by *lhs has to be freed.
> + * But we can't free it too early, in case it's both $0 and $1
> + * (Test case was gawk 'gusb(/./, "@") && $0=$1'). So we save
> + * the old one, and after the assignment, we free it, since
> + * $0 and $1 have the same stptr value but only $0 has MALLOC
> + * in the flags. Whew!
> + */
> + save_lhs = *lhs;
> if (r->type == Node_elem_new) {
> DEREF(r);
> r = dupnode(Nnull_string);
> }
> UPREF(r);
> UNFIELD(*lhs, r);
> + unref(save_lhs);
> REPLACE(r);
> + }
> break;
>
> case Op_subscript_assign:
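
For anyone reading the archive later, here is a minimal standalone sketch of the ordering issue the patch comment describes. It is not gawk code: node_t, make_owner(), make_view(), release() and assign() are hypothetical names made up for illustration, and the "owns" member only plays the role that the comment attributes to MALLOC in the flags. The point is that the right-hand side may borrow the old left-hand side's buffer, so the old value is saved first and released only after the new value has been built.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a gawk NODE; not the real struct. */
typedef struct node {
    char *stptr;   /* string data */
    size_t stlen;  /* length, excluding the terminating NUL */
    int owns;      /* nonzero: this node must free stptr */
} node_t;

/* Create a node that owns a private, NUL-terminated copy of len bytes of s. */
node_t *make_owner(const char *s, size_t len)
{
    node_t *n = malloc(sizeof *n);
    assert(n != NULL);
    n->stptr = malloc(len + 1);
    assert(n->stptr != NULL);
    memcpy(n->stptr, s, len);
    n->stptr[len] = '\0';
    n->stlen = len;
    n->owns = 1;
    return n;
}

/* Create a node that borrows part of owner's buffer, the way a field
 * can share its string pointer with the record it was split from. */
node_t *make_view(node_t *owner, size_t off, size_t len)
{
    node_t *n = malloc(sizeof *n);
    assert(n != NULL);
    n->stptr = owner->stptr + off;
    n->stlen = len;
    n->owns = 0;
    return n;
}

/* Free a node; only an owning node frees the string buffer. */
void release(node_t *n)
{
    if (n->owns)
        free(n->stptr);
    free(n);
}

/* Assign r to *lhs. r may point into the buffer owned by the old *lhs,
 * so the old node is saved and released only after r has been copied. */
void assign(node_t **lhs, const node_t *r)
{
    node_t *save_lhs = *lhs;
    *lhs = make_owner(r->stptr, r->stlen);
    release(save_lhs);
}
```

With a record holding "@@@" and a view over all three bytes, assign(&record, view) leaves the record owning a fresh "@@@". If release(save_lhs) were moved above the copy, the copy would read freed memory, which would be consistent with the output changing from run to run in the report.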