Re: Clang-built Gawk 5.2.1 regex oddity
From: Paul Eggert
Subject: Re: Clang-built Gawk 5.2.1 regex oddity
Date: Sun, 1 Jan 2023 22:10:28 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.4.2
This is a serious bug in Clang: it generates incorrect machine code.
The code that Clang generates for the following (gawk/support/dfa.c
lines 1141-1143):
    ((dfa->syntax.dfaopts & DFA_CONFUSING_BRACKETS_ERROR
      ? dfaerror : dfawarn)
     (_("character class syntax is [[:space:]], not [:space:]")));
is immediately followed by the code generated for the following
(gawk/support/dfa.c line 1015):
    dfaerror (_("invalid character class"));
and this is incorrect because the two source code regions are not
connected with each other.
You can see the bug in the attached (compressed) file dfa.s which
contains the assembly language output. Here's the dfa.s file starting
with line 6975:
6975 testb $4, 456(%r12)
6976 movl $dfawarn, %eax
6977 movl $dfaerror, %ebx
6978 cmoveq %rax, %rbx
6979 movl $.L.str.26, %esi
6980 xorl %edi, %edi
6981 movl $5, %edx
6982 callq dcgettext
6983 movq %rax, %rdi
6984 callq *%rbx
6985 .LBB34_144:
6986 movl $.L.str.25, %esi
6987 xorl %edi, %edi
6988 movl $5, %edx
6989 callq dcgettext
6990 movq %rax, %rdi
6991 callq dfaerror
Line 6984, which is the call to either dfaerror or dfawarn from source
lines 1141-1143, is immediately followed by the code for source line
1015. This means that at runtime, when dfawarn returns, the code falls
through and immediately calls dfaerror, which is incorrect.
My guess is that Clang got confused because dfaerror is declared
_Noreturn, so Clang mistakenly assumed that dfawarn is also _Noreturn,
which it is not.
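
To make the pattern concrete, here is a minimal sketch of the same
construct outside of dfa.c. The names report_error, report_warning,
check_bracket and warnings_seen are hypothetical stand-ins, not Gawk or
Gnulib code, and I am not claiming that this small example triggers the
miscompilation. It only shows what a correct compiler must do: when the
conditional expression selects the function that is not _Noreturn,
control has to continue with the statement after the call.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for dfaerror and dfawarn; not the Gawk code.  */
static int warnings_seen;

/* Like dfaerror: declared _Noreturn, and it really never returns.  */
static _Noreturn void
report_error (char const *msg)
{
  assert (msg != NULL);
  fprintf (stderr, "error: %s\n", msg);
  exit (EXIT_FAILURE);
}

/* Like dfawarn: an ordinary function that returns to its caller.  */
static void
report_warning (char const *msg)
{
  assert (msg != NULL);
  fprintf (stderr, "warning: %s\n", msg);
  warnings_seen++;
}

/* Select the callee with ?: in the same style as dfa.c.  When STRICT
   is zero, report_warning is called and returns, so control must reach
   the return statement.  Return 1 to show that it got there.  */
static int
check_bracket (int strict)
{
  (strict ? report_error : report_warning)
    ("character class syntax is [[:space:]], not [:space:]");
  return 1;
}
```

With a correct compiler, check_bracket (0) prints the warning and
returns 1. If the compiler treated the indirect call as _Noreturn, as
the dfa.s listing suggests happened in dfa.c, control would instead fall
into whatever code happened to be placed next.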
I worked around the Clang bug by installing the attached patch into
Gnulib. Please give it a try with Gawk.
Incorrect code generation is a serious bug in Clang; can you please
report it to the Clang folks? I am considering using a bigger hammer,
and doing this:
#define _Noreturn /*empty*/
whenever Clang is used, until the bug is fixed.
This is because if the bug occurs here, similar bugs are likely to
occur elsewhere, and this sort of thing can be really subtle and hard
to catch or work around in general. Clang really needs to get this
fixed.
Thanks.
dfa.s.gz
Description: application/gzip
0001-dfa-work-around-Clang-15-bug.patch
Description: Text Data