Re: gawk 5.2.2 fatal crash when closing a two-way pipe for a process tha
From: Andrew J. Schorr
Subject: Re: gawk 5.2.2 fatal crash when closing a two-way pipe for a process that does not have a pid anymore
Date: Thu, 24 Aug 2023 10:55:23 -0400
User-agent: Mutt/1.5.21 (2010-09-15)
Hi,
I made a reproducer. It's not so hard. :-)
Using the master branch:
bash-4.2$ cat /tmp/bug.gawk
BEGIN {
cmd = "ssh `hostname` uptime"
print "hello" |& cmd
system("ps -ef | grep ssh")
print "sleeping while waiting for ssh to exit"
sleep(1)
print "another write after process is gone" |& cmd
system("ps -ef | grep ssh")
print "closing now"
close(cmd)
}
bash-4.2$ ./gawk -l extension/.libs/time.so -f /tmp/bug.gawk
schorr 3684 1 0 2021 ? 00:00:00 ssh-agent
root 19396 26130 0 10:34 ? 00:00:00 sshd: schorr [priv]
schorr 19399 19396 0 10:34 ? 00:00:01 sshd: schorr
schorr 22087 22086 0 10:48 pts/9 00:00:00 ssh ti139 uptime
schorr 22088 22086 0 10:48 pts/9 00:00:00 sh -c ps -ef | grep ssh
schorr 22091 22088 0 10:48 pts/9 00:00:00 grep ssh
root 24832 26130 0 Apr14 ? 00:00:00 sshd: schorr [priv]
schorr 24834 24832 0 Apr14 ? 00:00:05 [sshd] <defunct>
root 26130 1 0 2021 ? 00:00:37 /usr/sbin/sshd -D
sleeping while waiting for ssh to exit
schorr 3684 1 0 2021 ? 00:00:00 ssh-agent
root 19396 26130 0 10:34 ? 00:00:00 sshd: schorr [priv]
schorr 19399 19396 0 10:34 ? 00:00:01 sshd: schorr
schorr 22087 22086 1 10:48 pts/9 00:00:00 [ssh] <defunct>
schorr 22121 22086 0 10:48 pts/9 00:00:00 sh -c ps -ef | grep ssh
schorr 22123 22121 0 10:48 pts/9 00:00:00 grep ssh
root 24832 26130 0 Apr14 ? 00:00:00 sshd: schorr [priv]
schorr 24834 24832 0 Apr14 ? 00:00:05 [sshd] <defunct>
root 26130 1 0 2021 ? 00:00:37 /usr/sbin/sshd -D
closing now
gawk: /tmp/bug.gawk:10: fatal: flush to "ssh `hostname` uptime" failed: reason unknown
The defunct process (pid 22087) doesn't seem to be relevant. As you noted,
gawk gives a fatal error.
I'm not sure that this is actually a bug. Have you considered using
non-fatal I/O?
If I add 'PROCINFO["NONFATAL"] = 1' to the script, it no longer gives
a fatal error. Or limit it to the command in question:
BEGIN {
cmd = "ssh `hostname` uptime"
PROCINFO[cmd, "NONFATAL"] = 1
print "hello" |& cmd
system("ps -ef | grep ssh")
print "sleeping while waiting for ssh to exit"
sleep(1)
print "another write after process is gone" |& cmd
system("ps -ef | grep ssh")
print "closing now"
close(cmd)
}
Why do you think it's a bug?
Regards,
Andy
On Thu, Aug 24, 2023 at 01:31:24PM +0000, Finn Magnusson wrote:
> Hi
> I wish I could manage to reproduce the issue with a simple recipe.
> But whichever way I try to close the process associated with the two-way
> pipe, it stays as a defunct process and then the issue does not occur.
> The only way I get the issue is in my program, where I start a two-way pipe
> toward an ssh client which opens a netconf session on a remote machine. On
> the remote machine, I issue a command to close the netconf session. This
> causes the ssh client to close down completely on my machine and no defunct
> process remains. Then when using the close() function in gawk to close the
> two-way pipe, it crashes because the ssh client process does not exist
> anymore, not even as a defunct process.
> So that is not so easy to reproduce outside of my environment, since the
> netconf server that I use is a proprietary system here at the company where
> I work.
> In case you make a fix I can always try it in my environment and let you know
> whether it solved the issue.
> If that is not satisfactory then feel free to discard this bug report until I
> find a way to reproduce it that could be done in any environment.
> Many thanks.
> BR
> Finn
>
> On Thursday, August 24, 2023 at 03:06:37 PM GMT+2, Andrew J. Schorr
> <aschorr@telemetry-investments.com> wrote:
>
>
> Hi,
>
> Thanks for the bug report. Can you please provide a simple recipe
> for how to reproduce this problem?
>
> Thanks,
> Andy
>
> On Thu, Aug 24, 2023 at 09:56:54AM +0000, Finn Magnusson via Bug reports only for gawk. wrote:
> > Dear gawk developers
> > I noticed the below issue in gawk 5.2.2 which was not present in the
> > previous gawk version I was using (5.1.1): when using the close() function
> > to close a two-way pipe to a process that does not have a PID anymore
> > (e.g. because the process was closed by an external command), I got the
> > below fatal crash:
> > gawk.lin64: /app/moshell/23.2h/moshell/prog.awk:19919: fatal: flush to "/app/moshell/23.2h/moshell/commonjars/ssh.lin64 -p 2022 -z '/proj/wcdma-userarea/users/eanzmagn/moshell_logfiles/logs_moshell/tempfiles/20230824-114538_6552/sshz6592' -l expert -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o HostKeyAlgorithms="ssh-dss,ssh-rsa,rsa-sha2-512,rsa-sha2-256" -o NumberOfPasswordPrompts=1 -o ConnectTimeout=10 -o ServerAliveInterval=300 -o ConnectionAttempts=1 -o ServerAliveCountMax=0 -o TCPKeepAlive=no -o PreferredAuthentications=publickey,password 10.136.72.120 -s netconf 2>&1" failed: reason unknown
> >
> > I was able to solve it by commenting out the below efflush statement in
> > gawk-5.2.2/io.c:
> >     /* flush before closing to leverage special error handling */
> >     efflush(rp->output.fp, "flush", rp);
> > Is it possible to make a fix for this in a coming gawk release?
> > Many thanks.
> > BR
> > Finn