bug-parallel

Re: GNU Parallel Bug Reports Signal SIGCHLD received, but no signal hand


From: Rick Masters
Subject: Re: GNU Parallel Bug Reports Signal SIGCHLD received, but no signal handler set
Date: Sat, 22 Oct 2016 22:40:44 +0000


Ole,

The time varies, and it may take more than 5 minutes. Today I ran it repeatedly; here are the times it took to fail (minutes:seconds):
0:56
6:27
14:19
0:49
2:51
3:45
10:29
1:32
5:16
1:55

That seems a little longer than I remember from last time, but maybe I was lucky before. You might want to let it run overnight to make sure you've given it enough time. If that doesn't do it, then something else must differ between our setups.
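
For reference, the ten times listed above can be summarized with a short script. This is only a minimal sketch and not part of the original test; the function name summarize and the 300-second threshold (the 5-minute window mentioned in the quoted question) are choices made here for illustration.

```shell
#!/bin/bash
# Times to failure reported above, as minutes:seconds.
times="0:56 6:27 14:19 0:49 2:51 3:45 10:29 1:32 5:16 1:55"

# Convert each time to seconds, then report the mean, the longest run,
# and how many runs outlasted a 5-minute (300 s) window.
# "summarize" is a hypothetical helper name, not from the original scripts.
summarize() {
  tr ' ' '\n' | awk -F: '
    { s = $1 * 60 + $2; sum += s; if (s > max) max = s; if (s > 300) over++ }
    END {
      mean = int(sum / NR)
      printf "runs=%d mean=%d:%02d max=%d:%02d over_5min=%d\n",
             NR, mean / 60, mean % 60, max / 60, max % 60, over
    }'
}

echo "$times" | summarize
```

For these ten runs it prints runs=10 mean=4:49 max=14:19 over_5min=4, so a single 5-minute run would have missed the failure in 4 of the 10 cases.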

The host details for me are:
Windows 7 on an Intel i7 4770 (4 core) processor with 32 GB of memory.
VirtualBox 5.1.8 using the stock CentOS 6.8 32-bit VM from osboxes.org, as you suggested.

I launched the VM with 1 processor and 512 MB memory. I had to click the "Enable PAE" setting on the CPU for it to boot correctly.
After boot, I run two instances of Terminal.
On the first terminal, I run sigtest.pl. On the second terminal, I run sigsend.sh.

sigtest.pl is:

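#!/bin/perl

# Install and then delete a SIGCHLD handler in a tight loop while
# sigsend.sh keeps sending SIGCHLD to this process.
while (1) {
  print "adding handler\n";
  $SIG{CHLD} = sub { print "gotchild\n"; };
  print "deleting handler\n";
  delete $SIG{CHLD};
}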

sigsend.sh is:

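#!/bin/bash

# Look up the PID of the running sigtest.pl owned by user osboxes.
sigpid=$(ps -u osboxes | grep sigtest | awk '{print $1}')
echo $sigpid
# Send SIGCHLD in a tight loop until kill fails, i.e. the process is gone.
while kill -SIGCHLD $sigpid; do
  true
done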

I made both scripts executable and ran them directly like:
./sigtest.pl
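
The time to failure can also be measured automatically instead of by watching the clock. The sketch below is an assumption-laden illustration, not something used in the original test: the helper names fmt_mmss and time_until_exit are made up here, and it assumes sigtest.pl exits when the failure occurs.

```shell
#!/bin/bash
# Print a duration given in whole seconds as minutes:seconds.
# (fmt_mmss is a hypothetical helper name.)
fmt_mmss() {
  printf '%d:%02d\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

# Run the given command until it exits, then print the elapsed time.
# SECONDS is a bash built-in counter; assigning 0 restarts it.
# (time_until_exit is a hypothetical helper name.)
time_until_exit() {
  SECONDS=0
  "$@" || true   # ignore the exit status; the command is expected to die
  fmt_mmss "$SECONDS"
}
```

With these functions loaded in the first terminal, running time_until_exit ./sigtest.pl in place of ./sigtest.pl would print the elapsed time, in the same minutes:seconds form as the list above, once the script stops.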

It is worth noting that I ran the parallel test I described before on the VirtualBox VM, and the problem did not happen after running for several days straight. Since it reproduces so easily for me on the hardware I have available, I'm thinking the VirtualBox environment is just much less (like 100 times less) likely to reproduce the problem for some reason. The machines that I have been using are powerful, with at least 20 cores and lots of memory. The test above reproduces the problem in less than a second on those machines (but I think they run CentOS 6.6, so not exactly the same). Anyway, I'm not sure what hardware you are using, but it might not be enough.

One way to get us into the same environment with a more powerful machine would be to use a public cloud where we can spin up a specific VM flavor with multiple cores. Most clouds also support automation that would allow me to hand you an infrastructure script or template (CloudFormation, Terraform, etc.) that automatically spins up the machine and reproduces the problem. If you have an account or would be willing to use one, let me know which provider and maybe I can reproduce it for you that way.

Rick



From: address@hidden <address@hidden> on behalf of Ole Tange <address@hidden>
Sent: Saturday, October 22, 2016 8:55 AM
To: Rick Masters
Cc: address@hidden
Subject: Re: GNU Parallel Bug Reports Signal SIGCHLD received, but no signal handler set
 
On Wed, Oct 19, 2016 at 2:42 AM, Rick Masters <address@hidden> wrote:
> Ole,
>
> I was able to repro the underlying Perl issue with the CentOS 6.8 32-bit VirtualBox VM.
> I used the same perl test but I noticed that it does not repro nearly as often.
> But, I have made it happen repeatedly.

How long does it have to run? I let mine run for 5 full minutes. Is
there anything we can do to make it happen faster?

> BTW, I'm not sure if you wanted me to send you the virtual box image.
> It's gigabytes in size and would only have the two scripts in it (that you already have) so I'm assuming that's not necessary at this point.

If you can use one of the boxes on
https://virtualboxes.org/images/centos/ and just tell me what you have
changed, then that should work, too. And if that does not work then I
probably need the exact box you are using.

/Ole
