bug-parallel


From: Ole Tange
Subject: Re: GNU Parallel reports Deep recursion on subroutine "main::get_job_with_sshlogin"
Date: Thu, 20 Jan 2011 17:32:30 +0100

2011/1/20  <address@hidden>:
>> 2011/1/20  <address@hidden>:
>>
>>> Deep recursion on subroutine "main::get_job_with_sshlogin" at
>>> /usr/bin/parallel line 988, <STDIN> line 64964.
>
> Here is a test example.
>
> This is the command:
> ./generate.sh | parallel --retries 17 --sshloginfile machines -j +0
> --progress --nice 19 "echo {} | ./batch_dspath.sh" > output.txt

Great. I can reproduce the error: https://savannah.gnu.org/bugs/index.php?32191

>>> Also parallel seems to report some jobs
>>> as done although they are not done. It might be caused
>>> by the previous error.
>>
>> Please show an example that shows this behaviour.
>
> Maybe I was just confused that parallel shows the jobs as done
> but the job actually failed. So in some sense it was "done".

If a job returns an exit code other than 0 more than --retries times, it is
considered done, even though it failed.
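
To illustrate that rule, here is a minimal shell sketch. It is not the code
GNU Parallel uses (that is Perl); run_with_retries is a hypothetical helper
made up for this example, and it takes the rule above literally: a job is
finished either when it succeeds or when it has failed more than the given
number of times.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of GNU Parallel: run a command until it
# succeeds or has failed more than $1 times. Either way it ends up "done".
run_with_retries() {
    max_failures=$1
    shift
    failures=0
    while :; do
        if "$@"; then
            echo "done: succeeded after $failures failed attempt(s)"
            return 0
        fi
        failures=$((failures + 1))
        if [ "$failures" -gt "$max_failures" ]; then
            # The job is reported as done although it never succeeded.
            echo "done: gave up after $failures failed attempt(s)"
            return 1
        fi
    done
}
```

Calling `run_with_retries 2 false` runs `false` three times and then gives up
with a non-zero return code, so "done" here only means that no further
attempts will be made.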

>>> One more comment - if I kill parallel, the programs on the computers
>>> it was spreading to over ssh keep running. This
>>> might cause trouble if you need to restart the computation,
>>> as the computers you want to use are still busy with
>>> the previous, but now killed, computation.
>>
>> Please show an example that shows this behaviour.
>
> The provided example exhibits the behaviour. Only
> with sleep 10 it is not as obvious. But with sleep 1000
> the processes on the target machines just keep waiting for the sleep to finish,
> although the main command has been killed.

I can reproduce this, too: https://savannah.gnu.org/bugs/index.php?32193

The workaround for this is to kill GNU Parallel with CTRL-C.


/Ole


