emacs-bug-tracker

bug#61646: closed (Bandwidth-induced offload timeout abort whole operati


From: GNU bug Tracking System
Subject: bug#61646: closed (Bandwidth-induced offload timeout abort whole operating)
Date: Sat, 25 Feb 2023 03:08:02 +0000

Your message dated Fri, 24 Feb 2023 22:07:43 -0500
with message-id <87y1omfnnk.fsf@gmail.com>
and subject line Re: bug#61646: Bandwidth-induced offload timeout abort whole 
operating
has caused the debbugs.gnu.org bug report #61646,
regarding Bandwidth-induced offload timeout abort whole operating
to be marked as done.

(If you believe you have received this mail in error, please contact
help-debbugs@gnu.org.)


-- 
61646: https://debbugs.gnu.org/cgi/bugreport.cgi?bug=61646
GNU Bug Tracking System
Contact help-debbugs@gnu.org with problems
--- Begin Message ---
Subject: Bandwidth-induced offload timeout abort whole operating
Date: Sun, 19 Feb 2023 22:28:16 -0500

Hi Guix,

I can reproduce this rather easily on my system:

--8<---------------cut here---------------start------------->8---
$ ./pre-inst-env guix build icedove
The following derivations will be built:
  /gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv
  /gnu/store/8zi808086b3vlfjrhdm87fgljziwdqx2-icedove-l10n-102.7.2.drv
  /gnu/store/v0sq7rb8fk36kjasb27a71z1a27wxb1s-icedove-minimal-102.7.2.drv
process 19542 acquired build slot '/var/guix/offload/localhost:6666/0'
normalized load on machine 'localhost' is 0.08
building /gnu/store/8zi808086b3vlfjrhdm87fgljziwdqx2-icedove-l10n-102.7.2.drv...
process 19548 acquired build slot '/var/guix/offload/localhost:6666/1'
normalized load on machine 'localhost' is 0.08
building /gnu/store/v0sq7rb8fk36kjasb27a71z1a27wxb1s-icedove-minimal-102.7.2.drv...
guix offload: sending 1 store item (558 MiB) to 'localhost'...
exporting path `/gnu/store/bwb5hcdyzgq16kmbsva7ax0zq6lzg78z-icedove-102.7.2.tar.xz'
guix offload: error: failed to connect to 'localhost': Timeout connecting to localhost
cannot build derivation `/gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv': 1 dependencies couldn't be built
guix build: error: build of `/gnu/store/l6r93asndd0kwv7024iyrl71zd0lbpbq-icedove-102.7.2.drv' failed
--8<---------------cut here---------------end--------------->8---

The third derivation tries to get a build slot and times out, because
the first two have already saturated the bandwidth of the link and it
takes more time than expected to get a reply.

The workaround is to use '-k', for "--keep-going", and retry the
3rd failing derivation after the first two have completed.
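
The workaround above can be sketched as a small shell wrapper.  This is
only an illustration: the function name and the GUIX variable are
hypothetical and not part of Guix; the only Guix feature it relies on is
the '--keep-going' ('-k') build option.

```shell
#!/bin/sh
# GUIX is an assumption made for this sketch: set it to the guix command
# to run, for example GUIX="./pre-inst-env guix".
GUIX=${GUIX:-guix}

# Hypothetical helper: build with --keep-going, then retry once if the
# first pass reported a failure.
build_keep_going_then_retry() {
    # First pass: --keep-going lets the other derivations finish even if
    # one of them fails (for example on an offload connection timeout).
    if $GUIX build --keep-going "$@"; then
        return 0
    fi
    # Second pass: the link should now be idle, so retry what is left.
    echo "first pass failed; retrying: $*" >&2
    $GUIX build "$@"
}

# Example call (not executed here):
#   build_keep_going_then_retry icedove
```

The second pass only rebuilds what is still missing, since derivations
that completed in the first pass are already in the store.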

I don't have a clear idea of how to improve the situation other than using
longer timeouts... but perhaps these timeouts could be dynamic, based on
the load of the network/CPU?
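
To make the "dynamic timeout" idea concrete, here is a rough sketch of
one possible scaling rule.  Everything in it is an assumption for
illustration: the function name, the formula (base timeout multiplied by
one plus the normalized load) and the cap are not taken from Guix, and
it uses CPU load only as a stand-in for "how busy the machine is".

```shell
#!/bin/sh
# Hypothetical helper: scaled_timeout BASE LOAD NCPU MAX
#   BASE  base timeout in seconds (e.g. 10)
#   LOAD  current load average (e.g. the first field of /proc/loadavg)
#   NCPU  number of CPUs, assumed to be greater than zero
#   MAX   upper bound on the result, in seconds
# Prints BASE * (1 + LOAD / NCPU), truncated to an integer and capped at MAX.
scaled_timeout() {
    awk -v base="$1" -v load="$2" -v ncpu="$3" -v max="$4" 'BEGIN {
        t = base * (1 + load / ncpu)
        if (t > max) t = max
        printf "%d\n", t
    }'
}

# Example call on a GNU/Linux system (not executed here):
#   scaled_timeout 10 "$(cut -d" " -f1 /proc/loadavg)" "$(nproc)" 60
```

A rule like this would still need a measure of link saturation to
address the case reported here, where the bottleneck is bandwidth rather
than CPU.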

-- 
Thanks,
Maxim



--- End Message ---
--- Begin Message ---
Subject: Re: bug#61646: Bandwidth-induced offload timeout abort whole operating
Date: Fri, 24 Feb 2023 22:07:43 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Hello,

Ludovic Courtès <ludo@gnu.org> writes:

[...]

> Weird.  Since it’s a timeout while connecting, I suppose the patch
> below would improve the situation:
>
> diff --git a/guix/scripts/offload.scm b/guix/scripts/offload.scm
> index 578b3b9888..90cf97401c 100644
> --- a/guix/scripts/offload.scm
> +++ b/guix/scripts/offload.scm
> @@ -220,7 +220,7 @@ (define* (open-ssh-session machine #:optional max-silent-time)
>          (session (make-session #:user (build-machine-user machine)
>                                 #:host (build-machine-name machine)
>                                 #:port (build-machine-port machine)
> -                               #:timeout 10       ;initial timeout (seconds)
> +                               #:timeout 30       ;initial timeout (seconds)
>                                 ;; #:log-verbosity 'protocol
>                                 #:identity (build-machine-private-key machine)

Never mind my previous message; --sysconfdir had not been set, so my
offload setup (/etc/guix/machines.scm) was ignored.  The following
command worked to test the change from the local machine:

--8<---------------cut here---------------start------------->8---
sudo -E ./pre-inst-env ./guix-daemon --build-users-group guixbuild \
 --max-silent-time 0 --timeout 0 --log-compression none --discover=yes \
 --substitute-urls "https://ci.guix.gnu.org https://bordeaux.guix.gnu.org" \
 --max-jobs=4
--8<---------------cut here---------------end--------------->8---

I pushed the fix in commit 53d718f61b.

Closing, thank you!

-- 
Thanks,
Maxim


--- End Message ---
