bug-guix

From: adanskana
Subject: bug#71238: Installer image consistently fails to run system init due to TLS error
Date: Mon, 10 Jun 2024 05:33:50 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.15.0

Hi all,
On 29/05/2024 01:44, Richard Sent <richard@freakingpenguin.com> wrote:
> Richard Sent <richard@freakingpenguin.com> writes:
>
>> 1. There was a transient network issue for ~3 hours when I attempted to
>> install Guix ~4 times using different installation media that caused a
>> specific TLS handshake to fail.
>>
>> 2. A specific TLS handshake Guix undertakes during the installation
>> process fails to pass one of the built-in firewall rules shipped with
>> opnsense.
>>
>> 3. Some other odd aspect of my network messes things up for a specific
>> TLS handshake.
>>
>> My money is on 2 given how this is a seemingly common issue on
>> enterprise networks [1] and the rules I have added seem irrelevant. (I'd
>> rather not talk openly about my firewall rules in an archived public
>> forum, but can discuss off-list). However, there is another comment in
>> that thread that says IT didn't notice any firewall blocking.
>
> I ran the 1.4.0 installer again today behind my opnsense router and it
> completed successfully, which is horrifying. I was hoping starting from
> a constant image would make the error reproducible but that doesn't seem
> to be the case. Even with a consistent system image and network, it's
> only reproducible for somewhere between a few hours and one day. Perhaps
> server load plays a part?
>
> (Technically my process was a little bit different. Instead of fully
> completing the graphical installer I swapped to a TTY after activating
> the wired connection, mounted the root fs, and run $ guix system build
> /mnt/etc/config.scm, where config.scm was unmodified since initial
> installation. I'd be stunned if this caused the change in behavior but
> figured I'd mention for completeness.)



I've managed to reproduce this bug. First, I run `sudo guix gc delete-generations &&
guix gc -d 2w` to clear my store. Then I run `guix upgrade && sudo guix system -L
/home/ada/dotfiles/guix/ reconfigure --fallback
/home/ada/dotfiles/guix/ada/system/kissakoira.scm` to redownload all of the deleted store
items. About 9 times out of 10, the process fails partway through the upgrade. A retry then
works without a hitch. Even re-gc-ing my system will not let me reproduce the bug again; I
need to restart my system. After that, the likelihood that it works is about 7/10 until the
next day (just my perception). By the way, this is on my university's network.
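
Since the rates above are only impressions, here is a minimal sketch for recording the outcome of each attempt so it can be tallied across reboots. It assumes a POSIX shell with `awk` and `date`; `record_attempt`, `tally_attempts`, and the log file name in the usage line are hypothetical and not part of Guix.

```shell
# Hypothetical helper: run one attempt and append its outcome to a log file,
# so results survive a reboot and can be counted later.
# Usage: record_attempt LOGFILE COMMAND [ARGS...]
record_attempt() {
    log=$1
    shift
    if "$@"; then
        result=ok
    else
        result=fail
    fi
    # One line per attempt: UTC timestamp, then "ok" or "fail".
    printf '%s %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$result" >> "$log"
}

# Print "FAILED/TOTAL" for a log written by record_attempt.
tally_attempts() {
    awk '{ total++ } $2 == "fail" { failed++ } END { printf "%d/%d\n", failed, total }' "$1"
}
```

For example, `record_attempt guix-tls.log guix upgrade` after each restart, then `tally_attempts guix-tls.log` once a few attempts have accumulated.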

I managed to capture the problem under strace using this command:
`strace -ff -tt -o log_up.strace -s 500 guix upgrade && sudo strace -ff -tt -o log_sr.strace
-s 500 sudo guix system -L /home/ada/dotfiles/guix/ reconfigure --fallback
/home/ada/dotfiles/guix/ada/system/kissakoira.scm`
I've uploaded the logs to my Google Drive [1]. You can use `strace-log-merge log_up.strace`
to view the merged logs.
As I can now reproduce this error fairly consistently, please let me know if
you want me to run any other tools to capture more data.
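
For anyone reading the logs, a rough sketch of narrowing strace output down to failed network-related system calls. It assumes strace's usual `= -1 ERRNO` suffix on failed calls; the list of syscalls is a guess at what is relevant, and `failed_net_calls` is a hypothetical name. A TLS handshake can also fail on data that was read successfully, so this is only a first pass.

```shell
# Hypothetical filter: keep only failed network-related syscalls from strace
# output read on stdin. EAGAIN and EINPROGRESS are dropped because they are
# routine on non-blocking sockets.
failed_net_calls() {
    grep -E '(connect|sendto|recvfrom|sendmsg|recvmsg|poll|read|write)\(.*= -1 E' |
        grep -v -E '= -1 (EAGAIN|EINPROGRESS)'
}
```

Usage: `strace-log-merge log_up.strace | failed_net_calls`.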

Warmly,
Ada

[1] https://drive.google.com/file/d/104DVqyMLGRi4imWzvFQ6TahAiRRKdR4_/view?usp=drive_link




