help-guix

Re: Non-privileged daemons and offloading


From: Ben Woodcroft
Subject: Re: Non-privileged daemons and offloading
Date: Sat, 30 Jul 2016 14:11:39 +1000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0

Hi,

Thanks all for the informative responses.

On 20/06/16 18:05, Ludovic Courtès wrote:
Hello!

What you describe here is a hot topic and definitely a commonly
requested feature.  The difficulty here is that we’re hitting
limitations of the kernel, which requires root privileges to set up a
chroot and so on.

The way around it is Linux’ unprivileged “user namespaces”, as used by
‘guix environment --container’: they allow users to set up isolated
environments similar to what guix-daemon does, but without being root.
Unfortunately, this feature is disabled on some distros out of security
concerns (user namespaces are young and have a relatively bad track
record.)

You can check whether a system supports it like this:

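   # /proc/self/ns/user exists only if the kernel has user namespaces.
   if [ -f /proc/self/ns/user ]
   then
       # Some distros (e.g. Debian) add this switch to disable unprivileged
       # use; where it is absent, this particular restriction does not apply.
       if [ -f /proc/sys/kernel/unprivileged_userns_clone ]
       then
           if [ "$(cat /proc/sys/kernel/unprivileged_userns_clone)" -ne 0 ]
           then
               echo "unprivileged user namespaces supported"
           fi
       else
           echo "unprivileged user namespaces supported"
       fi
   fi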

I'm afraid I didn't have much luck with this:
$ uname -a
Linux euramoo3.qld.nectar.org.au 2.6.32-642.3.1.el6.x86_64 #1 SMP Tue Jul 12 18:30:56 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

They are running CentOS 6.6 because they use Rocks:
http://www.rocksclusters.org/
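
For context, unprivileged user namespaces arrived in Linux 3.8, so a 2.6.32 kernel would not be expected to pass the check. A minimal sketch of a version comparison follows; the function name `kernel_at_least` is made up for illustration, and since distro kernels carry backports the result is only a hint:

```shell
# Hypothetical helper: succeed if a `uname -r` style string is at least
# MAJOR.MINOR.  Only the first two numeric components are compared.
kernel_at_least() {
    release=$1
    want_major=$2
    want_minor=$3
    major=${release%%.*}      # "2.6.32-642.3.1.el6.x86_64" -> "2"
    rest=${release#*.}        # -> "6.32-642.3.1.el6.x86_64"
    minor=${rest%%[!0-9]*}    # -> "6"
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

if kernel_at_least "$(uname -r)" 3 8
then
    echo "kernel is new enough for unprivileged user namespaces"
else
    echo "kernel predates unprivileged user namespaces (Linux 3.8)"
fi
```

A new enough kernel is necessary but not sufficient, so the /proc check quoted above is still the one to trust.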

Regardless, it remains our best hope to support unprivileged daemons.

It would be nice to get stats on typical HPC systems.

Hardware stats can be seen here, if that is what you mean?
https://rcc.uq.edu.au/euramoo

One difficulty is that there are 3 head nodes, and the daemon can only be run on one of them. For my personal situation this isn't a big difficulty, since we can just designate one of these as the "guix" node and always log in to that one.

Roel has been looking into these issues recently, so perhaps he has some
ideas.  The Nix daemon recently switched to user namespaces:

   https://github.com/NixOS/nix/commit/c68e5913c71badc89ff346d1c6948517ba720c93

We could backport this.  However, running builds with UID 0 is
potentially disruptive: some packages are sensitive to this and behave
differently under UID 0 (I remember Coreutils’ test suite does.)  Also,
this patch switches to user namespaces, but not specifically
_unprivileged_ user namespaces.

Using offloading as you suggest doesn’t help: you would still need a
daemon with access to /gnu/store.

I'm working with a non-standard store location, so that I do have access. This means compiling everything from scratch, but then I find watching "guix build --max-jobs" very beautiful. Building also doesn't take too long after the first few packages are built, particularly in comparison to downloading and building them manually.

However, given Roel's tried and tested method I don't think I'll pursue this approach.

(Thinking out loud.)

There’s a fun hack in mind that could kinda work provided you use only
substitutes, where you wouldn’t even need a daemon:

   1. Compute the derivation of the package you want; normally that
      requires a daemon to which we make ‘add-to-store’ RPCs, but we
      should be able to fake them altogether;

   2. Use (guix scripts substitute) to download a substitute for that
      package, and unpack it under ~/.local, say;

   3. Use ‘call-with-container’ (thus, unprivileged user namespace) to
      put yourself in an environment where /gnu/store/foo inside is a
      bind-mount to ~/.local/gnu/store/foo outside.

There would remain the problem of profiles and grafts, which are normal
derivations.

When you think about it, it amounts to reimplementing (part of) the
daemon functionality as a library, which is probably the way to go.
That is, we could implement ‘add-to-store’ and ‘build-derivations’ such
that they would operate locally under ~/.local.  As a first milestone,
‘build-derivations’ could fail unless there’s a substitute available.

Food for thought!

Indeed, clever.


On 20/06/16 22:23, Ludovic Courtès wrote:
address@hidden (Ludovic Courtès) skribis:

Regardless, it remains our best hope to support unprivileged daemons.
Also, I did not explicitly mention it, but I think this unprivileged
user namespace thing should just be one part of the strategy.

I agree supporting non-privileged, non-container setups would be good. OTOH, I cannot think of a way to do this that doesn't require a separate build machine or a way that supports "guix environment".

In parallel, it’s worth discussing with cluster sysadmins and see if
they can have guix-daemon running on the cluster.  There are good
reasons for them to do that compared to letting each user do their own
thing, and one of them is improved resource usage.  Ricardo outlined the
setup he came up with on a cluster here:

   http://elephly.net/posts/2015-04-17-gnu-guix.html

and we have a bunch of arguments in store ;-):

   https://hal.inria.fr/hal-01161771/en

In the end, for a sysadmin, it’s a cost/benefit tradeoff.  In some
situations, providing Guix may mean much less work for cluster admins.

I haven't had much luck with this so far, though there is some interest. I'm hoping to convince the sysadmins that Guix is mature enough software by running my own setup for a while. The difference between the number of bioinformatics packages available in Guix versus the number of available modules in the HPC reflects very well on us here - that would be a primary advantage. On a separate cluster where I do have root, we've also had good success in adapting profiles to sit behind modulefiles so that users are unaware that behind the scenes some packages have been transitioned to being built by Guix, especially as we upgrade the OS.


On 20/06/16 19:06, Roel Janssen wrote:
Hello Ben,

It seems like we are facing a similar problem.  A proper solution takes
a lot more work and a lot more time I believe.  I am also currently
working on a more complete guide to do this, but here I tried to get the
essentials written down.

As far as software deployment goes, I have done the following to get it
on the restricted environment (in my case a cluster, in your case a
supercomputer):

Actually I misspoke, it is a cluster. I followed your instructions and it is working well, thanks! I would be happy to contribute to a guide on doing this if that is of use.

1. Get /gnu/store, or bootstrap your own store with a custom prefix
(I've done the latter) on a VM or a machine that has super user
privileges (let's call this the "build host").

For a custom prefix, you need to build guix from source with:

   ./configure --with-store-dir=/hpc/custom/guix-store \
               --localstatedir=/hpc/custom/guix-state/guix

You should change the environment variables: NIX_STATE_DIR and
NIX_STORE_DIR, before running the daemon, and before running the guix
command as a user.  In my case, I used:

   export NIX_STATE_DIR=/hpc/custom/guix-state/guix
   export NIX_STORE_DIR=/hpc/custom/guix-store
   guix-daemon --cores=4 --max-jobs=4 --no-substitutes \
               --build-users-group=guixbuild

I didn't find exporting those environment variables necessary, except when running guix and guix-daemon on the restricted machine.
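
One way to keep those exports in one place on the restricted machine is a small file to source before running guix or guix-daemon. This is only a sketch: the function name `guix_custom_env` is hypothetical, and the paths are the example ones from this thread.

```shell
# Hypothetical helper: point guix and guix-daemon at a custom store prefix.
# The default prefix mirrors the example in this thread; adjust for your site.
guix_custom_env() {
    prefix=${1:-/hpc/custom}
    export NIX_STATE_DIR="$prefix/guix-state/guix"
    export NIX_STORE_DIR="$prefix/guix-store"
}

guix_custom_env /hpc/custom
echo "state: $NIX_STATE_DIR"
echo "store: $NIX_STORE_DIR"
```

The values have to match what was passed to ./configure on the build host, otherwise the store paths baked into the binaries will not resolve.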

2. Build the packages you want to deploy on the HPC on the build host.

   export NIX_STATE_DIR=/hpc/custom/guix-state/guix
   export NIX_STORE_DIR=/hpc/custom/guix-store
   guix package -i <anything-you-need>


3. Copy the store and profiles.  This is a bit more tricky.  In my case,
hardlinks would not work because of the properties of our storage
system.  I used the following to copy the store and the profile (and
update it later on):

   rsync -lrt --delete --exclude=.links /hpc/custom/guix-store \
         address@hidden:/hpc/custom
   rsync -lrt --delete --exclude=.links /hpc/custom/guix-state \
         address@hidden:/hpc/custom

I simplified this by putting all the things to rsync in a single folder so that a single call was needed, and I found "-z" helped too.
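
A rough sketch of what that single call can look like. It assumes /hpc/custom holds nothing but the Guix store and state directories (--delete removes anything on the remote side that is missing locally). The helper name `guix_rsync_cmd` and the `user@cluster` destination are placeholders, since the real address is redacted in the archive.

```shell
# Hypothetical helper: print the rsync command for a one-shot sync so it
# can be reviewed before it is run.
#   $1  source directory, without a trailing slash (copies the directory itself)
#   $2  remote login, e.g. user@cluster (placeholder)
#   $3  parent directory on the remote side
guix_rsync_cmd() {
    printf 'rsync -lrtz --delete --exclude=.links %s %s:%s\n' "$1" "$2" "$3"
}

guix_rsync_cmd /hpc/custom user@cluster /hpc
```

Adding --dry-run to the printed command shows what would be transferred or deleted without touching the remote side.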

I excluded the .links directory to save space (you could copy them as
normal files instead of hardlinks, and the size of your store will
double).  Without this directory, you cannot efficiently do package
management, so don't remove it on the build host.

I didn't use the offloading mechanism on Guix.  I avoid using the
guix-daemon entirely, and reduce the deployment problem to an rsyncable
thing.

From here on, you can run programs as usual by adding
/hpc/custom/guix-state/guix/profiles/per-user/<username>/guix-profile/bin
to your path (and the other relevant environment variables).
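
As an illustration, the PATH part could go in a shell startup file along these lines. This is a sketch only: the helper name `guix_profile_bin` is invented here, the state directory is the example one from this thread, and the other search-path variables are not covered.

```shell
# Hypothetical helper: print the bin directory of a user's default profile.
#   $1  user name
#   $2  state directory (optional; defaults to the example path above)
guix_profile_bin() {
    state_dir=${2:-/hpc/custom/guix-state/guix}
    printf '%s/profiles/per-user/%s/guix-profile/bin\n' "$state_dir" "$1"
}

# Prepend the current user's profile to PATH.
PATH="$(guix_profile_bin "$(id -un)"):$PATH"
export PATH
```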

If you install guix in your store, you can run guix-daemon on the
restricted machine and get 'guix package --search-paths', 'guix graph
...' and even 'guix gc' to work.  I haven't tested the other commands
yet.

I take it by 'guix gc' you mean 'guix gc --references' since you don't want to modify the store, correct? I only tried 'guix package --search-paths', which is probably the most important for users.

Thanks,
ben


