guix-devel

Slurm with containers (i.e., orchestration)


From: Pjotr Prins
Subject: Slurm with containers (i.e., orchestration)
Date: Mon, 18 May 2020 07:49:00 -0500
User-agent: NeoMutt/20170113 (1.7.2)

I am looking into lightweight orchestration. One possibility is to
use Slurm with Guix containers. On a cluster that already has Guix
this is almost trivial (we use Guix containers a lot, and they are
great), and it would still allow non-container jobs.
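To make the Slurm plus Guix container idea concrete, here is a minimal
sketch in Python that only builds the command line and submits
nothing. It assumes a Guix recent enough to provide `guix shell
--container` (older releases spell it `guix environment --container
--ad-hoc`) and compute nodes that permit unprivileged user namespaces.
The function names, job name and resource values are illustrative,
not part of any existing tool.

```python
import shlex


def guix_container_command(packages, command):
    """Return an argv list that runs `command` in a Guix container.

    `packages` are the Guix packages made available inside the container.
    """
    return ["guix", "shell", "--container", *packages, "--", *command]


def sbatch_command(job_name, packages, command, cpus=1, mem="1G"):
    """Return an argv list for `sbatch` that wraps the containerised command.

    Only long-standing sbatch options are used: --job-name,
    --cpus-per-task, --mem and --wrap.
    """
    inner = shlex.join(guix_container_command(packages, command))
    return [
        "sbatch",
        f"--job-name={job_name}",
        f"--cpus-per-task={cpus}",
        f"--mem={mem}",
        f"--wrap={inner}",
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so this is safe anywhere.
    argv = sbatch_command("hello", ["coreutils"], ["echo", "hello world"])
    print(shlex.join(argv))
```

A non-container job is the same `sbatch` call without the `guix shell`
prefix, which is why mixing both kinds of jobs on one cluster is cheap.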

Once we have containers and Slurm, it should also be possible to
deploy in some cloud infrastructure, provided there are no
dependencies on the cluster itself. I think it would make a terrific
blog post if we put something like that together.

Bcbio describes an architecture that uses the Common Workflow Language
(CWL) to run pipelines with containers:

  https://bcbio-nextgen.readthedocs.io/en/latest/contents/cwl.html#running-with-cromwell-local-hpc

I am not promoting the use of this, but it shows that infrastructure
exists that can deploy workflows on containers in different setups
(Bcbio supports Slurm). I know the Guix infrastructure uses 'guix
deploy' to achieve similar roll-outs. What that lacks is the
orchestration mechanism itself, which should handle dependencies
between jobs (i.e., a workflow). The Guix Workflow Language (GWL) goes
some way, but it does not handle orchestration itself.
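As an illustration of what "handling dependencies between jobs" could
mean on top of Slurm, the sketch below orders a small workflow so that
every job comes after the jobs it depends on, and builds the matching
`--dependency=afterok:...` option for sbatch. It needs Python 3.9 or
later for `graphlib`. The workflow, the job names and the job IDs are
made up, and nothing is submitted.

```python
from graphlib import CycleError, TopologicalSorter


def submission_plan(workflow):
    """Return [(job, [dependencies])] in an order that respects dependencies.

    `workflow` maps a job name to the set of job names it depends on.
    Raises graphlib.CycleError if the dependencies form a cycle.
    """
    order = TopologicalSorter(workflow).static_order()
    return [(job, sorted(workflow.get(job, ()))) for job in order]


def dependency_flag(job_ids):
    """Return the sbatch option that waits for `job_ids` to succeed, or None."""
    if not job_ids:
        return None
    return "--dependency=afterok:" + ":".join(str(i) for i in job_ids)


if __name__ == "__main__":
    # Hypothetical three-step pipeline: fetch -> align -> call.
    workflow = {"align": {"fetch"}, "call": {"align"}}
    fake_ids = {}          # job name -> pretend Slurm job ID
    next_id = 100
    for job, deps in submission_plan(workflow):
        flag = dependency_flag([fake_ids[d] for d in deps])
        print(job, flag or "(no dependencies)")
        fake_ids[job] = next_id
        next_id += 1
```

In a real setup the pretend IDs would come from `sbatch --parsable`,
which prints the ID of the job it just submitted.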

In other words, we almost have the pieces, but one thing is
missing :). Thoughts? I know I have brought this up before in
different guises, but we are starting to really need something here.

What makes orchestration? I guess it needs a dynamic database of
machines that can execute jobs, plus some type of software registry
(Guix). Next, it should be able to schedule and execute jobs based on
constraint specifiers (such as network, CPU and RAM). It could be a
'dynamic' Slurm that makes use of both real machines and VMs, or it
could hook into an existing cloud service. A Slurm job could, for
example, send a container to a cloud service and monitor it.
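A rough sketch of the "dynamic database of machines" with constraint
matching, in Python. Everything here is hypothetical: the class names,
the fields and the first-fit placement rule are only meant to show the
shape of the problem, and network constraints are left out for
brevity. A real scheduler such as Slurm does far more.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """A real machine or VM that can currently accept jobs."""
    name: str
    free_cpus: int
    free_ram_gb: int


@dataclass(frozen=True)
class Job:
    """A job together with its resource constraints."""
    name: str
    cpus: int
    ram_gb: int


class Registry:
    """Machines may join and leave at any time (the 'dynamic' part)."""

    def __init__(self):
        self.machines = {}

    def register(self, machine):
        self.machines[machine.name] = machine

    def unregister(self, name):
        self.machines.pop(name, None)

    def place(self, job):
        """Reserve resources on the first machine that fits the job.

        Returns the machine name, or None if no machine satisfies the
        constraints right now.
        """
        for machine in self.machines.values():
            if machine.free_cpus >= job.cpus and machine.free_ram_gb >= job.ram_gb:
                machine.free_cpus -= job.cpus
                machine.free_ram_gb -= job.ram_gb
                return machine.name
        return None


if __name__ == "__main__":
    registry = Registry()
    registry.register(Machine("node1", free_cpus=4, free_ram_gb=8))
    registry.register(Machine("vm1", free_cpus=16, free_ram_gb=64))
    for job in (Job("small", 2, 4), Job("big", 8, 32), Job("huge", 32, 128)):
        print(job.name, "->", registry.place(job))
```

Machines are tried in registration order, so registering the cluster
nodes first and cloud VMs last gives a simple "local first, then
cloud" policy.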

I think we can build this up a step at a time. 

Thoughts?

Pj.


