guix-devel

Presentation BlueHats (french workshop)


From: zimoun
Subject: Presentation BlueHats (french workshop)
Date: Fri, 20 Dec 2019 22:11:28 +0100

Dear Guix,

Attached are 2 patches for the 'maintenance' repo:
 1. Fixing broken links in talks/
 2. My slides

First, below is a copy/paste of the file talks/bluehats-2019/outline.
Feel free to comment.  If I manage to find some time in the next 2
weeks, I will try to expand the use cases into a blog post and/or a
Cookbook recipe.

Second, after the blabla, some information about the missing files
is provided.


If these 2 patches make sense, feel free to push them. :-)

All the best,
simon

   - - -

This talk was in French, in a slot of 5-7 minutes, questions included.  It took
place during a full-day satellite event of the Paris Open Source Summit.  The
initiative was led by Bastien Guerry from https://www.etalab.gouv.fr/.  More
information about the programme is
[[https://forum.etalab.gouv.fr/t/journee-bluehats-lors-du-paris-open-source-summit-le-11-decembre-2019/4614][here]].

The slot was very short and the audience very heterogeneous, especially in its
day-to-day concerns.  As an engineer working in a biology research institute, I
tried to explain what the Reproducible Science challenge is in the modern age
of data.

In short, today a scientific result is an experiment producing data *and* a
numerical processing.  From what I see, the experimental part is more or less
well described, or let us say that people in labs are aware of its importance
because they already have several decades (even more) of collective learning.

However, not enough people take care of the numerical processing, mainly, in
my opinion, because we are living through a scientific paradigm shift.  From
what I see, it is often not understood that more of the scientific value lies in
the numerical processing than in the data itself (or in how it is produced).
That said, I am fully biased, because computing is my job and I understand
nothing about labs.

To guarantee Reproducible Science in the modern age of data, we need to
guarantee several items, especially:
 1. Open Articles
 2. Open Data
 3. Open Source
 4. Controlled computing environment (open, too)
Initiatives have already started; to name some: for 1.,
[[http://rescience.github.io/][ReScience journal]]
or the French-specific [[https://hal.archives-ouvertes.fr/][HAL]]; for 2.,
[[https://zenodo.org/][Zenodo]]; and for 3.,
[[https://www.softwareheritage.org/][Software Heritage]].

However, what about point 4?

To fix ideas, let us consider some examples I encounter every day.
  + Alice uses the tools foo-1.2, bar-3.4 and baz-5.6
  + Carole works with Alice but also works on another project with the tools
    foo-7.8 and bar-9.0
  + Charlie upgrades their system and then nothing works
  + Bob uses the same versions as Alice but gets different results
  + Dan wants to replay the same numerical processing several months (or years)
    later but is not able to reinstall the same versions of the tools because
    the tools have been updated, breaking backward compatibility.
With these scenarios, the idea is to spot concrete issues in the daily life of
researchers.

Each issue is fixable separately:
 * package managers fix the dependency hell
 * virtual environments fix the coexistence of several versions
 * containers fix the exact same version (and the coexistence).
But then the nightmare is to work with all these layers together.  Wait, Guix
already provides all we need.

Guix allows fine-grained control of the toolchain, and this control is the
cornerstone of Reproducible Science.  At least in my opinion.

The two keys are binary transparency, which allows tracking down what could be
wrong, and bootstrapping, which is the root ingredient of the former.

The rest of the talk is how Guix works: first from the end-user side, for each
scenario, and second some plumbing presented at length elsewhere (FOSDEM, etc.).
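
As a rough illustration of the end-user side (a sketch, not taken from the
slides), each scenario above could map to a Guix command along the following
lines.  The names foo, bar and baz are the placeholder tools of the scenarios,
not real packages, and the file name channels.scm and the helper name
guix_scenarios are assumptions made for this example.  By default the commands
are only printed.

```shell
# Rough sketch, not taken from the slides.  foo, bar and baz are the
# placeholder tools of the scenarios, not real package names.
# By default the commands are only printed; set GUIX=guix on a machine
# where Guix is installed (and use real package names) to execute them.
GUIX=${GUIX:-echo guix}

guix_scenarios() {
    # Alice: install the tools at the versions she needs.
    $GUIX install foo@1.2 bar@3.4 baz@5.6

    # Carole: a temporary environment (a new shell) with other versions for
    # another project, without touching what is already installed.
    $GUIX environment --ad-hoc foo@7.8 bar@9.0

    # Charlie: go back to the previous generation after a bad upgrade.
    $GUIX package --roll-back

    # Bob: compare his local build of a tool with the available substitutes.
    $GUIX challenge foo

    # Dan, today: print the channels in use (to be saved as channels.scm).
    $GUIX describe --format=channels

    # Dan, months later: re-create the environment from the saved revision.
    $GUIX time-machine --channels=channels.scm -- \
        environment --ad-hoc foo bar baz
}

guix_scenarios
```

Printing by default keeps the sketch harmless to paste into a terminal; the
mapping of one command per scenario is a simplification, and Charlie's case
assumes the upgrade concerned a user profile rather than the whole system.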

  - - -


1. missing 2 files
2. svg -> pdf: parameters?

1.
talks/fosdem-2017/hpc/images/shrink-wrap.jpg
talks/fosdem-2017/hpc/images/shrink-wrap2.png


2.
in2p3-2019/images/reproducible-builds.pdf ->
../../fosdem-2019/distributions/images/reproducible-builds.pdf
but ./fosdem-2019/distributions/images/reproducible-builds.svg

./in2p3-2019/images/bootstrappable.pdf ->
../../fosdem-2019/distributions/images/bootstrappable.pdf
but ./fosdem-2019/distributions/images/bootstrappable.svg


To find the offending files more easily. :-)

--8<---------------cut here---------------start------------->8---
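# List every dangling symlink together with the target it points to.
find . -type l ! -exec test -e {} \; -exec ls -l {} \;

# For each dangling symlink, look for regular files sharing its base name,
# i.e. candidates for the intended target (for instance a .svg for a .pdf).
find . -type l ! -exec test -e {} \; -print | while IFS= read -r f
do
    for ext in .pdf .png .jpg
    do
        find . -type f -name "*$(basename "$f" "$ext")*"
    done
done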
--8<---------------cut here---------------end--------------->8---
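
To see why `find . -type l ! -exec test -e {} \;` selects only the broken
links, it can be tried on a scratch directory.  The sketch below wraps the test
in a helper named list_dangling (a hypothetical name introduced here, not part
of the patches) and runs it on a throwaway tree with one valid link and one
dangling link; the file names are made up for the example.

```shell
# list_dangling DIR: print every symlink under DIR whose target is missing.
# (Hypothetical helper name, introduced for illustration only.)
list_dangling() {
    # -type l : consider symbolic links only
    # test -e : follows the link, so it fails when the target does not exist
    find "$1" -type l ! -exec test -e {} \; -print
}

# Usage on a throwaway tree: one valid link, one dangling link.
tmp=$(mktemp -d)
touch "$tmp/real.svg"
ln -s real.svg "$tmp/good.svg"      # target exists
ln -s real.pdf "$tmp/broken.pdf"    # target is missing
list_dangling "$tmp"                # only the broken.pdf link is listed
rm -rf "$tmp"
```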

Attachment: 0001-talks-Fix-broken-links-between-files.patch
Description: Text Data

Attachment: 0002-talks-Add-BlueHats-2019-talk.patch
Description: Text Data

