Reproduce Docker images produced by Guix

shell the smoothie

Docker images are smoothies, right? They lack transparency, and it is hard, if not impossible, to know whether the ingredients are strawberry or whale oil, right? Although containers are an efficient way to ship things, the core question is how these things are produced.

The aim of this post is to demonstrate that the issue is not Docker images by themselves. Instead, the concrete question when speaking about reproducibility is: where do the binaries come from, and which tool supplies them?

This scenario was initially written as a comment when reviewing patch#45919.

Alice generates

Alice is working on a standard scientific stack using Python. Therefore, she stores alongside her project the files manifest.scm, containing the package set, and channels.scm, containing the state of Guix (in other words, its version). Owning these two files allows anyone to replay the exact same computational environment using guix time-machine.

Concretely, manifest.scm reads,


and guix describe -f channels returns,

(list (channel
        (name 'guix)
        (url "")
              "BBB0 2DDF 2CEA F6A8 0D1D  E643 A2A0 6DF2 A33A 54FA")))))

So far, so good. Because Alice needs to run this stack on some infrastructure that runs Docker rather than Guix, she simply packs her scientific stack with something along these lines,

$ guix pack -f docker --save-provenance -m manifest.scm

The next step depends on the situation. One solution is to load the generated tarball locally using Docker tools, something along these lines,

$ docker load < /gnu/store/6rga6pz60di21mn37y5v3lvrwxfvzcz9-python-python-numpy-docker-pack.tar.gz
Loaded image: python-python-numpy:latest
$ docker images
REPOSITORY                                TAG          IMAGE ID       CREATED         SIZE
python-python-numpy                       latest       ea2d5e62b2d2   51 years ago    431MB

then docker push to a convenient registry. The second solution is to transfer the previous tarball, like any other data, to the other infrastructure and run the previous Docker commands over there.

For the sake of the demonstration, on the other machine, it just works:

$ docker run -ti python-python-numpy:latest python3
Python 3.8.2 (default, Jan  1 1970, 00:00:01)
[GCC 7.5.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> A = np.array([[1,0,1],[0,1,0],[0,0,1]])
>>> _, s, _ = np.linalg.svd(A); s; abs(s[0] - 1./s[2])
array([1.61803399, 1.        , 0.61803399])
>>> quit()


On a side note, the Docker image is produced directly by Guix. That is, Guix manages everything, from the binary packages and all their requirements to the Docker image itself; no Dockerfile is involved. This Docker image is just one container format among many others: for instance, guix pack -f squashfs --save-provenance -m manifest.scm generates a Singularity image (another container format) with the exact same binaries inside.

Bob redoes later and elsewhere

Bob works with Alice’s Docker image. He needs to run these exact same versions on another infrastructure using plain relocatable tarballs, for example. Or he needs to scrutinize how all the binaries in this stack are produced, because maybe he found a bug and wants to know whether all the results obtained with this Docker image are correct, or maybe he wants to study a specific aspect to better understand a specific result. Well, Bob is doing Science and thus Bob needs transparency.

The files manifest.scm and channels.scm sadly disappeared a long time ago, probably at the end of Alice’s postdoc. If the Docker image had been produced with a Dockerfile, then game over!

Fortunately, Bob remembers that this Docker image was produced with Guix (pack --save-provenance). Let us get the recipe of the smoothie.

Here are the tricks! First, let us start the container, which eases the export to a plain tarball. Second, let us extract the embedded Guix profile.

$ docker run -d python-python-numpy:latest python3
$ docker export -o /tmp/re-pack.tar $(docker ps -a --format "{{.ID}}"| head -n1)
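Picking the container with docker ps -a | head -n1 works when nothing else has been started in between. A slightly more careful variant, sketched below, keeps the ID printed by docker create and removes the temporary container afterwards. The helper name export_image is made up for this post; only docker create, docker export and docker rm are real Docker commands, and python3 is assumed to be a valid command inside the image, as above.

```shell
#!/bin/sh
# Sketch: export the filesystem of an image as a plain tarball.
# export_image is a hypothetical helper, not part of Docker or Guix.
export_image() {
    image=$1     # e.g. python-python-numpy:latest
    output=$2    # e.g. /tmp/re-pack.tar
    # 'docker create' prints the ID of the new, never started, container.
    id=$(docker create "$image" python3) || return 1
    # Dump the container filesystem into the tarball.
    docker export -o "$output" "$id"
    status=$?
    # Remove the temporary container whatever the export result.
    docker rm "$id" >/dev/null
    return $status
}

# Usage (needs Docker and the loaded image):
#   export_image python-python-numpy:latest /tmp/re-pack.tar
```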

$ tar -xf /tmp/re-pack.tar $(tar -tf /tmp/re-pack.tar | grep 'profile/manifest')
$ tree gnu
└── store
    └── ia1sxr3qf3w9dj7y48rwvwyx289vpfgi-profile
        └── manifest

2 directories, 1 file
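The extraction step can be wrapped in a small function that fails loudly when the tarball contains no profile, which is what happens with an image not built with --save-provenance or not built by Guix at all. This is only a sketch: the name extract_profile_manifest is hypothetical, and it assumes the tarball lists the file as something ending in -profile/manifest, as in the tree output above.

```shell
#!/bin/sh
# Sketch: pull only the profile manifest out of an exported tarball.
# extract_profile_manifest is a hypothetical helper name.
extract_profile_manifest() {
    tarball=$1   # e.g. /tmp/re-pack.tar
    dest=$2      # directory where gnu/store/... is recreated
    # Find the archive member that looks like <hash>-profile/manifest.
    member=$(tar -tf "$tarball" | grep -- '-profile/manifest$' | head -n1)
    if [ -z "$member" ]; then
        echo "no profile manifest found in $tarball" >&2
        return 1
    fi
    mkdir -p "$dest"
    # Extract that single file, nothing else.
    tar -xf "$tarball" -C "$dest" "$member"
    # Print the profile directory, ready for 'guix package -p'.
    echo "$dest/${member%/manifest}"
}

# Usage:
#   profile=$(extract_profile_manifest /tmp/re-pack.tar .)
```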

Wow! Is it really a regular profile? Yes, it is!

$ guix package -p gnu/store/ia1sxr3qf3w9dj7y48rwvwyx289vpfgi-profile --export-channels
;; This channel file can be passed to 'guix pull -C' or to
;; 'guix time-machine -C' to obtain the Guix revision that was
;; used to populate this profile.

       (name 'guix)
       (url "")
             "BBB0 2DDF 2CEA F6A8 0D1D  E643 A2A0 6DF2 A33A 54FA"))))

$ guix package -p gnu/store/ia1sxr3qf3w9dj7y48rwvwyx289vpfgi-profile --export-manifest
;; This "manifest" file can be passed to 'guix package -m' to reproduce
;; the content of your profile.  This is "symbolic": it only specifies
;; package names.  To reproduce the exact same profile, you also need to
;; capture the channels being used, as returned by "guix describe".
;; See the "Replicating Guix" section in the manual.

(specifications->manifest
  (list "python" "python-numpy"))
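To keep these two outputs around, they can simply be redirected to files. A minimal sketch, where save_recipe is a made-up helper name and the file names new-channels.scm and new-manifest.scm are just a convention; the guix package options are the ones used above.

```shell
#!/bin/sh
# Sketch: store the recipe recovered from a profile into two files.
# save_recipe is a hypothetical helper name.
save_recipe() {
    profile=$1       # e.g. gnu/store/<hash>-profile
    outdir=${2:-.}   # where to write the two files
    # The state of Guix, then the package set.
    guix package -p "$profile" --export-channels > "$outdir/new-channels.scm" &&
    guix package -p "$profile" --export-manifest > "$outdir/new-manifest.scm"
}

# Usage (needs Guix):
#   save_recipe gnu/store/ia1sxr3qf3w9dj7y48rwvwyx289vpfgi-profile
```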

Awesome, isn’t it? These last two outputs are equivalent to Alice’s manifest.scm and channels.scm. In other words, once the channels and the manifest are saved as new-channels.scm and new-manifest.scm, running this whenever and wherever [1],

$ guix time-machine -C new-channels.scm \
     -- pack -f docker --save-provenance -m new-manifest.scm

should produce the exact same docker-pack.tar.gz as previously. If not, raise your hand and open a bug report.
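Checking "the exact same" boils down to comparing checksums, assuming the original tarball is still around. A rough sketch, where same_pack is a hypothetical helper name and sha256sum comes from GNU coreutils:

```shell
#!/bin/sh
# Sketch: tell whether two packs are bit-for-bit identical.
# same_pack is a hypothetical helper name.
# Returns 0 if identical, 1 if different, 2 if a file is unreadable.
same_pack() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    # Keep only the digest, drop the file name printed by sha256sum.
    a=$(sha256sum "$1" | cut -d' ' -f1)
    b=$(sha256sum "$2" | cut -d' ' -f1)
    echo "$a  $1"
    echo "$b  $2"
    [ "$a" = "$b" ]
}

# Usage, with two tarball paths of your own:
#   same_pack old-docker-pack.tar.gz new-docker-pack.tar.gz && echo reproduced
```

As far as I know, guix pack prints the file name of the generated tarball, so the second argument can be the captured output of the time-machine command above.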

Join the fun, join GNU Guix!



[1] Wherever Guix is installed, at least. :-)

© 2014-2024 Simon Tournier <simon (at) >

(last update: 2024-06-05 Wed 09:47)