Rewriting Guix fixed-output derivations from the past

another attempt to fix the past for a better future

Discussing a proposal for ACM Rep'24 with Ludo, Timothy and Stefano, an idea popped up: could the Guix daemon catch a fixed-output derivation from the past and turn it into a recent one before building it? Here is my very first attempt: collecting material to evaluate the feasibility of such an idea.

If you are not familiar with Guix or with scientific practice, you might ask: what’s the point?

My main interest comes from Reproducible Research.

Concretely, the context is to be able to manipulate any computational environment, even one from the past. That’s what most scientific practitioners do: challenge the current knowledge. What do we challenge if we are not able to manipulate how the current knowledge was built?

Long story short. Now, it’s easier to rebuild the past: Guix provides machinery that paves the way. However, the first roadblock is about the ability to fetch source code. Link rot is not something to take lightly; it is a real concern: ~10% of source code disappears within 5 years, an estimate based on Guix packages. For sure, the Software Heritage archive helps a lot! However, fetching source code is not straightforward: systems evolve, bugs are fixed, strategies are improved, etc. For instance, using guix time-machine, we are able to reach software that was packaged before the introduction of bridges (Disarchive) between Guix and Software Heritage.

The question is thus: how to download now all the source code required by the past?

Note that all might be many. A computational environment inside which we run a few packages might require hundreds or thousands of other packages under the hood. Therefore, it means hundreds or thousands of source archives too, more or less. Obviously, it is not affordable to manage them all by hand.

An example: Redoing our paper in Nature Scientific Data (Oct. 2022). In the paper Toward practical transparent verifiable and long-term reproducible research using Guix, we explain that the command-line,

guix time-machine -C channels.scm \
     -- environment -C -L my-pkgs -m manifest.scm

builds the exact same computational environment as it was in April 2020. This command-line just worked out of the box in October 2022. Now, almost a year and a half later, this very same command-line fails.

Wait, this other blog post explains how to fix the issue, no? Why more? Because the bug with Bioconductor is only one example among other bugs, therefore:

  1. If we already know about such bugs, then we could fix them for everybody instead of expecting everyone to write such a manifest.
  2. Guix procedures falling back to Software Heritage contained a bug – they did not follow redirects in the Software Heritage Vault – which now breaks this mechanism from the past.
  3. If the source code needs Git for fetching content, then it requires the Git binary from the past. In other words, the command-line above will download binary substitutes at best, or will start building the whole stack up to the Git of the past.

Still there? Let’s dive into the details: explain the easy part and expose the technical challenges.

Guix fixed-output derivation

First, what is a derivation? The Guix manual explains:

Low-level build actions and the environment in which they are performed are represented by “derivations”.

You know, those are the .drv files. Using Guix 929ddec, here is an example:

$ guix build hello -d
/gnu/store/qr00sgbh3vwwqswmgjjymg6wkys9r4i2-hello-2.12.1.drv

Ok, and what do I do with that? What does it contain?

Displaying derivation with guix drv-show extension

The adventurous reader will run cat $(guix build hello -d) and will see something hard for a human to parse. Obviously Emacs knows how to make it human-readable. How to display a derivation for non-Emacs users? Let’s write a Guix extension!

Again, the Guix manual explains: A derivation contains the following pieces of information:

  • The outputs of the derivation — derivations produce at least one file or directory in the store, but may produce more.
  • The inputs of the derivation — i.e., its build-time dependencies—which may be other derivations or plain files in the store (patches, build scripts, etc.).
  • The system type targeted by the derivation — e.g., x86_64-linux.
  • The file name of a builder script in the store, along with the arguments to be passed.
  • A list of environment variables to be defined.

In other words, it means something like:

$ guix drv-show $(guix build hello -d)
name: /gnu/store/qr00sgbh3vwwqswmgjjymg6wkys9r4i2-hello-2.12.1.drv
outputs:
+ /gnu/store/6fbh8phmp3izay6c0dpggpxhcjn4xlm5-hello-2.12.1   [out]
inputs:
+ /gnu/store/3ds56xg6njpw6hnp2w4xpx4psw5mka5q-glibc-2.35.drv                [out]
+ /gnu/store/3zh2qpi897s2x229s93iakji86b08a20-hello-2.12.1.drv              [out]
+ /gnu/store/5bqhdbbl71r9r936w6w8zzqlk41md3wx-glibc-2.35.drv                [out]
+ /gnu/store/67nh3fzviy3q4s8ar8cg0dzhyzgwrwdd-module-import-compiled.drv    [out]
+ /gnu/store/7fsz44vifdc0ws0amnpwnmig3ra6hb53-gcc-11.3.0.drv                [lib]
+ /gnu/store/fchdaawcrxb35llbl7fj7lcsq5asmk4b-guile-2.0.14.drv              [out]
+ /gnu/store/ky030dkfkfr3l8xgdbv45j6bs87988lx-gcc-11.3.0.drv                [lib]
+ /gnu/store/n9kblf5cx4lphrydjr90sp3zfvcdr1pb-glibc-utf8-locales-2.35.drv   [out]
system: x86_64-linux
builder:
+ /gnu/store/4p1l5bdxxbyyqc3wh0d07jv9rp1pdcy7-guile-2.0.14/bin/guile
+ --no-auto-compile
+ -L /gnu/store/a6acf6dds8s9fw7dp5div03rwik0x4x2-module-import
+ -C /gnu/store/yk897hj2p5mdx6hw47s90n8x9pn6s36c-module-import-compiled
+ /gnu/store/fiy8arwqm8vwaqs4h8b361kbmjmd1yra-hello-2.12.1-builder
environment:
+ allowSubstitutes: 0
+ guix properties: ((type . graft) (graft (count . 2)))
+ out: /gnu/store/6fbh8phmp3izay6c0dpggpxhcjn4xlm5-hello-2.12.1
+ preferLocalBuild: 1

For the implementation details of the Guix extension, see here.

I hope all is self-explanatory. The keys are:

  1. A derivation specifies where the result will be (outputs)
  2. A derivation specifies the other derivations required by the builder (inputs)
  3. A derivation specifies the script that performs the build (builder)

As you imagine, point 2. implies a directed acyclic graph. What are the roots of such a graph? The roots are derivations that represent how to fetch source code. In other words, fixed-output derivations.

Fixed-output derivation

Once again, the Guix manual explains:

Operations such as file downloads and version-control checkouts for which the expected content hash is known in advance are modeled as fixed-output derivations. Unlike regular derivations, the outputs of a fixed-output derivation are independent of its inputs—e.g., a source code download produces the same result regardless of the download method and tools being used.

Back to the hello example, the attentive reader notices that this derivation is about grafts. Compare,

$ guix build hello -d --no-grafts
/gnu/store/3zh2qpi897s2x229s93iakji86b08a20-hello-2.12.1.drv

with the inputs above. See? Let’s take a look at the inputs of this derivation:

$ guix drv-show $(guix build hello -d --no-grafts) | recsel -P inputs

/gnu/store/1lr41kwfsss66h37lchbk7icz74yj0p1-binutils-2.38.drv                 [out]
/gnu/store/23201bjiyxi0q3fkvd84kc62v1rl0pp0-xz-5.2.8.drv                      [out]
/gnu/store/2p9bkv91gr64nm17qq0byavwg5q6w2lg-hello-2.12.1.tar.gz.drv           [out]
/gnu/store/34i09xrz49phnkij2c8k6ps37na6cr74-make-4.3.drv                      [out]
/gnu/store/3ds56xg6njpw6hnp2w4xpx4psw5mka5q-glibc-2.35.drv                    [out,static]
/gnu/store/3nk2iml8bzn9bx1cfkjkc0625mrdx7vf-patch-2.7.6.drv                   [out]
/gnu/store/4220x2mav9gr6m2hvnnz6fyvgdin5hjq-linux-libre-headers-5.15.49.drv   [out]
/gnu/store/4kjp42vhi9j4snrm4w7liqwjf1260cma-gzip-1.12.drv                     [out]
/gnu/store/50a44myd0qvqf9v2rpwqckipsrlmn9ag-diffutils-3.8.drv                 [out]
/gnu/store/7fsz44vifdc0ws0amnpwnmig3ra6hb53-gcc-11.3.0.drv                    [out]
/gnu/store/82f7nr1h69vb3mz34sjdzqlbi30iw56y-grep-3.8.drv                      [out]
/gnu/store/83rd4k9ddh4ka89mk16ffnkchd688gla-module-import-compiled.drv        [out]
/gnu/store/9dpbf0fix5n3gwrh0656k0yzcm0v2lys-guile-3.0.9.drv                   [out]
/gnu/store/fc83kzqb5ry2f6bdlch1xhp3fkpiwc9p-bzip2-1.0.8.drv                   [out]
/gnu/store/gabi1mjsw981bpyj636flilr5pxqi78c-file-5.44.drv                     [out]
/gnu/store/h1ybqwg18by6qmyp4m4dbz9hlcpg2553-tar-1.34.drv                      [out]
/gnu/store/ivfqnmrbn3z9n0lhyrffqmzwqa65q295-coreutils-9.1.drv                 [out]
/gnu/store/n9kblf5cx4lphrydjr90sp3zfvcdr1pb-glibc-utf8-locales-2.35.drv       [out]
/gnu/store/p9igblxhxmkl97ffld4y82z0p7v9ajd2-sed-4.8.drv                       [out]
/gnu/store/pwzwbc49435kq89jb3mfk09j4dn1b4rw-ld-wrapper-0.drv                  [out]
/gnu/store/r1sjrj2z2hn1s1ab7iqydk4vhc655gv8-bash-minimal-5.1.16.drv           [out]
/gnu/store/r4r2cvimqzwdl5n523zfgjq23c7pm3bj-findutils-4.9.0.drv               [out]
/gnu/store/rpxwpdlndvx41ia8a4zqpjssvpzchaj4-gawk-5.2.1.drv                    [out]

These inputs are all the requirements to build the very simple package hello. Do you see the one ending with .tar.gz.drv? That’s the fixed-output derivation. You might also find it with guix build hello -d -S.

$ guix drv-show /gnu/store/2p9bkv91gr64nm17qq0byavwg5q6w2lg-hello-2.12.1.tar.gz.drv
name: /gnu/store/2p9bkv91gr64nm17qq0byavwg5q6w2lg-hello-2.12.1.tar.gz.drv
outputs:
+ /gnu/store/3dq55rw99wdc4g4wblz7xikc8a2jy7a3-hello-2.12.1.tar.gz   [out]
hash: 086vqwk2wl8zfs47sq2xpjc9k066ilmb8z6dn0q6ymwjzlm196cd
inputs:
system: x86_64-linux
builder:
+ builtin:download
environment:
+ content-addressed-mirrors: /gnu/store/wg1yp2vx8gb7qmcgyibqnwblahpp4bjg-content-addressed-mirrors
+ disarchive-mirrors: /gnu/store/0mxnx8l4fgigvd7gakwdk6hc6im4wnai-disarchive-mirrors
+ impureEnvVars: http_proxy https_proxy LC_ALL LC_MESSAGES LANG COLUMNS
+ mirrors: /gnu/store/0sqi3rs694q1v36v5vdxqhjbrlip5vvn-mirrors
+ out: /gnu/store/3dq55rw99wdc4g4wblz7xikc8a2jy7a3-hello-2.12.1.tar.gz
+ preferLocalBuild: 1
+ url: "mirror://gnu/hello/hello-2.12.1.tar.gz"

Here, what matters – a lot! – is the hash field: the content of the output is known beforehand. In other words, guix hash $(guix build hello -S) returns [1] this checksum.

The hash 6fbh8phmp3izay6c0dpggpxhcjn4xlm5 as in the path /gnu/store/6fbh8phmp3izay6c0dpggpxhcjn4xlm5-hello-2.12.1 depends on this fixed initial hash. If one modifies this fixed hash, then all the derivations using it as input will get another store path, and so on recursively.

The output path of a fixed-output derivation depends only on the hash known beforehand. It seems a well-chosen name after all: the output path is fixed by the easily verifiable hash.

To make it clear about fixed-output derivations:

  1. Two different derivations might return the same output.
  2. The output only depends on the hash; and not on inputs, builder or environment.
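A fixed-output derivation can therefore be recognized by the hash it declares for its output. Below is a minimal shell sketch that looks at the raw .drv file. It assumes the on-disk format where each output is written as a tuple ("out","<path>","<hash-algo>","<hash>"), the last two fields being empty for a regular derivation; the function name is_fixed_output is mine, not something provided by Guix.

```shell
# Sketch only: is_fixed_output is a hypothetical helper, not a Guix command.
# Assumption: the .drv file starts with Derive([("out","<path>","<algo>","<hash>")]
# where <algo> and <hash> are empty strings for a regular derivation.
is_fixed_output() {
    # Succeed when the single output tuple of the .drv file "$1" carries
    # both a hash algorithm (optionally prefixed by "r:") and a hex hash.
    grep -Eq '^Derive\(\[\("[^"]+","[^"]+","(r:)?[a-z0-9]+","[0-9a-f]+"\)\]' "$1"
}
```

For instance, is_fixed_output "$(guix build hello -S -d)" && echo yes should answer yes as long as guix build -S -d returns the fixed-output derivation itself, which is not always the case when the origin has patches or a snippet.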

Builtin builder vs other

Well, I hope you still follow. Maybe the question you have is about the builder. We said that the builder is a script using inputs, right? In the example above there are no inputs and just builtin:download. What does it mean?

It means that all is managed daemon side. Here we lose transparency and verifiable computations because two people running this exact same derivation might or might not run the exact same binaries; it depends on their current Guix revision.

The reason for this builtin builder for downloading content is to cut a dependency cycle. There is a chicken-and-egg problem; at some point, you need the binary of wget to fetch the source code of wget – or anything else playing the same role as wget. The builtin mechanism was introduced [2] a long time ago, see issue#22774 in 2016.

However, we might discuss the impact of this loss compared to the advantages. About the impact, it is left as an exercise for the reader; hint: the content is known beforehand and the checksum is verified. About the advantages, evaluating a past derivation with a recent Guix revision runs all the recent tools, modulo the corner cases discussed below.

When the builder is not builtin, it is a script. For example, consider the package texlive-ae.

$ guix drv-show $(guix build -S -d texlive-ae) | recsel -p builder
builder:
+ /gnu/store/g49b4v7dff8xwfi7wpi8pps1ixhld3n7-guile-3.0.9/bin/guile
+ --no-auto-compile
+ -L /gnu/store/cx4gz6hhl9wjq1gl2bcgqxn48f8r2nkc-module-import
+ -L /gnu/store/p5f006jcr83jc7m731vhvjdkr2j0hnp3-guile-json-4.7.3/share/guile/site/3.0
+ -L /gnu/store/cca3nc8jddg90nmi9izasrjd8m2bxxny-guile-gnutls-3.7.14/share/guile/site/3.0
+ -L /gnu/store/pfd1723590y156i2amvxc10mj4723byj-guile-lzlib-0.0.2/share/guile/site/3.0
+ -C /gnu/store/q7w8mf4gih5m3isx67iy6dwzzpivd001-module-import-compiled
+ -C /gnu/store/p5f006jcr83jc7m731vhvjdkr2j0hnp3-guile-json-4.7.3/lib/guile/3.0/site-ccache
+ -C /gnu/store/cca3nc8jddg90nmi9izasrjd8m2bxxny-guile-gnutls-3.7.14/lib/guile/3.0/site-ccache
+ -C /gnu/store/pfd1723590y156i2amvxc10mj4723byj-guile-lzlib-0.0.2/lib/guile/3.0/site-ccache
+ /gnu/store/h14lh50nn7fwfqkxr0xnl8zap6ndi6vi-svn-multi-download

When building the derivation, the Guile script named svn-multi-download is run. If you scrutinize this script, the important part is the call to the procedure svn-fetch from the module (guix build svn). Therefore, the code that is executed is from the directory module-import (or module-import-compiled). We are able to audit all. Cool, isn’t it?

The double-edged sword: yesterday, today or tomorrow, when you build this derivation, you always run this script using these inputs. In other words, 2 years from now, you will need to build this exact Subversion and then run this exact script. For instance, guix size subversion tells that it needs 34 dependencies. Therefore, you also need to build them all. The script might contain bugs. Over the past 5 years, this script was modified 9 times.

So far, so good? I hope the question – how to download now all the source code required by the past? – is now becoming clearer.

Rewriting fixed-output derivation

That’s ambitious for the general case, maybe because there is no single general case but many specific ones. I count 7 methods: url-fetch, git-fetch, svn-fetch, svn-multi-fetch, hg-fetch, bzr-fetch, cvs-fetch; some of them evolved over time. Let’s be incremental.

Dealing with url-fetch method

The simplest case; a good start! As we said above, the builder has been builtin:download, without any inputs, since 2016. Therefore, it covers all source code reachable by guix time-machine. All we have to do is extract the relevant information and pass it to the url-fetch procedure.

Guix is more than just a package manager or an operating system, it is a Scheme library too! In other words, it is easy to re-compose procedures. For instance, launch guix repl,

scheme@(guix-user)> ,use(guix derivations)
scheme@(guix-user)> (read-derivation-from-file "/gnu/store/2p9bkv91gr64nm17qq0byavwg5q6w2lg-hello-2.12.1.tar.gz.drv")
$1 = #<derivation /gnu/store/2p9bkv91gr64nm17qq0byavwg5q6w2lg-hello-2.12.1.tar.gz.drv => /gnu/store/3dq55rw99wdc4g4wblz7xikc8a2jy7a3-hello-2.12.1.tar.gz 7a140f453e10>
scheme@(guix-user)> (derivation-file-name $1)
$2 = "/gnu/store/2p9bkv91gr64nm17qq0byavwg5q6w2lg-hello-2.12.1.tar.gz.drv"
scheme@(guix-user)> (store-path-package-name $2)
$3 = "hello-2.12.1.tar.gz.drv"
scheme@(guix-user)> (basename $3 ".drv")
$4 = "hello-2.12.1.tar.gz"

and we have just extracted the name that we can now pass to the procedure url-fetch from the module (guix download). Similarly, it is straightforward to extract hash, hash-algo and system. The last piece of information is url. The derivation holds it among its environment variables; see this simple procedure to extract the associated value.
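For readers who prefer the shell over the REPL, the url value can also be pulled straight out of the raw .drv file. This is only a rough sketch under assumptions: the helper name drv_url is mine, it assumes the variable is stored as a tuple ("url","\"<the-url>\"") with the quotes escaped, and it handles a single URL only (not a list of URLs, and not the older "git url" variable).

```shell
# Sketch only: drv_url is a hypothetical helper, not part of Guix.
# Assumption: the environment holds ("url","\"<the-url>\""), i.e. the URL
# is a Scheme string whose quotes are backslash-escaped in the .drv file.
drv_url() {
    # Print the single URL of the .drv file "$1"; print nothing when the
    # variable is absent or does not hold exactly one quoted string.
    sed -n 's/.*("url","\\"\([^"\\]*\)\\"").*/\1/p' "$1"
}
```

Calling drv_url on the hello derivation above is expected to print the mirror:// URL that guix drv-show displays in the environment field.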

Do not miss that url-fetch returns a monadic value, which needs the run-with-store procedure.

Consider the fixed-output derivation from the package hello as it was in April 2020. We first show the environment variables and then transform it (with the extension guix drv-drv).

$ guix time-machine -q --commit=1971d11db9ed9683d5036cd4c62deb564842e1f6 \
       -- build -S -d hello                                              \
       | xargs guix drv-show | recsel -p environment
environment:
+ content-addressed-mirrors: /gnu/store/vwyxp1dq4lb97n6b20w5cqxasy2dai79-content-addressed-mirrors
+ impureEnvVars: http_proxy https_proxy LC_ALL LC_MESSAGES LANG COLUMNS
+ mirrors: /gnu/store/h1nnwnrbwr7vcllyn1k0p55lkdz0clhh-mirrors
+ out: /gnu/store/hbdalsf5lpf01x4dcknwx6xbn6n5km6k-hello-2.10.tar.gz
+ preferLocalBuild: 1
+ url: "mirror://gnu/hello/hello-2.10.tar.gz"

$ guix time-machine -q --commit=1971d11db9ed9683d5036cd4c62deb564842e1f6 \
       -- build -S -d hello                                              \
       | xargs guix drv-drv                                              \
       | xargs guix drv-show | recsel -p environment
environment:
+ content-addressed-mirrors: /gnu/store/wg1yp2vx8gb7qmcgyibqnwblahpp4bjg-content-addressed-mirrors
+ disarchive-mirrors: /gnu/store/0mxnx8l4fgigvd7gakwdk6hc6im4wnai-disarchive-mirrors
+ impureEnvVars: http_proxy https_proxy LC_ALL LC_MESSAGES LANG COLUMNS
+ mirrors: /gnu/store/0sqi3rs694q1v36v5vdxqhjbrlip5vvn-mirrors
+ out: /gnu/store/hbdalsf5lpf01x4dcknwx6xbn6n5km6k-hello-2.10.tar.gz
+ preferLocalBuild: 1
+ url: "mirror://gnu/hello/hello-2.10.tar.gz"

We see that the variable content-addressed-mirrors has been replaced. It now contains the up-to-date version. Similarly for mirrors. The interesting part is about disarchive-mirrors: it did not exist at the time of revision 1971d1 (April 2020), or was in its infancy.

Obviously, building one or the other derivation outputs the exact same result because it is a fixed-output derivation. The difference between the two is how to download.
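This claim is cheap to check before building anything: both .drv files must announce the same output path. Here is a small sketch, again reading the raw .drv files and assuming they start with Derive([("out","<path>",...; the names drv_out_path and same_output are mine.

```shell
# Sketch only: drv_out_path and same_output are hypothetical helpers.
# Assumption: a .drv file starts with Derive([("out","<store-path>",...
drv_out_path() {
    # Print the store path of the first output recorded in the .drv file "$1".
    sed -n 's/^Derive(\[("[^"]*","\([^"]*\)".*/\1/p' "$1"
}

same_output() {
    # Succeed when the .drv files "$1" and "$2" promise the same,
    # non-empty, output path.
    _old=$(drv_out_path "$1")
    _new=$(drv_out_path "$2")
    [ -n "$_old" ] && [ "$_old" = "$_new" ]
}
```

With the old and the rewritten hello derivations at hand, same_output old.drv new.drv should succeed, since both show the same out value in the listings above.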

Packages from Bioconductor archive

Remember the issue with Bioconductor packages? The URL was wrong and the download of the source code fails.

$ guix time-machine -q --commit=1971d11db9ed9683d5036cd4c62deb564842e1f6 \
       -- build -S r-flowcore
guile: warning: failed to install locale
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 100.0%
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 100.0%
The following derivation will be built:
   /gnu/store/8a1l5vg9i8nri5ym4ygk6bb6dnmq7zr7-flowCore_1.52.1.tar.gz.drv
building /gnu/store/8a1l5vg9i8nri5ym4ygk6bb6dnmq7zr7-flowCore_1.52.1.tar.gz.drv...

Starting download of /gnu/store/lpn81kycn0h0g3m2hyjg45il2yc4m00f-flowCore_1.52.1.tar.gz
From https://bioconductor.org/packages/release/bioc/src/contrib/flowCore_1.52.1.tar.gz...
download failed "https://bioconductor.org/packages/release/bioc/src/contrib/flowCore_1.52.1.tar.gz" 404 "Not Found"

Starting download of /gnu/store/lpn81kycn0h0g3m2hyjg45il2yc4m00f-flowCore_1.52.1.tar.gz
From https://bioconductor.org/packages/3.10/bioc/src/contrib/Archive/flowCore_1.52.1.tar.gz...
following redirection to `https://mghp.osn.xsede.org/bir190004-bucket01/archive.bioconductor.org/packages/3.10/bioc/src/contrib/Archive/flowCore_1.52.1.tar.gz'...
download failed "https://mghp.osn.xsede.org/bir190004-bucket01/archive.bioconductor.org/packages/3.10/bioc/src/contrib/Archive/flowCore_1.52.1.tar.gz" 404 "Not Found"

[...]

builder for `/gnu/store/8a1l5vg9i8nri5ym4ygk6bb6dnmq7zr7-flowCore_1.52.1.tar.gz.drv' failed to produce output path `/gnu/store/lpn81kycn0h0g3m2hyjg45il2yc4m00f-flowCore_1.52.1.tar.gz'
build of /gnu/store/8a1l5vg9i8nri5ym4ygk6bb6dnmq7zr7-flowCore_1.52.1.tar.gz.drv failed
View build log at '/var/log/guix/drvs/8a/1l5vg9i8nri5ym4ygk6bb6dnmq7zr7-flowCore_1.52.1.tar.gz.drv.gz'.
guix build: error: build of `/gnu/store/8a1l5vg9i8nri5ym4ygk6bb6dnmq7zr7-flowCore_1.52.1.tar.gz.drv' failed

And from this blog post, remember the manifest file fixing that issue. The idea with the approach here is to avoid such a manifest and to manipulate the fixed-output derivation instead. Somehow, the basic idea is expressed by this pipe:

$ guix time-machine -q --commit=1971d11db9ed9683d5036cd4c62deb564842e1f6 \
       -- build -S r-flowcore -d                                         \
       | xargs guix drv-drv                                              \
       | xargs guix build
guile: warning: failed to install locale
The following derivation will be built:
  /gnu/store/wvfcjl7hm9c4dfmadm3rrxvv3v9qsjfk-flowCore_1.52.1.tar.gz.drv
building /gnu/store/wvfcjl7hm9c4dfmadm3rrxvv3v9qsjfk-flowCore_1.52.1.tar.gz.drv...

Starting download of /gnu/store/lpn81kycn0h0g3m2hyjg45il2yc4m00f-flowCore_1.52.1.tar.gz
From https://bioconductor.org/packages/release/bioc/src/contrib/flowCore_1.52.1.tar.gz...
download failed "https://bioconductor.org/packages/release/bioc/src/contrib/flowCore_1.52.1.tar.gz" 404 "Not Found"

Starting download of /gnu/store/lpn81kycn0h0g3m2hyjg45il2yc4m00f-flowCore_1.52.1.tar.gz
From https://bioconductor.org/packages/3.10/bioc/src/contrib/flowCore_1.52.1.tar.gz...
following redirection to `https://mghp.osn.xsede.org/bir190004-bucket01/archive.bioconductor.org/packages/3.10/bioc/src/contrib/flowCore_1.52.1.tar.gz'...
downloading from https://bioconductor.org/packages/3.10/bioc/src/contrib/flowCore_1.52.1.tar.gz ...
 flowCore_1.52.1.tar.gz  6.5MiB                                            168KiB/s 00:40 ▕██████████████████▏ 100.0%
successfully built /gnu/store/wvfcjl7hm9c4dfmadm3rrxvv3v9qsjfk-flowCore_1.52.1.tar.gz.drv
/gnu/store/lpn81kycn0h0g3m2hyjg45il2yc4m00f-flowCore_1.52.1.tar.gz

In other words, the idea reads: emit the old derivation, rewrite it to look recent, then build that.

Another illustration with git-fetch method

Consider the source code of a package coming from a Git repository. For instance, it reads:

$ guix time-machine -q --commit=1971d11db9ed9683d5036cd4c62deb564842e1f6 \
        -- build -S -d txt2man                                           \
        | xargs guix drv-show

name: /gnu/store/28if75awpap8iwigggbg5f7w3kgz5kik-txt2man-1.6.0-checkout.drv
outputs:
+ /gnu/store/ba80al8r1yn09d0vy29h534igdlszdh7-txt2man-1.6.0-checkout   [out]
hash: 1razjpvlcp85hqli77mwr9nmn5jnv3lm1fxbbqjpx1brv3h1lvm5
inputs:
+ /gnu/store/l4ald6cgz9y8wbjsgxcf70wy5xwsp1n7-module-import.drv            [out]
+ /gnu/store/ps1km585clfi1sdmy52nasyxxzq9yn01-tar-1.32.drv                 [out]
+ /gnu/store/qm3l79ic89qpjjd8avqxd81425v4wvv5-gnutls-3.6.A.drv             [debug,doc,out]
+ /gnu/store/s5c0zxnj6qib3cipdm9mr5jyaiwblqrr-gzip-1.10.drv                [out]
+ /gnu/store/vspjsch7pp91qibjsiw1ahvw0iqangqy-module-import-compiled.drv   [out]
+ /gnu/store/xj0z5qz9bvml11hgy62qn9hp3g9x614g-git-minimal-2.26.0.drv       [out]
+ /gnu/store/y88s25rjwdic7x2ijfqn2mjszias0kvs-guile-2.2.6.drv              [out]
+ /gnu/store/z0syzwg3nq0zgxbr1b3qdq859s2yyz03-guile-json-3.2.0.drv         [out]
system: x86_64-linux
builder:
+ /gnu/store/sc7z07gim1iq5zvfz1amdwf2irxrzifg-guile-2.2.6/bin/guile
+ --no-auto-compile
+ -L /gnu/store/alr7zlcn1bmz3rjr8s8144n0r0h5g8mj-module-import
+ -L /gnu/store/ydwj3442a5zhfr6jb6yyghndbzcibgzl-guile-json-3.2.0/share/guile/site/2.2
+ -L /gnu/store/zr6i9jnfv2sw00r59kdpk2jgkj98k3rp-gnutls-3.6.A/share/guile/site/2.2
+ -C /gnu/store/i0lylv486bwk3pm647hr2dwz0sk3hl6l-module-import-compiled
+ -C /gnu/store/ydwj3442a5zhfr6jb6yyghndbzcibgzl-guile-json-3.2.0/lib/guile/2.2/site-ccache
+ -C /gnu/store/zr6i9jnfv2sw00r59kdpk2jgkj98k3rp-gnutls-3.6.A/lib/guile/2.2/site-ccache
+ /gnu/store/xni636wm7dwb7gmjx03b8k2acd460ncg-git-download
environment:
+ git commit: txt2man-1.6.0
+ git recursive?: #f
+ git url: https://github.com/mvertes/txt2man
+ impureEnvVars: http_proxy https_proxy LC_ALL LC_MESSAGES LANG COLUMNS
+ out: /gnu/store/ba80al8r1yn09d0vy29h534igdlszdh7-txt2man-1.6.0-checkout
+ preferLocalBuild: 1

Ah, it means that git-minimal at version 2.26 is required and that the builder script named git-download is 4 years old (April 2020). As previously, it is possible to extract all the relevant information and then build another fixed-output derivation that relies only on recent tools. Guess what? The extension guix drv-drv also does that.

$ guix time-machine -q --commit=1971d11db9ed9683d5036cd4c62deb564842e1f6 \
       -- build -S -d txt2man                                            \
       | xargs guix drv-drv                                              \
       | xargs guix drv-show

name: /gnu/store/vk367ywg4d5n3m1qy1qi2vq7mpg00cjn-txt2man-1.6.0-checkout.drv
outputs:
+ /gnu/store/ba80al8r1yn09d0vy29h534igdlszdh7-txt2man-1.6.0-checkout   [out]
hash: 1razjpvlcp85hqli77mwr9nmn5jnv3lm1fxbbqjpx1brv3h1lvm5
inputs:
system: x86_64-linux
builder:
+ builtin:git-download
environment:
+ commit: txt2man-1.6.0
+ impureEnvVars: http_proxy https_proxy LC_ALL LC_MESSAGES LANG COLUMNS
+ out: /gnu/store/ba80al8r1yn09d0vy29h534igdlszdh7-txt2man-1.6.0-checkout
+ preferLocalBuild: 1
+ recursive?: #f
+ url: "https://github.com/mvertes/txt2man"

Oh, awesome! The builder script is replaced by the builtin:git-download introduced [3] some months ago. Being builtin, i.e., daemon side, removes transparency, but this is mitigated by checksum verification and, more importantly, it removes annoyances when traveling back in time.

The easy part ends here but the cool part continues.
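The builders met so far can be told apart by name alone, which is handy for deciding how a given derivation should be rewritten. The sketch below is a hypothetical helper of mine (fetch_method); it only knows the four names shown in this post and deliberately answers unknown for everything else, since I have not checked how the other methods name their scripts.

```shell
# Sketch only: fetch_method is a hypothetical helper, not part of Guix.
# "$1" is the last item of the builder field: either a builtin name or
# the store path of the builder script.
fetch_method() {
    case "$1" in
        builtin:download)      echo "url-fetch builtin" ;;
        builtin:git-download)  echo "git-fetch builtin" ;;
        *-git-download)        echo "git-fetch script" ;;
        *-svn-multi-download)  echo "svn-multi-fetch script" ;;
        *)                     echo "unknown" ;;
    esac
}
```

A script answer means that the derivation drags in tools from the past (Git, Subversion, etc.) and is a candidate for rewriting; a builtin answer means the daemon does the job.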

TODO Work in progress

Wow, this post is already very long. Let me summarize the remaining work.

  1. [ ] Deal with Subversion.

    The easy part is already done. What is the complicated part? For some derivations, the information required by Subversion, such as the url or the revision, does not appear directly in the derivation itself as environment variables; it is only inside the builder script. Therefore, it means automatically reading the Scheme file and extracting it.

    Example texlive-ae at Guix revision 1971d11d.

  2. [ ] Deal with Mercurial.

    Similar to 1. and, in addition, the example raises a consideration about grafts: the old fixed-output derivation must be generated without grafts.

    Example seek at Guix revision 1971d11d.

  3. [ ] Deal with the rest (CVS, Bazaar)

    I have not looked into that yet.

  4. [ ] Glue all!

    It would be nice to have a Guix extension, say

    guix revive -C channels.scm -m manifest.scm
    

    which automatically builds inferiors, generates the recent form of all the fixed-output derivations required to build the computational environment, and downloads them. Everything else then probably depends on the availability of substitutes.
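Until such an extension exists, the last step can be approximated by looping the pipe shown earlier over a list of old derivations. This is a rough sketch, not the future guix revive: the function name revive_sources is mine, it relies on the guix drv-drv extension printing the path of the rewritten derivation (as in the pipes above), and it leaves open how the list of old fixed-output derivations is collected.

```shell
# Sketch only: revive_sources is a hypothetical helper.
# Assumption: "guix drv-drv OLD" prints the rewritten derivation file name.
revive_sources() {
    # Read old fixed-output derivation file names on stdin, one per line;
    # rewrite each one, then build (i.e., download) the recent form.
    while IFS= read -r old_drv; do
        [ -n "$old_drv" ] || continue
        if ! new_drv=$(guix drv-drv "$old_drv"); then
            echo "rewrite failed: $old_drv" >&2
            continue
        fi
        # Keep going on failure so that one dead link does not hide the rest.
        guix build "$new_drv" </dev/null \
            || echo "download failed: $new_drv" >&2
    done
}
```

For one package, something like guix time-machine -q --commit=... -- build -S -d r-flowcore | revive_sources does the same as the pipe used for Bioconductor above.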

What a plan, isn't it? Comments are very welcome.

Join the fun, join Guix for reproducible research!

Footnotes:

1

For various reasons, guix build -S -d might not return the fixed-output derivation but a regular derivation – e.g., when the origin uses patches or a snippet. In those cases, the fixed-output derivation is not accessible from the Guix command-line.

2

Could we solve the problem differently and keep more transparency and verifiable computations? Yes, I think so: by using this builtin:download only for bootstrapping, say, wget or whatever else, and then relying for all the rest on builder scripts using this bootstrapped wget or anything else.

3

I am not convinced it is an improvement. Similarly as builtin:download, the cycle could be resolved differently. Anyway.


© 2014-2024 Simon Tournier <simon (at) tournier.info >

(last update: 2024-10-01 Tue 12:27)