Redoing a paper from ReScience C, back from 2020

Paper from Ten Years Challenge: volume 6, issue 1, number 6


Note: To our knowledge, rebuilding everything from source for transparency – starting from the most minimal binary footprint and using almost exclusively the code archived in Software Heritage – is impossible, except with Guix.



We think that Guix is a suitable framework for running scientific computations. The aim of this post is to spot the roadblocks between Guix and such robust scientific computations. Here, robust means that two independent observers are able to verify the same result. In particular, we have underlined in this Café Guix the four conditions that Guix needs at hand to allow a reproducible deployment of computational environments.

The core question is: how long is the time window during which all these 4 conditions hold? To my knowledge, the Guix project is unique in experimenting for real with this window size since v1.0 in 2019. This post is thus a concrete example showing what is missing – and what is not! – with three years between the two observations (2020-2023).

Use case scenario

Let us consider a gold standard: an old paper (2006) replicated in 2020, [Re] Storage Tradeoffs in a Collaborative Backup Service for Mobile Devices by Ludovic Courtès, published as part of the Ten Years Reproducibility Challenge organized by the online journal ReScience C. For details, please have a look at the PDF and the Git repository of this article.

It is a gold standard because this paper uses Guix end-to-end. The command line,

$ guix time-machine -C channels.scm -- build -f guix.scm

should compile all the requirements, run all the experiments and finally generate the report. We can distinguish two parts:

  • guix time-machine which is purely Guix-specific,
  • -- build which uses Guix for compiling, running and generating the report.

Note: Please consider that running the command line above today (2023) generates the same computational environment as it was in 2020. By default, and given the current state of all the various servers, it just works out of the box. And that’s awesome!

Extreme worst-case setup

Because robustness is the key when speaking about reproducibility, we stretch our attempt by assuming that all the various servers have disappeared and are unreachable.

In other words, we only assume that Software Heritage – the universal software archive – is available. Since Software Heritage removes some metadata – e.g., compressor information – to archive only the content, the tarball that Software Heritage returns does not necessarily match the checksum known at package time, because of that missing metadata. Disarchive builds a database containing this metadata, and thus a map from this checksum to the content stored in Software Heritage. Using this Disarchive database and the content archived in Software Heritage, Guix is able to rebuild the exact same source (tarball). Concretely, we disable and stop systemd-resolved.service and manually allow only these two servers,

128.93.166.15 archive.softwareheritage.org
141.80.181.40 disarchive.guix.gnu.org
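
A minimal sketch of this setup, assuming a systemd-based machine where /etc/hosts is still consulted once systemd-resolved is stopped. The helper names allowed_hosts and isolate_network are hypothetical, and the addresses are the ones quoted above, which may have changed since.

```shell
# Print the /etc/hosts entries for the only two servers we allow.
allowed_hosts() {
    printf '%s\n' \
        '128.93.166.15 archive.softwareheritage.org' \
        '141.80.181.40 disarchive.guix.gnu.org'
}

# Hypothetical helper: cut name resolution, then pin the allowed hosts.
# Only defined here, not called: it needs root and changes the machine state.
isolate_network() {
    sudo systemctl disable --now systemd-resolved.service
    allowed_hosts | sudo tee -a /etc/hosts >/dev/null
}

# Show what would be appended to /etc/hosts.
allowed_hosts
```

Remember to remove the entries and re-enable systemd-resolved.service once the experiment is over.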

Therefore, we are going to check whether the combination Software Heritage + Disarchive is operational for rebuilding from scratch more than 422 sources. Our aim is to identify the holes in order to fix them.
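
To illustrate why the checksum of a compressed tarball is not enough to retrieve its content, here is a small sketch. It is not the way Disarchive itself works: it only shows that the same content compressed with two gzip levels yields two different checksums, although the uncompressed content is identical. The file names are placeholders, and GNU gzip, seq and sha256sum are assumed to be available.

```shell
# Same content, two compression settings: the compressed bytes differ,
# the uncompressed content does not.
workdir=$(mktemp -d)
seq 1 1000 > "$workdir/source.txt"   # stand-in for a source tarball

# -n drops the file name and timestamp, so only the level differs here.
gzip -n -1 -c "$workdir/source.txt" > "$workdir/fast.gz"
gzip -n -9 -c "$workdir/source.txt" > "$workdir/best.gz"

# Read stdin, print only the hexadecimal sha256.
hash_of() { sha256sum | cut -d' ' -f1; }

fast_hash=$(hash_of < "$workdir/fast.gz")
best_hash=$(hash_of < "$workdir/best.gz")
fast_content=$(gzip -dc "$workdir/fast.gz" | hash_of)
best_content=$(gzip -dc "$workdir/best.gz" | hash_of)

echo "compressed:   $fast_hash vs $best_hash"
echo "uncompressed: $fast_content vs $best_content"
rm -rf "$workdir"
```

An archive that stores only the uncompressed content therefore needs extra metadata, here the compression level, to give back the exact bytes whose checksum was recorded at package time.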

The first annoyance is that guix time-machine needs access to the server git.savannah.gnu.org, although the Git repository is already cloned and already contains the required commit. For instance,

$ guix describe
Generation 25   mai 19 2023 13:30:14    (current)
  guix 14c0380
    repository URL: https://git.savannah.gnu.org/git/guix.git
    branch: master
    commit: 14c03807ba4bc81d42cf869f5b827f7da54ff843

$ guix time-machine --commit=14c0380 -- describe
guix time-machine: error: Git error: failed to resolve address for git.savannah.gnu.org: Name or service not known

$ git -C ~/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq \
      show 14c0380 | grep commit
commit 14c03807ba4bc81d42cf869f5b827f7da54ff843

Another annoyance concerns the source of some packages deep in the graph of dependencies: ed, gcc-core@2.95.3, ghostscript, guile@2.2.6, linux-libre, linux-libre-headers-stripped, mes, mes-minimal-stripped, etc., as explained in this thread on the guix-devel mailing list. These packages are part of the bootstrap and their source code is a tarball located on ftp.gnu.org. We already know that the coverage is weak here.

In addition to the two servers allowed above, these two servers are also manually added,

209.51.188.168 git.savannah.gnu.org
209.51.188.20 ftp.gnu.org

That said, let’s go!

Running guix time-machine -C channels.scm

Issue 1 and 2: Options --fallback and --no-substitutes

We start with the simplest: running a previous version of Guix. This version is described by the file channels.scm. This file contains two channels at pinned revisions. For instance, it specifies the commit 40fd909e3ddee2c46a27a4fe92ed49d3e7ffb413 from April 24th, 2020. It means that, using the current Guix revision 14c0380 installed (pulled) on my machine on May 19th, 2023, this guix time-machine command should display the help message as it was on April 24th, 2020.
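
For reference, a channels.scm pinning two channels could look like the sketch below. This is a reconstruction, not a copy of the article's file: the guix commit is the one quoted above, while the guix-past URL and revision are taken from the guix time-machine output shown later in this post. The helper name write_channels is hypothetical.

```shell
# Print a channels.scm pinning both channels at fixed commits.
write_channels() {
    cat <<'EOF'
(list (channel
        (name 'guix)
        (url "https://git.savannah.gnu.org/git/guix.git")
        (commit "40fd909e3ddee2c46a27a4fe92ed49d3e7ffb413"))
      (channel
        (name 'guix-past)
        (url "https://gitlab.inria.fr/guix-hpc/guix-past.git")
        (commit "4c3923dc0114f4669fbd99c5a09a443d3eb5f4d6")))
EOF
}

# Redirect to a file to use it: write_channels > channels.scm
write_channels
```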

$ guix time-machine --commit=40fd909e3ddee2c46a27a4fe92ed49d3e7ffb413 -- help
Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%guix substitute: warning: ci.guix.gnu.org: host not found: Name or service not known
substitute:
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'...   0.0%guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known
substitute:
retrying download of '/gnu/store/…-config.scm' with other substitute URLs...
guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known
guix substitute: error: failed to find alternative substitute for '/gnu/store/…-config.scm'
substitution of /gnu/store/…-config.scm failed
building /gnu/store/…-config.scm.drv...
guix time-machine: error: some substitutes for the outputs of derivation `/gnu/store/…-module-import-compiled.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from source

Ah, an error! This first issue is annoying: the fallback should be transparent. Following the recommendation and passing --fallback manually gives,

$ guix time-machine --commit=40fd909e3ddee2c46a27a4fe92ed49d3e7ffb413 --fallback -- help
[...]
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'...   0.0%
retrying download of '/gnu/store/…-module-import-compiled' with other substitute URLs...
guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known
guix substitute: error: failed to find alternative substitute for '/gnu/store/…-module-import-compiled'
substitution of /gnu/store/…-module-import-compiled failed
guix time-machine: error: corrupt input while restoring archive from #<closed: file 7fa4f6e798c0>

Ah, another error! This second issue is also annoying because the previous hint is inaccurate. Instead of --fallback, let us use --no-substitutes: now it starts… and fails, but for other reasons that we are going to investigate.

Issue 3: Inconsistent message for substitutes and/or fallback

In addition to the Guix channel itself, we now consider all the channels listed in the file channels.scm. The command line guix time-machine -C channels.scm generates the old versions of Guix itself and of the other channels, all corresponding to the exact same state as on April 24th, 2020. The output should be the help message. Instead, it displays:

$ guix time-machine -C channels.scm -- help
Updating channel 'guix-past' from Git repository at 'https://gitlab.inria.fr/guix-hpc/guix-past.git'...
SWH: found revision 4c3923dc0114f4669fbd99c5a09a443d3eb5f4d6 with directory at 'https://archive.softwareheritage.org/api/1/directory/057e6655d7240135862f9dd7da59c75d64db34b8/'
SWH vault: requested bundle cooking, waiting for completion...
swh:1:rev:4c3923dc0114f4669fbd99c5a09a443d3eb5f4d6.git/
[...]
Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
WARNING: (guix build emacs-build-system): imported module (guix build utils) overrides core binding `delete'
Computing Guix derivation for 'x86_64-linux'... |@ substituter-started /gnu/store/l1iakyjw5lacjbnynm6z7b31clyh1llx-ghostscript-9.27-doc substitute
retrying download of '/gnu/store/l1iakyjw5lacjbnynm6z7b31clyh1llx-ghostscript-9.27-doc' with other substitute URLs...
guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known
guix substitute: error: failed to find alternative substitute for '/gnu/store/l1iakyjw5lacjbnynm6z7b31clyh1llx-ghostscript-9.27-doc'
@ substituter-failed /gnu/store/l1iakyjw5lacjbnynm6z7b31clyh1llx-ghostscript-9.27-doc  fetching path `/gnu/store/l1iakyjw5lacjbnynm6z7b31clyh1llx-ghostscript-9.27-doc' (empty status: '')
Backtrace:
          13 (primitive-load "/gnu/store/hbnmq1p5pnj55id6547h2bhvs15z4lg2-compute-guix-derivation")
In ice-9/eval.scm:
    155:9 12 (_ _)
    159:9 11 (_ #(#(#(#(#(#(#(#(#(#(#(#(#(#<directory (guile-user) 7f5082df7c?> ?) ?) ?) ?) ?) ?) ?) ?) ?) ?) ?) ?) ?))
In ./guix/store.scm:
  1975:24 10 (run-with-store #<store-connection 256.99 7f5082e39460> #<procedure 7f5074b81980 at ./guix/self.scm:11?> ?)
   1811:8  9 (_ #<store-connection 256.99 7f5082e39460>)
In ./guix/gexp.scm:
    961:2  8 (_ #<store-connection 256.99 7f5082e39460>)
    821:2  7 (_ #<store-connection 256.99 7f5082e39460>)
In ./guix/store.scm:
  1859:12  6 (_ #<store-connection 256.99 7f5082e39460>)
   1312:5  5 (map/accumulate-builds #<store-connection 256.99 7f5082e39460> #<procedure 7f5074c2a540 at ./guix/stor?> ?)
  1323:15  4 (_ #<store-connection 256.99 7f5082e39460> ("/gnu/store/36fgj9n3c8bmix2pd12kfaszi7bd5y7a-ghostscrip?" ?) ?)
  1323:15  3 (loop #f)
   711:11  2 (process-stderr #<store-connection 256.99 7f5082e39460> _)
In ./guix/serialization.scm:
   101:11  1 (read-int #<input-output: file 10>)
     79:6  0 (get-bytevector-n* #<input-output: file 10> 8)

./guix/serialization.scm:79:6: In procedure get-bytevector-n*:
ERROR:
  1. &nar-error:
      file: #f
      port: #<input-output: file 10>
guix time-machine: error: You found a bug: the program '/gnu/store/hbnmq1p5pnj55id6547h2bhvs15z4lg2-compute-guix-derivation'
failed to compute the derivation for Guix (version: "40fd909e3ddee2c46a27a4fe92ed49d3e7ffb413"; system: "x86_64-linux";
host version: "14c03807ba4bc81d42cf869f5b827f7da54ff843"; pull-version: 1).
Please report it by email to <bug-guix@gnu.org>.

Before the backtrace, it is awesome! The Git repository of the channel guix-past is unreachable but the content is available in Software Heritage. Transparently, Guix clones from Software Heritage. Great job!

However, this third issue is a bit more cryptic than the previous ones. Well, let us try the option --fallback… similar error. So, let us try the option --no-substitutes.

Issue 4: Missing source in Software Heritage and/or Disarchive database

Now it almost passes – needing the server ftp.gnu.org as noted above. It still fails because linux-libre-4.19.56-gnu.tar.xz.drv cannot be built; in other words, the source

("url",
"(\"https://linux-libre.fsfla.org/pub/linux-libre/releases/4.19.56-gnu/linux-libre-4.19.56-gnu.tar.xz\"
  \"ftp://alpha.gnu.org/gnu/guix/mirror/linux-libre-4.19.56-gnu.tar.xz\"
  \"mirror://gnu/linux-libre/4.19.56-gnu/linux-libre-4.19.56-gnu.tar.xz\")")

is unreachable. Let us temporarily enable the network and build this derivation,

$ guix build /gnu/store/qwbmqzyqv8nl39pkmzyp268lcnjrhrvs-linux-libre-4.19.56-gnu.tar.xz.drv
 101,7 MB will be downloaded:
   /gnu/store/ap6nhyxjy61pmnjph4xbj3bdjx7m1zj2-linux-libre-4.19.56-gnu.tar.xz
 substituting /gnu/store/ap6nhyxjy61pmnjph4xbj3bdjx7m1zj2-linux-libre-4.19.56-gnu.tar.xz...
 downloading from https://ci.guix.gnu.org/nar/ap6nhyxjy61pmnjph4xbj3bdjx7m1zj2-linux-libre-4.19.56-gnu.tar.xz ...
  linux-libre-4.19.56-gnu.tar.xz  96.9MiB 2.4MiB/s 00:40 ▕██████████████████▏ 100.0%

 /gnu/store/ap6nhyxjy61pmnjph4xbj3bdjx7m1zj2-linux-libre-4.19.56-gnu.tar.xz

and run the guix time-machine --no-substitutes command line again. It still fails because nyacc (mirror://savannah/nyacc/nyacc-0.86.0.tar.gz) is missing. The source of net-tools-1.60-0.479bb4a.zip is also missing. And so are those of guile-git-0.3.0.tar.gz, guile-json-3.2.0.tar.gz, libuv-v1.30.1.tar.gz, rhash-1.3.8.tar.gz, scons-3.0.4-checkout, zstd-1.4.2.tar.gz, doxygen-1.8.15.src.tar.gz, flake8-3.7.7.tar.gz, hypothesis-4.18.3.tar.gz, more-itertools-7.1.0.tar.gz, pluggy-0.11.0.tar.gz, pytest-4.4.2.tar.gz (why are Python packages required for building a Guile program?), fonttools-3.38.0.zip, gobject-introspection-1.60.2.tar.xz, pbr-3.0.1.tar.gz, selinux-20170804-checkout, yelp-tools-3.28.0.tar.xz, yelp-xsl-3.32.1.tar.xz, po4a-0.57.tar.gz, fontforge-20190801.tar.gz, libspiro-dist-0.5.20150702.tar.gz, libuninameslist-dist-20190701.tar.gz, ruby-2.5.3.tar.xz, teckit-2.5.9.tar.gz, texlive-20180414-extra.tar.xz. Again, let us temporarily build the derivations and repeat. Another source is missing: static-binaries. The fourth issue is about holes in the Software Heritage and Disarchive coverage. Please note that these are few holes compared to the hundreds of required sources.

Issue 4 bis: Missing support of Subversion as Software Heritage fallback

Guix is not able to use Software Heritage when the version control system of the source code is Subversion. It’s known and we hit it!

svn: E170013: Unable to connect to a repository at URL 'svn://www.tug.org/texlive/tags/texlive-2018.2/Master/texmf-dist/source/generic/hyph-utf8'
svn: E670003: Unknown hostname 'www.tug.org'
Backtrace:
           2 (primitive-load "/gnu/store/dkx38h7m7c4gani34y025gcq8ym?")
In guix/build/svn.scm:
     39:2  1 (svn-fetch _ _ _ #:svn-command _ #:recursive? _ # _ # _)
In guix/build/utils.scm:
    652:6  0 (invoke _ . _)

guix/build/utils.scm:652:6: In procedure invoke:
Throw to key `srfi-34' with args `(#<condition &invoke-error [program: "/gnu/store/mk7hgz801cv730gfx63mv8z9wjzfs0jb-subversion-1.10.6/bin/svn" arguments: ("export" "--non-interactive" "--trust-server-cert" "-r" "49435" "svn://www.tug.org/texlive/tags/texlive-2018.2/Master/texmf-dist/source/generic/hyph-utf8" "/gnu/store/082v60by6rf5y0ai2jda9jv5bffdlcri-hyph-utf8-scripts-49435-checkout") exit-status: 1 term-signal: #f stop-signal: #f] 7ffff0014f80>)'.
builder for `/gnu/store/ik58fsf7j2h2n19p1hk422a7hvizj2pa-hyph-utf8-scripts-49435-checkout.drv' failed with exit code 1

Therefore, we add 46.4.94.215 www.tug.org to the allowed hosts. This allows downloading all the sources of the TeX Live packages required by the documentation.

Issue 5: Bootstrapping

Finally Guix starts building from the bootstrap. But it fails with the test suite of tcc-boot0-0.9.26-6.c004e9a.drv,

starting phase `check'
t: [FAIL]
02-return-1: [FAIL]
05-call-1: [FAIL]
07-include: [FAIL]
54-argc: [FAIL]
70-strchr: [FAIL]
91-fseek: [FAIL]
92-stat: [FAIL]
99-readdir: [FAIL]
22_floating_point: [FAIL]
23_type_coercion: [FAIL]
24_math_library: [FAIL]
34_array_assignment: [FAIL]
49_bracket_evaluation: [FAIL]
55_lshift_type: [FAIL]
expect: 14
failed: 15
passed: 209
total:  224
FAILED: 15/224
command "sh" "check.sh" failed with status 1

As a workaround, we fetch the substitutes for this derivation and repeat the same command line. Now, it fails on diffutils-mesboot-2.7.drv, then binutils-mesboot0-2.20.1a.drv, then gcc-core-mesboot-2.95.3.drv, then glibc-mesboot0-2.2.5.drv, then gcc-mesboot0-2.95.3.drv, then binutils-mesboot-2.20.1a.drv, etc. Well, we stop here. The fifth issue is about bootstrapping: it is not robust.

To bypass this, we turn the substitutes back on, using the network, and run these commands,

guix build /gnu/store/36fgj9n3c8bmix2pd12kfaszi7bd5y7a-ghostscript-9.27.drv --no-grafts
guix build /gnu/store/36fgj9n3c8bmix2pd12kfaszi7bd5y7a-ghostscript-9.27.drv --no-grafts --check

which populate the store.

Issue 6: Uncovered patches as source

Continuing the same procedure, we get this sixth issue:

failed to download "/gnu/store/i3avflhlz20ampw6v21s0wmqx0527xyi-icu4c-datetime-regression.patch"
from "https://github.com/unicode-org/icu/commit/7788f04eb9be0d7ecade6af46cf7b9825447763d.patch"

Similarly, the derivation icu4c-64.2.drv is built and checked. Then, a very similar issue is hit with icu4c-datetime-regression.patch and icu4c-locale-mapping.patch, so we manually build the derivations icu4c-datetime-regression.patch.drv and icu4c-locale-mapping.patch.drv.

This issue is fixed by patch#62036 for the current Guix revisions but not yet for the past ones. The way Guix feeds the Software Heritage archive should be improved here.

Issue 7: Time bomb

The seventh issue is a time bomb with the package gnutls-3.6.A. The test suite fails,

./scripts/common.sh: line 81: datefudge: command not found

You need datefudge to run this test
SKIP gnutls-cli-invalid-crl.sh (exit status: 77)

because the current time (June 2023) is unexpected by the test suite from 2020. Let us locally and temporarily set the clock back,

sudo timedatectl set-ntp false
sudo timedatectl set-time '2020-06-23 00:00:00'

and then restore the local time to the current one after building the derivation.

$ guix build /gnu/store/qm3l79ic89qpjjd8avqxd81425v4wvv5-gnutls-3.6.A.drv --no-substitutes -q
/gnu/store/rvs9n58xvz6xpk4ri658shcj3h9kznvy-gnutls-3.6.A-debug
/gnu/store/467xibzigp01g79vj11r3xycyjkwiq42-gnutls-3.6.A-doc
/gnu/store/zr6i9jnfv2sw00r59kdpk2jgkj98k3rp-gnutls-3.6.A
$ sudo timedatectl set-ntp true

The same error happens for openssl-1.1.1g and the same workaround works. The test suite of libgit2 fails with:

  1) Failure:
refs::revparse::date [/tmp/guix-build-libgit2-1.0.0.drv-0/libgit2-1.0.0/tests/refs/revparse.c:31]
  Function call succeeded: error
  no error, expected non-zero return

and the same workaround, temporarily changing the local time, allows building libgit2.

In summary, we hit 3 time bombs from the test suites.
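
The workaround can be wrapped so that NTP is always switched back on, even when the build fails. This is a sketch under assumptions: run and build_in_2020 are hypothetical helpers, the commands are only printed unless RUN=1 is set, and changing the system clock affects everything else running on the machine.

```shell
# Hypothetical dry-run wrapper: print the command, or execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# Build one derivation with the clock set back to June 2020,
# then re-enable NTP whatever the outcome of the build.
build_in_2020() {
    run sudo timedatectl set-ntp false
    run sudo timedatectl set-time '2020-06-23 00:00:00'
    run guix build "$1" --no-substitutes -q
    status=$?
    run sudo timedatectl set-ntp true
    return "$status"
}

build_in_2020 /gnu/store/qm3l79ic89qpjjd8avqxd81425v4wvv5-gnutls-3.6.A.drv
```

The same function would apply to the openssl and libgit2 derivations, given their own .drv paths.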

Issue 8: Hash mismatch between Guix and Software Heritage normalization

The eighth issue is about a hash mismatch for lz4-1.9.2-checkout. It means that the content archived in Software Heritage is different from the content used at package time.

Trying to download from Software Heritage...
SWH: found revision fdf2ef5809ca875c454510610764d9125ef2ebbd with directory at 'https://archive.softwareheritage.org/api/1/directory/8c4c3cacf90599887a5b02a46ec6f052f4422ef0/'
swh:1:dir:8c4c3cacf90599887a5b02a46ec6f052f4422ef0/
swh:1:dir:8c4c3cacf90599887a5b02a46ec6f052f4422ef0/.circleci/
swh:1:dir:8c4c3cacf90599887a5b02a46ec6f052f4422ef0/.circleci/config.yml
tar: swh:1:dir:8c4c3cacf90599887a5b02a46ec6f052f4422ef0/.circleci/config.yml: time stamp 2023-06-26 16:34:17 is 2854.263082558 s in the future
[...]
tar: swh:1:dir:8c4c3cacf90599887a5b02a46ec6f052f4422ef0: time stamp 2023-06-26 16:34:20 is 2857.251556658 s in the future
r:sha256 hash mismatch for /gnu/store/asvjidjr20hniips512mva8jrfd2zmy0-lz4-1.9.2-checkout:
  expected hash: 0lpaypmk70ag2ks3kf2dl4ac3ba40n5kc1ainkp9wfjawz76mh61
  actual hash:   0nygwna2sqa5jbsj51m6v5jznkgvwprkkznpdghc7y736fbq18lj
hash mismatch for store item '/gnu/store/asvjidjr20hniips512mva8jrfd2zmy0-lz4-1.9.2-checkout'

The issue is probably related to bug#61910 about CR/LF (end of line). In other words, Software Heritage applies a normalization and thus Guix must restore the state without this normalization, generally using the file .gitattributes. That behaviour was not implemented back in 2020 and some corner cases must be carefully checked.
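
The effect of such a normalization on checksums is easy to reproduce. The sketch below is only an illustration, not the code path of Guix or Software Heritage: Guix hashes a serialization of the whole checkout rather than a single file, but the principle is the same. The same line stored with LF and with CR/LF endings gives two different sha256 hashes.

```shell
# Read stdin, print only the hexadecimal sha256.
hash_of() { sha256sum | cut -d' ' -f1; }

lf_hash=$(printf 'hello\n' | hash_of)      # Unix line ending
crlf_hash=$(printf 'hello\r\n' | hash_of)  # DOS line ending

echo "LF:    $lf_hash"
echo "CR/LF: $crlf_hash"
```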

Running -- build -f guix.scm

Now that we have the exact same old version of Guix specified by the file channels.scm, we are able to use it in order to build the file guix.scm.

Issue 2: Option --no-substitutes

The command-line reads,

$ guix time-machine -C channels.scm --no-substitutes -- build -f guix.scm
guile: warning: failed to install locale
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%guix substitute: warning: ci.guix.gnu.org: host not found: Name or service not known
substitute:
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'...   0.0%guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known
substitute:
The following derivations will be built:
   /gnu/store/zyaczc476w26yqirl2gvg93j1k53mggp-wget-1.20.3.drv
   /gnu/store/2cpii8d00p3apcwsbdii1afhflk34g8j-perl-uri-1.76.drv
   /gnu/store/3v2figjf83iyll9wgzxqwgrri78fjqzk-URI-1.76.tar.gz.drv
   /gnu/store/wbz3g7f5i6y4rbwvy6zl8mnfavv7mq4p-perl-test-needs-0.002005.drv
   /gnu/store/l0vmj9qbsg79xcmdzzx2zdkn7n0x52h7-Test-Needs-0.002005.tar.gz.drv
   /gnu/store/4jskgq9bjm7dads1rwy0crfhn9mni890-perl-http-date-6.05.drv
   /gnu/store/6ksf611qw3v3bgqwm25jz3b5pp5dmmwr-wget-1.20.3.tar.lz.drv
   /gnu/store/6qws54m0gwlkkzy7a0p8iqpa5vdcmr64-perl-io-socket-ssl-2.066.drv
   /gnu/store/pyzm2z8vhin0h8xgvjwfkl61njvqmx1n-perl-net-ssleay-1.88.drv
   /gnu/store/b355vzfv85rpqc6idiyz8m9wa0maymk7-Net-SSLeay-1.88.tar.gz.drv
   /gnu/store/w1fj4ryrrhj5v16w62gkba1gnil7h6jm-IO-Socket-SSL-2.066.tar.xz.drv
   /gnu/store/988raa22zg4gz08n36a92q3cfx13176g-IO-Socket-SSL-2.066.tar.gz.drv
   /gnu/store/akcazb652j2kw67cgyq1g04qjcgd1927-perl-http-daemon-6.01.drv
   /gnu/store/144ak25554ck634b34va8jsay61z7yjg-HTTP-Daemon-6.01.tar.gz.drv
   /gnu/store/i8chh5qdhzna6qn097hafyz0vixilljk-perl-lwp-mediatypes-6.04.drv
   /gnu/store/b6v1cmfifbzg9r4z644b6gimg779nd55-perl-test-fatal-0.014.drv
   /gnu/store/zas9c8sywjr1nlafj9847y5sylaxlfiz-perl-try-tiny-0.30.drv
   /gnu/store/vvzg062yk4v83kv49w1zm66ga6jgj0pn-perl-http-message-6.18.drv
   /gnu/store/w1lcr2cyv45wmhk37q4rp7vb7faq36bz-perl-encode-locale-1.05.drv
   /gnu/store/xkpxa5f7w1v5cyqj57935cq80nhd701x-perl-io-html-1.00.drv
   /gnu/store/j4qazlzhica2azkb8z942086c777j8nd-libpsl-0.21.0.drv
   /gnu/store/cxivzld51l9f3zljgwlff77c1d0wz2if-libpsl-0.21.0.tar.gz.drv
   /gnu/store/qi2abjyqrvppi7xxp78xcn5lscgkp8nc-libchop-0.0.2006-0.feb8f6b.drv
   /gnu/store/16ag6jqwk9q4kw35alwiskhhz6f09xzd-guile-1.8.8.drv
   /gnu/store/0al2zprwkynq1mcjhxxzazfl5306f9x1-guile-1.8.8.tar.xz.drv
   /gnu/store/w9sf7zgw7cqanf5pjj073zp98dc2wlpl-guile-1.8.8.tar.gz.drv
   /gnu/store/1pxzlf7239d976zksg3z1a7djfxjj7v8-bdb-6.2.32.drv
   /gnu/store/1py5rdgmapdw7xqv5w5z1dk4l9hw6cwh-db-6.2.32.tar.gz.drv
   /gnu/store/89q7yf6jsyhhc0m3zfgyaxg09lfhcpy9-libtool-1.5.22.drv
   /gnu/store/rk31bw6vah9cigjsgfz7zp7az474v0d9-libtool-1.5.22.tar.gz.drv
   /gnu/store/8j1w842q07m8l8iw089b3x0kjkhvcz32-rpcsvc-proto-1.4.drv
   /gnu/store/8qvgp5z785mcw61h4afnhp6mah5hy967-automake-1.9.6.drv
   /gnu/store/77iiskpyhkslldqmq6cffl3lkdp67f0l-automake-1.9.6.tar.gz.drv
   /gnu/store/axfh3j24c0xhna3pwsp0d87kwr4wi23a-autoconf-2.59.drv
   /gnu/store/76dfv649gmpmq4a1bfy37bdsz8pzc5nr-autoconf-2.59.tar.gz.drv
   /gnu/store/f1qc11kclgkjl2jazszbjs2ilmii6ycl-libtirpc-minimal-1.2.5.drv
   /gnu/store/hb0d7j0jyhjzxy5fkipb3hjc19gcpp37-tdb-1.4.3.drv
   /gnu/store/j6pr7bxypknbj77k4ha59gm4plrml8vv-gperf-3.0.4.drv
   /gnu/store/hpf0ks26v7h0i44cczhbl75ms832qndk-gperf-3.0.4.tar.gz.drv
   /gnu/store/kb9j26232db2bdx5s52dwm3la7ygam25-libchop-0.0.2006-0.feb8f6b.tar.xz.drv
   /gnu/store/62lm0pq5vr2fya7brjh2lsdppvi8fihq-libchop-0.0.2006-0.feb8f6b-checkout.drv
   /gnu/store/9c7115yfbdg9bk5ywkn44rph0vq550rj-module-import.drv
   /gnu/store/6ax9kkgagww8xm11554rfvpy346j5vhh-config.scm.drv
   /gnu/store/9zyvbxa9i5q3winl1x6fjsdfkil94za7-module-import-compiled.drv
   /gnu/store/ly9ymvzh2lg39z10xbhd9wbjqmv0sg9m-e2fsprogs-1.45.6.drv
   /gnu/store/696rz8qbnk62igvna9bdzi25k9z9gkf8-e2fsprogs-1.45.6.tar.xz.drv
   /gnu/store/jyr2a8v4149zc4frqpa780cmf1s9dh6z-procps-3.3.16.drv
   /gnu/store/bdawiywqk44kq93gq9dxm44bvk370b03-procps-ng-3.3.16.tar.xz.drv
   /gnu/store/mjk671mfgrfhbssf4xl0alnscw9p4lnc-g-wrap-guile18-1.9.7.drv
   /gnu/store/j5fzdhfgifwk6n4y22afq813s5panbi1-g-wrap-1.9.7.tar.gz.drv
   /gnu/store/r7gggq1z75s23kif6pypc2w5vvz44pfp-guile1.8-lib-0.1.3.drv
   /gnu/store/higni0l3kyg40achyq9zz7dzzn82z1w0-guile1.8-lib-0.1.3-checkout.drv
   /gnu/store/w26azcw027pd1x252pc3v0kv0p91l843-texinfo-4.13a.drv
   /gnu/store/l5bp1h43s34h17mrcdzqhqnbp5y8ixjy-texinfo-4.13a.tar.lzma.drv
1,6 MB will be downloaded:
   /gnu/store/mppxcmw3iwcl3kd5azr48m5858nqb2f6-tdb-1.4.3.tar.gz
   /gnu/store/bg6jwbml0h7k3da69asqddw4ciy7hkq5-libtirpc-1.2.5.tar.bz2
   /gnu/store/hk0lfyvkv0721g5nc347r4g9jh3rczbv-rpcsvc-proto-1.4.tar.xz
   /gnu/store/i4z59xblbn891z4y202nxvxy7y6hyb6a-IO-HTML-1.00.tar.gz
   /gnu/store/a0jdvghwf2pjvbn3mm8jgzlkmbr9mr0w-Encode-Locale-1.05.tar.gz
   /gnu/store/9ii5xjpwbkq047n9p4gyz0scj9r1h6wv-HTTP-Message-6.18.tar.gz
   /gnu/store/2jqvj76fy8rrsi5vnal381hid4hdy170-LWP-MediaTypes-6.04.tar.gz
   /gnu/store/0llcr023q9dxdkr685kdc7nlq0ppsm89-Try-Tiny-0.30.tar.gz
   /gnu/store/56qijsj09i9awkzpd1masb0mq43bcfz5-Test-Fatal-0.014.tar.gz
   /gnu/store/wm3l2cblmzry265v849g36f23ms829qh-HTTP-Date-6.05.tar.gz
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'...   0.0%
[...]
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
updating substitutes from 'https://bordeaux.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'...   0.0%
substituting /gnu/store/a0jdvghwf2pjvbn3mm8jgzlkmbr9mr0w-Encode-Locale-1.05.tar.gz...
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'...   0.0%
[...]
substitute: updating substitutes from 'https://ci.guix.gnu.org'...   0.0%
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'...   0.0%
retrying download of '/gnu/store/a0jdvghwf2pjvbn3mm8jgzlkmbr9mr0w-Encode-Locale-1.05.tar.gz' with other substitute URLs...
guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known
guix substitute: error: failed to find alternative substitute for '/gnu/store/a0jdvghwf2pjvbn3mm8jgzlkmbr9mr0w-Encode-Locale-1.05.tar.gz'
substitution of /gnu/store/a0jdvghwf2pjvbn3mm8jgzlkmbr9mr0w-Encode-Locale-1.05.tar.gz failed
guix build: error: corrupt input while restoring archive from #<closed: file 7f3ea3ae9070>

with the same error as previously. We also need to pass the option --no-substitutes to the build action, after the -- separator.
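
Putting both observations together, the option appears twice: once for guix time-machine itself and once for the build action. The sketch below only prints the command unless RUN=1 is set; the run wrapper is a hypothetical helper, not part of Guix.

```shell
# Hypothetical dry-run wrapper: print the command, or execute it when RUN=1.
run() {
    if [ "${RUN:-0}" = 1 ]; then "$@"; else printf '%s\n' "$*"; fi
}

# First --no-substitutes: building the 2020 Guix itself.
# Second --no-substitutes: building guix.scm with that Guix.
run guix time-machine -C channels.scm --no-substitutes \
    -- build -f guix.scm --no-substitutes
```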

Issue 4: Missing source in Software Heritage and/or Disarchive database

The sources of these packages are missing: g-wrap-1.9.7.tar.gz, libtirpc-1.2.5.tar.bz2, tdb-1.4.3.tar.gz, guile-cairo-1.10.0.tar.gz, guile-lib-0.2.6.1.tar.gz and guile-charting-0.2.0.tar.gz. For all of them, the content seems to be stored in Software Heritage but, because they are compressed archives, their intrinsic identifier (integrity checksum) is not enough here to retrieve this content.

Consider one specific example. The paper requires the Lout batch document formatter at various versions: from 3.20 to 3.29. All these ten versions are compressed tarballs. What is interesting is that, for 8 of them, the compressed tarball is downloaded from Software Heritage without any specific problem. For instance, version 3.21,

Starting download of /gnu/store/85rzg535gscw7sf6q4wrlnq1yq3v0xzk-lout-3.21.tar.gz
From https://archive.softwareheritage.org/api/1/content/sha256:098467c7f747cf5bd1cf966270384d0c3f8c795b843cbb0304728e118909b7ce/raw/...
downloading from https://archive.softwareheritage.org/api/1/content/sha256:098467c7f747cf5bd1cf966270384d0c3f8c795b843cbb0304728e118909b7ce/raw/ ...
 raw/  1.7MiB                                                                                                     10.3MiB/s 00:00 ▕██████████████████▏ 100.0%
successfully built /gnu/store/9cip8x8hql06mw2vaj0vh59k8ka435cr-lout-3.21.tar.gz.drv

However, for the versions 3.20 and 3.28, it fails with:

Starting download of /gnu/store/yp7vj9hgzqz92vhw0wi17nl78m722bzw-lout-3.20.tar.gz
From https://archive.softwareheritage.org/api/1/content/sha256:af62b850b8b410d427049f1152fa0217fc7ed77d4cd3ec73e0f30e2aa644926b/raw/...
download failed "https://archive.softwareheritage.org/api/1/content/sha256:af62b850b8b410d427049f1152fa0217fc7ed77d4cd3ec73e0f30e2aa644926b/raw/" 404 "Not Found"

Starting download of /gnu/store/yp7vj9hgzqz92vhw0wi17nl78m722bzw-lout-3.20.tar.gz
From https://web.archive.org/web/20230630183701/http://download.savannah.gnu.org/releases/lout/lout-3.20.tar.gz...
In procedure getaddrinfo: Name or service not known
Trying to use Disarchive to assemble /gnu/store/yp7vj9hgzqz92vhw0wi17nl78m722bzw-lout-3.20.tar.gz...
could not find its Disarchive specification
failed to download "/gnu/store/yp7vj9hgzqz92vhw0wi17nl78m722bzw-lout-3.20.tar.gz" from "mirror://savannah/lout/lout-3.20.tar.gz"
builder for `/gnu/store/y8lg3kdd12if1rnhfqdipcmrg16n98lj-lout-3.20.tar.gz.drv' failed to produce output path `/gnu/store/yp7vj9hgzqz92vhw0wi17nl78m722bzw-lout-3.20.tar.gz'
build of /gnu/store/y8lg3kdd12if1rnhfqdipcmrg16n98lj-lout-3.20.tar.gz.drv failed

In other words, it is not straightforward to guarantee a robust coverage with a fallback to a supported archive such as Software Heritage. This is where Disarchive shines and needs a lot of love! Here the Disarchive specification is missing, hence the hole.
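
The URLs in the logs above suggest a quick way to check whether a given tarball is directly available: query the Software Heritage content endpoint with the sha256 of the compressed file. This is a rough sketch, where swh_content_url and swh_has_content are hypothetical helpers and the URL pattern is simply copied from the download log; the actual check needs network access and curl.

```shell
# Build the raw-content URL for a hexadecimal sha256 checksum,
# following the pattern seen in the download log above.
swh_content_url() {
    printf 'https://archive.softwareheritage.org/api/1/content/sha256:%s/raw/\n' "$1"
}

# Succeed when the archive serves that content, fail on an HTTP error
# such as the 404 seen for lout-3.20. Only defined here, not called,
# because it needs network access.
swh_has_content() {
    curl --silent --fail --output /dev/null "$(swh_content_url "$1")"
}

# lout-3.21.tar.gz, found in the log above.
swh_content_url 098467c7f747cf5bd1cf966270384d0c3f8c795b843cbb0304728e118909b7ce
```

When such a check fails, the only remaining path is a Disarchive specification for that tarball.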

Not an issue: access to the data

All the previous Lout versions are considered as data for the paper. The paper also considers other data, such as ten .ogg files. These files are not content-addressed (intrinsic identifier) although the paper provides integrity checksums. In other words, if the URL of these files is gone then the result of the paper cannot be verified. Well, that’s another story.

And we are done!

Yeah, we get a report that looks very similar to the PDF. Awesome!

Join the fun, join Guix in scientific context!

Again, we did a tour of the pieces which need some love; we have focused on the broken corner cases because they are visible. Please note that everything else is invisible and just runs out of the box with Guix. That’s impressive!


© 2014-2024 Simon Tournier <simon (at) tournier.info >

(last update: 2024-11-01 Fri 11:31)