Redoing one paper from ReScience C back on 2020
Paper from Ten Years Challenge: volume 6, issue 1, number 6
Note: To our knowledge, rebuilding everything from source for transparency, starting from the most minimal binary footprint and using almost exclusively the code archived in Software Heritage, is impossible except with Guix.
We think that Guix is a suitable framework for running scientific computations. The aim of this post is to point out the roadblocks between Guix and such robust scientific computations. Here, robust means that two independent observers are able to verify the same result. In particular, we underlined in this Café Guix the conditions Guix must have at hand to allow a reproducible computational deployment environment:
- all the source code
- backward-compatibility of the Linux kernel
- some compatibility of the hardware (CPU, etc.)
- no time bomb (hidden in the test suite or other)
The core question is: how large is the time window during which all four conditions hold? To my knowledge, the Guix project is unique in experimenting for real with this window size since v1.0 in 2019. This post is thus a concrete, real example showing what is missing (and what is not!) with three years between the two observations (2020-2023).
Use case scenario
Let us consider a gold standard: an old paper (2006) replicated in 2020, [Re] Storage Tradeoffs in a Collaborative Backup Service for Mobile Devices by Ludovic Courtès, published as part of the Ten Years Reproducibility Challenge organized by the online journal ReScience C. For details, please have a look at the PDF and the Git repository of this article.
It is a gold standard because this paper uses Guix end-to-end. The command line,
$ guix time-machine -C channels.scm -- build -f guix.scm
should compile all the requirements, run all the experiments and finally generate the final report. In a way, we are considering two parts:
- guix time-machine, which is purely Guix-specific,
- -- build, which uses Guix for compiling, running and generating the report.
Note: Please consider that running the command line above today (2023) generates the same computational environment as it was in 2020. By default, and considering the current state of all the various servers, it just works out of the box. And that’s awesome!
Extreme worst-case setup
Because robustness is the key when speaking about reproducibility, we stretch our attempt by assuming that all the various servers have disappeared and are unreachable.
In other words, we only assume that Software Heritage, the universal software archive, is available. Since Software Heritage removes some metadata (e.g., compressor information) to archive only the content, the tarball that Software Heritage returns does not necessarily match the checksum known at package time, because of that missing metadata. Disarchive builds a database containing this metadata, and thus a map from this checksum to the content stored in Software Heritage. Using this Disarchive database and the content archived in Software Heritage, Guix is able to rebuild the exact same source (tarball). Concretely, we disable and stop systemd-resolved.service and manually declare only these two servers,
128.93.166.15 archive.softwareheritage.org
141.80.181.40 disarchive.guix.gnu.org
Therefore, we are going to check whether the combination Software Heritage + Disarchive is operational for rebuilding from scratch more than 422 source codes. Our aim is to identify the holes in order to fix them.
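To see why compressor metadata matters for checksums, here is a minimal Python sketch. It is not how Disarchive works internally; it only assumes a made-up payload and varies a single piece of gzip metadata (the timestamp stored in the header) to show that identical content can yield different tarball checksums.

```python
import gzip
import hashlib

# Hypothetical payload standing in for the uncompressed source content.
content = b"example source file contents\n" * 100

# Compress the same content twice, varying only the timestamp stored in
# the gzip header (one piece of compressor metadata).
tarball_a = gzip.compress(content, mtime=0)
tarball_b = gzip.compress(content, mtime=1)

checksum_a = hashlib.sha256(tarball_a).hexdigest()
checksum_b = hashlib.sha256(tarball_b).hexdigest()

# The decompressed content is identical, the compressed files are not.
same_content = gzip.decompress(tarball_a) == gzip.decompress(tarball_b)
same_checksum = checksum_a == checksum_b

print("same content: ", same_content)
print("same checksum:", same_checksum)
```

An archive that keeps only the content therefore needs the metadata recorded somewhere else in order to regenerate a byte-identical compressed file, which is the role the post attributes to the Disarchive database.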
The first annoyance is that guix time-machine needs access to the server git.savannah.gnu.org, although the Git repository is already cloned and already contains the required commit. For instance,
$ guix describe
Generation 25 mai 19 2023 13:30:14 (current)
guix 14c0380
repository URL: https://git.savannah.gnu.org/git/guix.git
branch: master
commit: 14c03807ba4bc81d42cf869f5b827f7da54ff843
$ guix time-machine --commit=14c0380 -- describe
guix time-machine: error: Git error: failed to resolve address for git.savannah.gnu.org: Name or service not known
$ git -C ~/.cache/guix/checkouts/pjmkglp4t7znuugeurpurzikxq3tnlaywmisyr27shj7apsnalwq \
show 14c0380 | grep commit
commit 14c03807ba4bc81d42cf869f5b827f7da54ff843
Another annoyance concerns the source of some packages deep in the dependency graph: ed, gcc-core@2.95.3, ghostscript, guile@2.2.6, linux-libre, linux-libre-headers-stripped, mes, mes-minimal-stripped, etc., as explained in this thread on the guix-devel mailing list. These packages are part of the bootstrap and their source code is a tarball archive located on ftp.gnu.org. We already know that the coverage is weak here.
In addition to the two servers allowed above, these two servers are also manually added,
209.51.188.168 git.savannah.gnu.org
209.51.188.20 ftp.gnu.org
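As a rough model of this restricted setup, the Python sketch below renders the four host entries listed in this post and answers whether a given URL would resolve. The helper names `render_hosts` and `is_resolvable` are ours, for illustration only; the real setup is just static host entries with the resolver service stopped.

```python
from urllib.parse import urlparse

# Host names and addresses copied from the entries listed in this post.
ALLOWED_HOSTS = {
    "archive.softwareheritage.org": "128.93.166.15",
    "disarchive.guix.gnu.org": "141.80.181.40",
    "git.savannah.gnu.org": "209.51.188.168",
    "ftp.gnu.org": "209.51.188.20",
}


def render_hosts(entries):
    """Return hosts-file lines, one 'ADDRESS NAME' pair per line."""
    return "\n".join(f"{addr} {name}" for name, addr in entries.items())


def is_resolvable(url, entries=ALLOWED_HOSTS):
    """Model name resolution in this setup: only listed hosts resolve."""
    return urlparse(url).hostname in entries


print(render_hosts(ALLOWED_HOSTS))
print(is_resolvable("https://ci.guix.gnu.org"))
print(is_resolvable("https://archive.softwareheritage.org/api/1/"))
```

With this model, the substitute servers are unreachable while the archive is reachable, which matches the "host not found" warnings that appear in the transcripts of this post.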
That said, let’s go!
Running guix time-machine -C channels.scm
Issues 1 and 2: Options --fallback and --no-substitutes
We start with the simplest: run one previous version of Guix. This version is described by the file channels.scm. This file contains two channels at pinned revisions. For instance, it specifies the commit 40fd909e3ddee2c46a27a4fe92ed49d3e7ffb413 from April 24th, 2020. It means that, using the current Guix revision 14c0380 installed (pulled) on my machine on May 19th, 2023, this guix time-machine command should display the help message as it was on April 24th, 2020.
$ guix time-machine --commit=40fd909e3ddee2c46a27a4fe92ed49d3e7ffb413 -- help
Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 0.0%guix substitute: warning: ci.guix.gnu.org: host not found: Name or service not known
substitute:
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 0.0%guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known
substitute: retrying download of '/gnu/store/…-config.scm' with other substitute URLs...
guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known
guix substitute: error: failed to find alternative substitute for '/gnu/store/…-config.scm'
substitution of /gnu/store/…-config.scm failed
building /gnu/store/…-config.scm.drv...
guix time-machine: error: some substitutes for the outputs of derivation `/gnu/store/…-module-import-compiled.drv' failed (usually happens due to networking issues); try `--fallback' to build derivation from source
Ah, an error! This first issue is annoying: the fallback should be transparent. Trying the recommended --fallback with a manual invocation reads,
$ guix time-machine --commit=40fd909e3ddee2c46a27a4fe92ed49d3e7ffb413 --fallback -- help
[...]
substitute: updating substitutes from 'https://ci.guix.gnu.org'... 0.0%
substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 0.0%
retrying download of '/gnu/store/…-module-import-compiled' with other substitute URLs...
guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known
guix substitute: error: failed to find alternative substitute for '/gnu/store/…-module-import-compiled'
substitution of /gnu/store/…-module-import-compiled failed
guix time-machine: error: corrupt input while restoring archive from #<closed: file 7fa4f6e798c0>
Ah, another error! This second issue is also annoying because the previous hint is inaccurate. Instead of --fallback, let us use --no-substitutes: now it starts… and fails, but for other reasons that we are going to investigate.
Issue 3: Inconsistent message for substitutes and/or fallback
In addition to the Guix channel itself, we consider all the channels listed in the file channels.scm. The command line guix time-machine -C channels.scm generates the old versions of Guix itself and of the other channels, both corresponding to the exact same state as on April 24th, 2020. The output should be the help message. Instead, it displays:
$ guix time-machine -C channels.scm -- help
Updating channel 'guix-past' from Git repository at 'https://gitlab.inria.fr/guix-hpc/guix-past.git'...
SWH: found revision 4c3923dc0114f4669fbd99c5a09a443d3eb5f4d6 with directory at 'https://archive.softwareheritage.org/api/1/directory/057e6655d7240135862f9dd7da59c75d64db34b8/'
SWH vault: requested bundle cooking, waiting for completion...
swh:1:rev:4c3923dc0114f4669fbd99c5a09a443d3eb5f4d6.git/
[...]
Updating channel 'guix' from Git repository at 'https://git.savannah.gnu.org/git/guix.git'...
WARNING: (guix build emacs-build-system): imported module (guix build utils) overrides core binding `delete'
Computing Guix derivation for 'x86_64-linux'... |
@ substituter-started /gnu/store/l1iakyjw5lacjbnynm6z7b31clyh1llx-ghostscript-9.27-doc substitute
retrying download of '/gnu/store/l1iakyjw5lacjbnynm6z7b31clyh1llx-ghostscript-9.27-doc' with other substitute URLs...
guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known
guix substitute: error: failed to find alternative substitute for '/gnu/store/l1iakyjw5lacjbnynm6z7b31clyh1llx-ghostscript-9.27-doc'
@ substituter-failed /gnu/store/l1iakyjw5lacjbnynm6z7b31clyh1llx-ghostscript-9.27-doc fetching path `/gnu/store/l1iakyjw5lacjbnynm6z7b31clyh1llx-ghostscript-9.27-doc' (empty status: '')
Backtrace:
13 (primitive-load "/gnu/store/hbnmq1p5pnj55id6547h2bhvs15z4lg2-compute-guix-derivation")
In ice-9/eval.scm:
155:9 12 (_ _)
159:9 11 (_ #(#(#(#(#(#(#(#(#(#(#(#(#(#<directory (guile-user) 7f5082df7c?> ?) ?) ?) ?) ?) ?) ?) ?) ?) ?) ?) ?) ?))
In ./guix/store.scm:
1975:24 10 (run-with-store #<store-connection 256.99 7f5082e39460> #<procedure 7f5074b81980 at ./guix/self.scm:11?> ?)
1811:8 9 (_ #<store-connection 256.99 7f5082e39460>)
In ./guix/gexp.scm:
961:2 8 (_ #<store-connection 256.99 7f5082e39460>)
821:2 7 (_ #<store-connection 256.99 7f5082e39460>)
In ./guix/store.scm:
1859:12 6 (_ #<store-connection 256.99 7f5082e39460>)
1312:5 5 (map/accumulate-builds #<store-connection 256.99 7f5082e39460> #<procedure 7f5074c2a540 at ./guix/stor?> ?)
1323:15 4 (_ #<store-connection 256.99 7f5082e39460> ("/gnu/store/36fgj9n3c8bmix2pd12kfaszi7bd5y7a-ghostscrip?" ?) ?)
1323:15 3 (loop #f)
711:11 2 (process-stderr #<store-connection 256.99 7f5082e39460> _)
In ./guix/serialization.scm:
101:11 1 (read-int #<input-output: file 10>)
79:6 0 (get-bytevector-n* #<input-output: file 10> 8)
./guix/serialization.scm:79:6: In procedure get-bytevector-n*:
ERROR:
1. &nar-error:
file: #f
port: #<input-output: file 10>
guix time-machine: error: You found a bug: the program '/gnu/store/hbnmq1p5pnj55id6547h2bhvs15z4lg2-compute-guix-derivation' failed to compute the derivation for Guix (version: "40fd909e3ddee2c46a27a4fe92ed49d3e7ffb413"; system: "x86_64-linux"; host version: "14c03807ba4bc81d42cf869f5b827f7da54ff843"; pull-version: 1).
Please report it by email to <bug-guix@gnu.org>.
Before the backtrace, it is awesome! The Git repository of the channel guix-past is unreachable but the content is available in Software Heritage. Transparently, Guix clones from Software Heritage. Great job!
However, this third issue is a bit more cryptic than the previous ones. Well, let us try the option --fallback… similar error. So, let us try the option --no-substitutes.
Issue 4: Missing source in Software Heritage and/or Disarchive database
Now it almost passes, needing the server ftp.gnu.org as noted above. It still fails because linux-libre-4.19.56-gnu.tar.xz.drv cannot be built; in other words, the source
("url", "(\"https://linux-libre.fsfla.org/pub/linux-libre/releases/4.19.56-gnu/linux-libre-4.19.56-gnu.tar.xz\" \"ftp://alpha.gnu.org/gnu/guix/mirror/linux-libre-4.19.56-gnu.tar.xz\" \"mirror://gnu/linux-libre/4.19.56-gnu/linux-libre-4.19.56-gnu.tar.xz\")")
is unreachable. Let us temporarily set up the network and build this derivation,
$ guix build /gnu/store/qwbmqzyqv8nl39pkmzyp268lcnjrhrvs-linux-libre-4.19.56-gnu.tar.xz.drv
101,7 MB will be downloaded:
/gnu/store/ap6nhyxjy61pmnjph4xbj3bdjx7m1zj2-linux-libre-4.19.56-gnu.tar.xz
substituting /gnu/store/ap6nhyxjy61pmnjph4xbj3bdjx7m1zj2-linux-libre-4.19.56-gnu.tar.xz...
downloading from https://ci.guix.gnu.org/nar/ap6nhyxjy61pmnjph4xbj3bdjx7m1zj2-linux-libre-4.19.56-gnu.tar.xz ...
linux-libre-4.19.56-gnu.tar.xz 96.9MiB 2.4MiB/s 00:40 ▕██████████████████▏ 100.0%
/gnu/store/ap6nhyxjy61pmnjph4xbj3bdjx7m1zj2-linux-libre-4.19.56-gnu.tar.xz
and run the guix time-machine --no-substitutes command line again. It still fails because nyacc (mirror://savannah/nyacc/nyacc-0.86.0.tar.gz) is missing. The source of net-tools-1.60-0.479bb4a.zip is also missing. And those of guile-git-0.3.0.tar.gz, guile-json-3.2.0.tar.gz, libuv-v1.30.1.tar.gz, rhash-1.3.8.tar.gz, scons-3.0.4-checkout, zstd-1.4.2.tar.gz, doxygen-1.8.15.src.tar.gz, flake8-3.7.7.tar.gz, hypothesis-4.18.3.tar.gz, more-itertools-7.1.0.tar.gz, pluggy-0.11.0.tar.gz, pytest-4.4.2.tar.gz (why are Python packages required for building a Guile program?), fonttools-3.38.0.zip, gobject-introspection-1.60.2.tar.xz, pbr-3.0.1.tar.gz, selinux-20170804-checkout, yelp-tools-3.28.0.tar.xz, yelp-xsl-3.32.1.tar.xz, po4a-0.57.tar.gz, fontforge-20190801.tar.gz, libspiro-dist-0.5.20150702.tar.gz, libuninameslist-dist-20190701.tar.gz, ruby-2.5.3.tar.xz, teckit-2.5.9.tar.gz, texlive-20180414-extra.tar.xz.
Again, let us temporarily build the derivations and repeat. Another source is missing: static-binaries. The fourth issue is about holes in the Software Heritage and Disarchive coverage. Please note that these are few holes compared to the hundreds of required source codes.
Issue 4 bis: Missing support of Subversion as Software Heritage fallback
Guix is not able to use Software Heritage when the version control system of the source code is Subversion. It’s known and we hit it!
svn: E170013: Unable to connect to a repository at URL 'svn://www.tug.org/texlive/tags/texlive-2018.2/Master/texmf-dist/source/generic/hyph-utf8'
svn: E670003: Unknown hostname 'www.tug.org'
Backtrace:
2 (primitive-load "/gnu/store/dkx38h7m7c4gani34y025gcq8ym?")
In guix/build/svn.scm:
39:2 1 (svn-fetch _ _ _ #:svn-command _ #:recursive? _ # _ # _)
In guix/build/utils.scm:
652:6 0 (invoke _ . _)
guix/build/utils.scm:652:6: In procedure invoke:
Throw to key `srfi-34' with args `(#<condition &invoke-error [program: "/gnu/store/mk7hgz801cv730gfx63mv8z9wjzfs0jb-subversion-1.10.6/bin/svn" arguments: ("export" "--non-interactive" "--trust-server-cert" "-r" "49435" "svn://www.tug.org/texlive/tags/texlive-2018.2/Master/texmf-dist/source/generic/hyph-utf8" "/gnu/store/082v60by6rf5y0ai2jda9jv5bffdlcri-hyph-utf8-scripts-49435-checkout") exit-status: 1 term-signal: #f stop-signal: #f] 7ffff0014f80>)'.
builder for `/gnu/store/ik58fsf7j2h2n19p1hk422a7hvizj2pa-hyph-utf8-scripts-49435-checkout.drv' failed with exit code 1
Therefore, we add 46.4.94.215 www.tug.org to the allowed hosts. It eases downloading all the sources of the TeX Live packages required by the documentation.
Issue 5: Bootstrapping
Finally, Guix starts building from the bootstrap. But it fails with the test suite of tcc-boot0-0.9.26-6.c004e9a.drv,
starting phase `check'
t: [FAIL]
02-return-1: [FAIL]
05-call-1: [FAIL]
07-include: [FAIL]
54-argc: [FAIL]
70-strchr: [FAIL]
91-fseek: [FAIL]
92-stat: [FAIL]
99-readdir: [FAIL]
22_floating_point: [FAIL]
23_type_coercion: [FAIL]
24_math_library: [FAIL]
34_array_assignment: [FAIL]
49_bracket_evaluation: [FAIL]
55_lshift_type: [FAIL]
expect: 14
failed: 15
passed: 209
total: 224
FAILED: 15/224
command "sh" "check.sh" failed with status 1
As a workaround, we fetch the substitutes for this derivation and repeat the same command line. Now it fails on diffutils-mesboot-2.7.drv, then binutils-mesboot0-2.20.1a.drv, then gcc-core-mesboot-2.95.3.drv, then glibc-mesboot0-2.2.5.drv, then gcc-mesboot0-2.95.3.drv, then binutils-mesboot-2.20.1a.drv, etc. Well, we stop here. The fifth issue is about bootstrapping: it is not robust.
To bypass it, we turn the substitutes back on using the network and run these commands,
guix build /gnu/store/36fgj9n3c8bmix2pd12kfaszi7bd5y7a-ghostscript-9.27.drv --no-grafts
guix build /gnu/store/36fgj9n3c8bmix2pd12kfaszi7bd5y7a-ghostscript-9.27.drv --no-grafts --check
which populate the store.
Issue 6: Uncovered patches as source
Continuing the same procedure, we get this sixth issue:
failed to download "/gnu/store/i3avflhlz20ampw6v21s0wmqx0527xyi-icu4c-datetime-regression.patch" from "https://github.com/unicode-org/icu/commit/7788f04eb9be0d7ecade6af46cf7b9825447763d.patch"
Similarly, the derivation icu4c-64.2.drv is built and checked. Then a very similar issue is hit again with icu4c-datetime-regression.patch and icu4c-locale-mapping.patch, and we manually build the derivations icu4c-datetime-regression.patch.drv and icu4c-locale-mapping.patch.drv.
This issue is fixed by patch#62036 for the current Guix revisions but not yet for the past ones. The Guix way for feeding the Software Heritage archive should be improved here.
Issue 7: Time bomb
The seventh issue is a time bomb with the package gnutls-3.6.A. The test suite fails,
./scripts/common.sh: line 81: datefudge: command not found
You need datefudge to run this test
SKIP gnutls-cli-invalid-crl.sh (exit status: 77)
because the test suite from 2020 does not expect the current time (June 2023). Let us locally and temporarily reset the time,
sudo timedatectl set-ntp false
sudo timedatectl set-time '2020-06-23 00:00:00'
and then set the local time back to the current one after building the derivation.
$ guix build /gnu/store/qm3l79ic89qpjjd8avqxd81425v4wvv5-gnutls-3.6.A.drv --no-substitutes -q
/gnu/store/rvs9n58xvz6xpk4ri658shcj3h9kznvy-gnutls-3.6.A-debug
/gnu/store/467xibzigp01g79vj11r3xycyjkwiq42-gnutls-3.6.A-doc
/gnu/store/zr6i9jnfv2sw00r59kdpk2jgkj98k3rp-gnutls-3.6.A
$ sudo timedatectl set-ntp true
The same error happens for openssl-1.1.1g and the same workaround works.
The test suite of libgit2 fails with:
1) Failure:
refs::revparse::date [/tmp/guix-build-libgit2-1.0.0.drv-0/libgit2-1.0.0/tests/refs/revparse.c:31]
Function call succeeded: error
no error, expected non-zero return
and the same workaround, temporarily changing the local time, allows building libgit2.
In summary, we hit 3 time bombs from the test suites.
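To make the notion of time bomb concrete, here is a minimal Python sketch of one common shape such a bomb can take: a check that implicitly compares a fixed expiry date against the build date. The expiry date and the function names are hypothetical; the post does not say which dates the affected test suites actually depend on.

```python
from datetime import date

# Hypothetical expiry date of a certificate shipped with a test suite;
# the real dates used by the affected packages are not given in this post.
CERT_NOT_AFTER = date(2021, 1, 1)


def certificate_is_valid(today):
    """Return True while the bundled certificate has not expired."""
    return today <= CERT_NOT_AFTER


def run_check(today):
    """Mimic a test whose outcome implicitly depends on the build date."""
    return "PASS" if certificate_is_valid(today) else "FAIL"


print(run_check(date(2020, 6, 23)))  # clock set back, as in the workaround
print(run_check(date(2023, 6, 26)))  # clock at rebuild time
```

The source code is unchanged between the two runs; only the clock differs, which is why setting the system time back lets the build go through.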
Issue 8: Hash mismatch between Guix and Software Heritage normalization
The eighth issue is a hash mismatch for lz4-1.9.2-checkout. It means that the content archived in Software Heritage is different from the content used at package time.
Trying to download from Software Heritage...
SWH: found revision fdf2ef5809ca875c454510610764d9125ef2ebbd with directory at 'https://archive.softwareheritage.org/api/1/directory/8c4c3cacf90599887a5b02a46ec6f052f4422ef0/'
swh:1:dir:8c4c3cacf90599887a5b02a46ec6f052f4422ef0/
swh:1:dir:8c4c3cacf90599887a5b02a46ec6f052f4422ef0/.circleci/
swh:1:dir:8c4c3cacf90599887a5b02a46ec6f052f4422ef0/.circleci/config.yml
tar: swh:1:dir:8c4c3cacf90599887a5b02a46ec6f052f4422ef0/.circleci/config.yml: time stamp 2023-06-26 16:34:17 is 2854.263082558 s in the future
[...]
tar: swh:1:dir:8c4c3cacf90599887a5b02a46ec6f052f4422ef0: time stamp 2023-06-26 16:34:20 is 2857.251556658 s in the future
r:sha256 hash mismatch for /gnu/store/asvjidjr20hniips512mva8jrfd2zmy0-lz4-1.9.2-checkout:
expected hash: 0lpaypmk70ag2ks3kf2dl4ac3ba40n5kc1ainkp9wfjawz76mh61
actual hash: 0nygwna2sqa5jbsj51m6v5jznkgvwprkkznpdghc7y736fbq18lj
hash mismatch for store item '/gnu/store/asvjidjr20hniips512mva8jrfd2zmy0-lz4-1.9.2-checkout'
The issue is probably related to bug#61910 about CR/LF (end of line). In other words, Software Heritage applies normalization and thus Guix must restore the state without this normalization, generally using the file .gitattributes. That behaviour was not implemented back in 2020 and some corner cases must be carefully checked.
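The effect of end-of-line normalization on a checksum can be sketched in a few lines of Python. This is only an illustration on a made-up single file with a plain SHA-256; Guix hashes the whole checkout directory with its own serialization, and the exact normalization at play in bug#61910 may differ.

```python
import hashlib

# Hypothetical file content as checked out at package time, with CR/LF
# line endings.
packaged = b"line one\r\nline two\r\n"

# The same file after end-of-line normalization to LF.
normalized = packaged.replace(b"\r\n", b"\n")

# Undoing the normalization recovers the packaged bytes for this file.
restored = normalized.replace(b"\n", b"\r\n")


def sha256_hex(data):
    """Return the hexadecimal SHA-256 digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


print(sha256_hex(packaged) == sha256_hex(normalized))
print(sha256_hex(packaged) == sha256_hex(restored))
```

Knowing which files to convert back is the hard part in practice, which is where the .gitattributes file mentioned above comes in.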
Running -- build -f guix.scm
Now that we have the exact same old version of Guix specified by the file channels.scm, we are able to use it to build the file guix.scm.
Issue 2: Option --no-substitutes
The command-line reads,
$ guix time-machine -C channels.scm --no-substitutes -- build -f guix.scm guile: warning: failed to install locale substitute: updating substitutes from 'https://ci.guix.gnu.org'... 0.0%guix substitute: warning: ci.guix.gnu.org: host not found: Name or service not known substitute: substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 0.0%guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known substitute: The following derivations will be built: /gnu/store/zyaczc476w26yqirl2gvg93j1k53mggp-wget-1.20.3.drv /gnu/store/2cpii8d00p3apcwsbdii1afhflk34g8j-perl-uri-1.76.drv /gnu/store/3v2figjf83iyll9wgzxqwgrri78fjqzk-URI-1.76.tar.gz.drv /gnu/store/wbz3g7f5i6y4rbwvy6zl8mnfavv7mq4p-perl-test-needs-0.002005.drv /gnu/store/l0vmj9qbsg79xcmdzzx2zdkn7n0x52h7-Test-Needs-0.002005.tar.gz.drv /gnu/store/4jskgq9bjm7dads1rwy0crfhn9mni890-perl-http-date-6.05.drv /gnu/store/6ksf611qw3v3bgqwm25jz3b5pp5dmmwr-wget-1.20.3.tar.lz.drv /gnu/store/6qws54m0gwlkkzy7a0p8iqpa5vdcmr64-perl-io-socket-ssl-2.066.drv /gnu/store/pyzm2z8vhin0h8xgvjwfkl61njvqmx1n-perl-net-ssleay-1.88.drv /gnu/store/b355vzfv85rpqc6idiyz8m9wa0maymk7-Net-SSLeay-1.88.tar.gz.drv /gnu/store/w1fj4ryrrhj5v16w62gkba1gnil7h6jm-IO-Socket-SSL-2.066.tar.xz.drv /gnu/store/988raa22zg4gz08n36a92q3cfx13176g-IO-Socket-SSL-2.066.tar.gz.drv /gnu/store/akcazb652j2kw67cgyq1g04qjcgd1927-perl-http-daemon-6.01.drv /gnu/store/144ak25554ck634b34va8jsay61z7yjg-HTTP-Daemon-6.01.tar.gz.drv /gnu/store/i8chh5qdhzna6qn097hafyz0vixilljk-perl-lwp-mediatypes-6.04.drv /gnu/store/b6v1cmfifbzg9r4z644b6gimg779nd55-perl-test-fatal-0.014.drv /gnu/store/zas9c8sywjr1nlafj9847y5sylaxlfiz-perl-try-tiny-0.30.drv /gnu/store/vvzg062yk4v83kv49w1zm66ga6jgj0pn-perl-http-message-6.18.drv /gnu/store/w1lcr2cyv45wmhk37q4rp7vb7faq36bz-perl-encode-locale-1.05.drv /gnu/store/xkpxa5f7w1v5cyqj57935cq80nhd701x-perl-io-html-1.00.drv /gnu/store/j4qazlzhica2azkb8z942086c777j8nd-libpsl-0.21.0.drv 
/gnu/store/cxivzld51l9f3zljgwlff77c1d0wz2if-libpsl-0.21.0.tar.gz.drv /gnu/store/qi2abjyqrvppi7xxp78xcn5lscgkp8nc-libchop-0.0.2006-0.feb8f6b.drv /gnu/store/16ag6jqwk9q4kw35alwiskhhz6f09xzd-guile-1.8.8.drv /gnu/store/0al2zprwkynq1mcjhxxzazfl5306f9x1-guile-1.8.8.tar.xz.drv /gnu/store/w9sf7zgw7cqanf5pjj073zp98dc2wlpl-guile-1.8.8.tar.gz.drv /gnu/store/1pxzlf7239d976zksg3z1a7djfxjj7v8-bdb-6.2.32.drv /gnu/store/1py5rdgmapdw7xqv5w5z1dk4l9hw6cwh-db-6.2.32.tar.gz.drv /gnu/store/89q7yf6jsyhhc0m3zfgyaxg09lfhcpy9-libtool-1.5.22.drv /gnu/store/rk31bw6vah9cigjsgfz7zp7az474v0d9-libtool-1.5.22.tar.gz.drv /gnu/store/8j1w842q07m8l8iw089b3x0kjkhvcz32-rpcsvc-proto-1.4.drv /gnu/store/8qvgp5z785mcw61h4afnhp6mah5hy967-automake-1.9.6.drv /gnu/store/77iiskpyhkslldqmq6cffl3lkdp67f0l-automake-1.9.6.tar.gz.drv /gnu/store/axfh3j24c0xhna3pwsp0d87kwr4wi23a-autoconf-2.59.drv /gnu/store/76dfv649gmpmq4a1bfy37bdsz8pzc5nr-autoconf-2.59.tar.gz.drv /gnu/store/f1qc11kclgkjl2jazszbjs2ilmii6ycl-libtirpc-minimal-1.2.5.drv /gnu/store/hb0d7j0jyhjzxy5fkipb3hjc19gcpp37-tdb-1.4.3.drv /gnu/store/j6pr7bxypknbj77k4ha59gm4plrml8vv-gperf-3.0.4.drv /gnu/store/hpf0ks26v7h0i44cczhbl75ms832qndk-gperf-3.0.4.tar.gz.drv /gnu/store/kb9j26232db2bdx5s52dwm3la7ygam25-libchop-0.0.2006-0.feb8f6b.tar.xz.drv /gnu/store/62lm0pq5vr2fya7brjh2lsdppvi8fihq-libchop-0.0.2006-0.feb8f6b-checkout.drv /gnu/store/9c7115yfbdg9bk5ywkn44rph0vq550rj-module-import.drv /gnu/store/6ax9kkgagww8xm11554rfvpy346j5vhh-config.scm.drv /gnu/store/9zyvbxa9i5q3winl1x6fjsdfkil94za7-module-import-compiled.drv /gnu/store/ly9ymvzh2lg39z10xbhd9wbjqmv0sg9m-e2fsprogs-1.45.6.drv /gnu/store/696rz8qbnk62igvna9bdzi25k9z9gkf8-e2fsprogs-1.45.6.tar.xz.drv /gnu/store/jyr2a8v4149zc4frqpa780cmf1s9dh6z-procps-3.3.16.drv /gnu/store/bdawiywqk44kq93gq9dxm44bvk370b03-procps-ng-3.3.16.tar.xz.drv /gnu/store/mjk671mfgrfhbssf4xl0alnscw9p4lnc-g-wrap-guile18-1.9.7.drv /gnu/store/j5fzdhfgifwk6n4y22afq813s5panbi1-g-wrap-1.9.7.tar.gz.drv 
/gnu/store/r7gggq1z75s23kif6pypc2w5vvz44pfp-guile1.8-lib-0.1.3.drv /gnu/store/higni0l3kyg40achyq9zz7dzzn82z1w0-guile1.8-lib-0.1.3-checkout.drv /gnu/store/w26azcw027pd1x252pc3v0kv0p91l843-texinfo-4.13a.drv /gnu/store/l5bp1h43s34h17mrcdzqhqnbp5y8ixjy-texinfo-4.13a.tar.lzma.drv 1,6 MB will be downloaded: /gnu/store/mppxcmw3iwcl3kd5azr48m5858nqb2f6-tdb-1.4.3.tar.gz /gnu/store/bg6jwbml0h7k3da69asqddw4ciy7hkq5-libtirpc-1.2.5.tar.bz2 /gnu/store/hk0lfyvkv0721g5nc347r4g9jh3rczbv-rpcsvc-proto-1.4.tar.xz /gnu/store/i4z59xblbn891z4y202nxvxy7y6hyb6a-IO-HTML-1.00.tar.gz /gnu/store/a0jdvghwf2pjvbn3mm8jgzlkmbr9mr0w-Encode-Locale-1.05.tar.gz /gnu/store/9ii5xjpwbkq047n9p4gyz0scj9r1h6wv-HTTP-Message-6.18.tar.gz /gnu/store/2jqvj76fy8rrsi5vnal381hid4hdy170-LWP-MediaTypes-6.04.tar.gz /gnu/store/0llcr023q9dxdkr685kdc7nlq0ppsm89-Try-Tiny-0.30.tar.gz /gnu/store/56qijsj09i9awkzpd1masb0mq43bcfz5-Test-Fatal-0.014.tar.gz /gnu/store/wm3l2cblmzry265v849g36f23ms829qh-HTTP-Date-6.05.tar.gz substitute: updating substitutes from 'https://ci.guix.gnu.org'... 0.0% substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 0.0% [...] substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 0.0% substitute: updating substitutes from 'https://ci.guix.gnu.org'... 0.0% updating substitutes from 'https://bordeaux.guix.gnu.org'... 0.0% substitute: updating substitutes from 'https://ci.guix.gnu.org'... 0.0% substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 0.0% substitute: updating substitutes from 'https://ci.guix.gnu.org'... 0.0% substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 0.0% substituting /gnu/store/a0jdvghwf2pjvbn3mm8jgzlkmbr9mr0w-Encode-Locale-1.05.tar.gz... substitute: updating substitutes from 'https://ci.guix.gnu.org'... 0.0% substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 0.0% [...] substitute: updating substitutes from 'https://ci.guix.gnu.org'... 
0.0% substitute: updating substitutes from 'https://bordeaux.guix.gnu.org'... 0.0% retrying download of '/gnu/store/a0jdvghwf2pjvbn3mm8jgzlkmbr9mr0w-Encode-Locale-1.05.tar.gz' with other substitute URLs... guix substitute: warning: bordeaux.guix.gnu.org: host not found: Name or service not known guix substitute: error: failed to find alternative substitute for '/gnu/store/a0jdvghwf2pjvbn3mm8jgzlkmbr9mr0w-Encode-Locale-1.05.tar.gz' substitution of /gnu/store/a0jdvghwf2pjvbn3mm8jgzlkmbr9mr0w-Encode-Locale-1.05.tar.gz failed guix build: error: corrupt input while restoring archive from #<closed: file 7f3ea3ae9070>
with the same error as previously. We need to pass the option --no-substitutes to the build action.
Issue 4: Missing source in Software Heritage and/or Disarchive database
These sources are missing: g-wrap-1.9.7.tar.gz, libtirpc-1.2.5.tar.bz2, tdb-1.4.3.tar.gz, guile-cairo-1.10.0.tar.gz, guile-lib-0.2.6.1.tar.gz and guile-charting-0.2.0.tar.gz. For all of them, the content seems to be stored in Software Heritage but, because they are compressed archives, their intrinsic identifier (integrity checksum) is not enough here to find this content again.
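The download logs in this post show content lookups of the form https://archive.softwareheritage.org/api/1/content/sha256:…/raw/. As a minimal sketch, and assuming the key is simply the hexadecimal SHA-256 of the file exactly as packaged, the lookup URL can be built as follows; the helper name `swh_raw_url` and the sample bytes are ours.

```python
import hashlib

# URL prefix as it appears in the download logs of this post.
SWH_CONTENT_API = "https://archive.softwareheritage.org/api/1/content"


def swh_raw_url(file_bytes):
    """Build the raw-content lookup URL keyed by the file's SHA-256."""
    digest = hashlib.sha256(file_bytes).hexdigest()
    return f"{SWH_CONTENT_API}/sha256:{digest}/raw/"


# Hypothetical bytes standing in for a compressed tarball.
url = swh_raw_url(b"bytes of some compressed tarball")
print(url)
```

Such a lookup can only succeed when the archive holds a file with exactly these bytes. If only the unpacked content was archived, the checksum of the compressed file leads nowhere.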
Consider one specific example. The paper requires the Lout batch document formatter at various versions: from 3.20 to 3.29. All these ten versions are compressed tarballs. What is interesting is that, for 8 of them, the compressed tarball is downloaded from Software Heritage without any specific problem. For instance, version 3.21,
Starting download of /gnu/store/85rzg535gscw7sf6q4wrlnq1yq3v0xzk-lout-3.21.tar.gz
From https://archive.softwareheritage.org/api/1/content/sha256:098467c7f747cf5bd1cf966270384d0c3f8c795b843cbb0304728e118909b7ce/raw/...
downloading from https://archive.softwareheritage.org/api/1/content/sha256:098467c7f747cf5bd1cf966270384d0c3f8c795b843cbb0304728e118909b7ce/raw/ ...
raw/ 1.7MiB 10.3MiB/s 00:00 ▕██████████████████▏ 100.0%
successfully built /gnu/store/9cip8x8hql06mw2vaj0vh59k8ka435cr-lout-3.21.tar.gz.drv
However, for versions 3.20 and 3.28, it fails with:
Starting download of /gnu/store/yp7vj9hgzqz92vhw0wi17nl78m722bzw-lout-3.20.tar.gz
From https://archive.softwareheritage.org/api/1/content/sha256:af62b850b8b410d427049f1152fa0217fc7ed77d4cd3ec73e0f30e2aa644926b/raw/...
download failed "https://archive.softwareheritage.org/api/1/content/sha256:af62b850b8b410d427049f1152fa0217fc7ed77d4cd3ec73e0f30e2aa644926b/raw/" 404 "Not Found"
Starting download of /gnu/store/yp7vj9hgzqz92vhw0wi17nl78m722bzw-lout-3.20.tar.gz
From https://web.archive.org/web/20230630183701/http://download.savannah.gnu.org/releases/lout/lout-3.20.tar.gz...
In procedure getaddrinfo: Name or service not known
Trying to use Disarchive to assemble /gnu/store/yp7vj9hgzqz92vhw0wi17nl78m722bzw-lout-3.20.tar.gz...
could not find its Disarchive specification
failed to download "/gnu/store/yp7vj9hgzqz92vhw0wi17nl78m722bzw-lout-3.20.tar.gz" from "mirror://savannah/lout/lout-3.20.tar.gz"
builder for `/gnu/store/y8lg3kdd12if1rnhfqdipcmrg16n98lj-lout-3.20.tar.gz.drv' failed to produce output path `/gnu/store/yp7vj9hgzqz92vhw0wi17nl78m722bzw-lout-3.20.tar.gz'
build of /gnu/store/y8lg3kdd12if1rnhfqdipcmrg16n98lj-lout-3.20.tar.gz.drv failed
In other words, it is not straightforward to guarantee robust coverage with a fallback to a supported archive such as Software Heritage. This is where Disarchive shines and needs a lot of love! Here the Disarchive specification is missing, hence the hole.
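The order of attempts visible in the failing log (content lookup by checksum, then a web archive snapshot, then Disarchive assembly) can be modelled as a simple fallback chain. The Python sketch below is not Guix's implementation: the fetchers are hypothetical stand-ins returning canned answers, only meant to show where a hole appears.

```python
def fetch_with_fallbacks(name, fetchers):
    """Try each (label, fetch) pair in order; return the first success.

    Each fetch callable takes the file name and returns bytes, or None
    when that location has no usable copy.
    """
    attempts = []
    for label, fetch in fetchers:
        attempts.append(label)
        data = fetch(name)
        if data is not None:
            return data, attempts
    return None, attempts


# Hypothetical stand-ins for the real network lookups.
def swh_by_checksum(name):
    archived = {"lout-3.21.tar.gz": b"compressed tarball bytes"}
    return archived.get(name)


def web_archive_snapshot(name):
    return None  # host not resolvable in the restricted setup


def disarchive_assemble(name):
    specifications = {}  # no Disarchive specification recorded
    return specifications.get(name)


chain = [
    ("Software Heritage content lookup", swh_by_checksum),
    ("Web archive snapshot", web_archive_snapshot),
    ("Disarchive assembly", disarchive_assemble),
]

for tarball in ("lout-3.21.tar.gz", "lout-3.20.tar.gz"):
    data, attempts = fetch_with_fallbacks(tarball, chain)
    status = "found" if data is not None else "missing"
    print(tarball, status, "after trying:", ", ".join(attempts))
```

In this model, adding an entry to `specifications` for lout-3.20.tar.gz is what would close the hole, which mirrors the missing Disarchive specification reported in the log.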
Not an issue: access to the data
All the previous Lout versions are considered as data for the paper. The paper also considers other data, such as ten .ogg files. These files are not content-addressed (intrinsic identifier), although the paper provides integrity checksums. In other words, if this URL is gone then the result of the paper cannot be verified. Well, that’s another story.
And we are done!
Yeah, we get a report that looks very similar to the PDF. Awesome!
Join the fun, join Guix in scientific context!
Again, we did a tour of the pieces which need some love; we have focused on the broken corner cases because they are visible. Please note that all the rest is invisible and just runs out of the box with Guix. That’s impressive!