Compiler and correctness
Back from the Open Source Experience 2021 meeting, I got the chance to learn more about Pharo – a pure object-oriented programming language and a powerful environment à la Smalltalk – and about Coq – a formal proof management system. Even, I had the opportunity to discuss with Yves Bertot and I had been so lucky that he explained me the demo one-to-one. To be honnest, I am still rehashing many concepts and this experience is feeding my willing to learn more.
Once at home, I want to try by myself. Ah, because I am running Guix as a computational environment manager, the story starts with a yak shaving: close this old bug#46016 about Proof General. Other said, I want to interact with Coq from my regular text editor Emacs. Therefore, let fix the annoyance with patch#51755. And, since we are at it, although it is not necessary, let also add the Coq package semantics with the patch#51896. Now we are ready, let's go!
Our aim with this post is to sketch some ideas about compilation and correctness using a very basic --not to say trivial_– toy implementation. How to test the implementation? How to know the implementation is correct? And we get it, our aim is to use Coq.
Because we want to keep everything as simple as possible, the compiler is
about the simplest language ever: a calculator computing only additions but
with parenthesis, e.g., (1+2)+(3+(4+0))
. Again, keeping everything as
simple as possible, we are using the simplest computational model: a stack
machine with two instructions (PUSH
and ADD
). Therefore, we need:
- a parser reading the input and returning a tree,
- a compiler transforming the tree to instructions for the stack machine,
- an emulator running the stack machine.
Our didactic goal
In our very restricted toy framework, the pipeline of a computation is defined by this composition –data flowing from right to left–,
pop . emulator . compiler . push
where
push
returns a tree,compiler
transforms the tree to a sequence of instructions,emulator
executes this list of instructions,pop
returns the last elements; we expect one value.
In this post, we are going to implement various emulator
and compiler
.
The aim is to discuss how it is possible to ensure the correctness of the
composition emulator . compiler
.
Implementation using Haskell
We choose Haskell and not OCaml – only because Tuareg (Emacs OCaml mode) is not working for us as expected. Anyway, let start!
Parsing with Haskell
Using the builtin Haskell parsing utilities, the parser
function reads,
parser :: String -> Maybe STree parser input = case parse expr "" input of Left _ -> Nothing Right e -> Just e
where expr
of type Parser STree
. Well, please read all the details in the
file Parser.hs
. Keeping simplicity in mind, the data type for the tree is
just:
data STree = Val Integer | Add STree STree
where the leaf is named Val
and carries the value of an integer; the
constructor Add
is a node and by referring recursively to STree
it defines
the tree . For instance, (1 + 2) + 3
is thus represented by the tree,
Add / \ Add Val 3 / \ Val 1 Val 2
where Val
are leaves and Add
are nodes. So far, so good!
Emulator
The emulator (virtual machine) runs two instructions (PUSH
and ADD
), the
binary, here, is a list of instructions, and the stack is a list of integers.
The Haskell code implementing these words reads,
data Instr = PUSH Integer | ADD type Binary = [Instr] type Stack = [Integer]
Therefore, the emulator takes a binary, returns a stack and it executes (computes) starting with an empty stack,
emulator :: Binary -> Stack emulator b = exec b []
For the interested reader, the details of exec
is provided here.
Compiler
The input of the compiler a tree STree
and it produces a binary (a list of
instructions), i.e., it reads,
compiler :: STree -> Binary compiler source = comp source []
where comp
is an auxiliary function dealing with each case of the tree: if
it is a leaf (Val
), then the instruction is to PUSH
the value in the stack
(bytecode
), otherwise if it is a node (Add
), then compile recursively the
subtrees and set the instruction ADD
to stack. These words are implemented
in Haskell with,
comp :: STree -> Binary -> Binary comp (Val n) bytecode = PUSH n : bytecode comp (Add x y) bytecode = comp x $ comp y $ ADD : bytecode
Nothing fancy!
And…
…to complete a working example, the functions push
and pop
are here the
identity function (do nothing) and take the first element. Well, adding some
other minor code,
cmd
for the command line interface and parsing the input string,go
for gluing the compilation and running parts,emit
for debugging purpose,
and we end with,
$ ./calc-no-coq -x '(1+2)+3' = 6 $ ./calc-no-coq -c '(1+2)+3' Instructions: "PUSH 1, PUSH 2, ADD, PUSH 3, ADD, ;"
Neat!
Implementation using Coq
As you see just above, we named the previous calc-no-coq
because the title
of this post is about «compilation and correctness», thus now we are going to
speak about correctness. The perfect tool is Coq the proof assistant. Before
proving, let use Coq as a programming language and implement these compiler
and emulator in Coq, then extract to Haskell. Other said, we reuse the same
pipeline as before,
pop . emulator . compiler . push
where emulator
and compiler
are going to be implemented in Coq, the pop
and push
will be adapted accordingly, i.e.,
push
converts from the Haskell tree datatypeSTree
to the Coq tree datatypeTree
, i.e, signatureSTree -> Coq.Tree
,pop
converts from the Coq stack datatype to Haskell integer, i.e., signatureCoq.Stack -> Integer
This Coq.Tree
is defined by the module Lang
,
Inductive Tree : Type := Val (n : nat) | Add (x : Tree) (y : Tree).
which looks similar to definition of STree
, and Coq.Stack
is identically
defined as above, i.e., a list of number,
Definition Stack := list nat.
So far, so good!
Emulator
Similarly to the previous implementation, we straightforwardly adpat the data structure for the instructions and the binary (list of instructions),
Inductive Instr:Type := PUSH (n : nat) | ADD. Definition Binary := list Instr.
The emulator executes a bytecode (b : Binary
) and returns a stack,
Fixpoint exec (b : Binary) (s : Stack) : Stack := match b, s with PUSH n :: rest, stack => exec rest (n :: stack) | ADD :: rest, m :: n :: stack => exec rest (m + n :: stack) | [], stack => stack | _, stack => stack end. Definition emulator (b : Binary) := exec b [].
where exec
is the function really computing; this function takes two
arguments: a list of instructions (Binary
) and stack and then returns
another stack.
Compiler
The compiler requires a tree (named above Coq.Tree
) and an auxiliary function,
Fixpoint comp (source : Tree) (bytecode : Binary) : Binary := match source with Lang.Val n => PUSH n :: bytecode | Lang.Add x y => comp x (comp y (ADD :: bytecode)) end. Definition compiler (source : Tree) := comp source [].
where all is strongly similar with the previous implementation.
And…
…to a complete working example, we just need to extract from Coq to Haskell with,
Require Export Lang Compiler Emulator. Require Extraction. Extraction Language Haskell. Extraction "Generated.hs" compiler emulator.
additing some Haskell glue code, as push
and pop
, and command-line and
parser, we get,
$ ./calc -x '(1+2)+3' = 6 $ ./calc -c '(1+2)+3' Instructions: "PUSH 1, PUSH 2, ADD, PUSH 3, ADD, ;" $ ./calc -c '1+(2+3)' Instructions: "PUSH 1, PUSH 2, PUSH 3, ADD, ADD, ;"
Note the order of the instructions depending on the parenthesis order, i.e., depending on the tree, for instance,
(1+2)+3 1+(2+3) Add Add / \ / \ Add Val 3 Val 1 Add / \ / \ Val 1 Val 2 Val 2 Val 3
Neat! However, let avoid to go out of rail and thus let recap!
- we are interested by the
emulator
andcompiler
parts, - for both implementations, the command-line is done by the same Haskell
implementation (
cmd
); idem for parsing (parser
).
Now, we want to have a proof that the implementations of emulator
and
compiler
are correct. For the implementation in Haskell, we cannot do much,
except run QuickCheck (large number of examples picked randomly). For the
implementation in Coq, well Coq is a proof assistant after all, so could we
prove something about the correctness?
Proving the emulator
and compiler
are correct
In place of a large test suite, we can reason about the logic of the functions
emulator
and compiler
. As we wrote in Test = examples or theorem?,
instead of ruling out many examples to convince us about the correctness, we
could try to prove a theorem. Which one?
We would like to prove that compiling the code and running it using the emulator is equal to interpret it using an (abstract) evaluator. It means prove this theorem,
Theorem correctness: forall (code : Tree), emulator (compiler code) = eval code :: [].
where eval
is an abstract interpreter implemented in Coq but never
effectively doing a single real computation. Such interpreter is easily
defined by the recursive function,
Fixpoint eval (source:Tree) := match source with Val n => n | Add x y => (eval x) + (eval y) end.
A formal proof of the theorem correctness
requires other theorems, lemma or
corollary. Note that for Coq, theorem, lemma and corollary are the same but
they help the human reader.
How do we prove this theorem correctness
assisted by Coq?
Unwrap the emulator
The emulator is a layer on the top of the execution model – the function
exec
. Instead of the theorem correctness
, we could think about the
theorem,
Theorem correctness': forall (code : Tree) (stack : Stack), exec (compiler code) stack = eval code :: stack.
and if this theorem correstness'
holds, then the proof of the theorem
correctness
reads,
Corollary correctness'': forall (code : Tree), exec (compiler code) [] = eval code :: []. Proof. intro code. rewrite correctness'. reflexivity. Qed. Theorem correctness: forall (code : Tree), emulator (compiler code) = eval code :: []. Proof. intro code. unfold emulator. rewrite correctness''. reflexivity. Qed.
where the corollary is a straightforward application of the (admitted for now)
theorem. Then, the proof of correctness
just unfolds the function of
interest emulator
and applies (rewrite
) the previous proved corollary.
How do we prove this theorem correctness'
?
Unwrap the compiler
The correctness is now dependent on the theorem correctness'
. Because
compiler
is a wrapper around the function comp
, we could think about the
lemma,
Lemma correctness_comp: forall (code : Tree) (bytecode : Binary) (stack : Stack), exec (comp code bytecode) stack = exec bytecode (eval code :: stack).
The key point of the proof is the use of the additing commutativity proof
Nat.add_comm
, \(\forall n, m : n + m = m + n\). Such could appear
straightforward, but it is not because it depends on how the numbers are
internally represented. For efficiency purpose, a computer somehow represents
only a finite set of natural numbers, commonly named integer. Instead Coq is
using, by default, the mathematical construction of natural numbers defining a
zero, noted O
and a successor, i.e., the data structure,
Inductive nat : Type := O | S n.
where e.g., 3 is thus S (S (S 0))
. That's why, the proof applies
Nat.add_comm
. Aside this consideration, the proof mainly unfolds and folds
the definitions; see below.
Using this correctness_comp
lemma, then the corollary correctness_comp'
just considers the case of an empty binary,
Corollary correctness_comp': forall (code:Tree) (stack:Stack), exec (comp code []) stack = eval code :: stack.
And…
…ending the proof is just unfolding the functions of interest and applying
(rewrite
) the previous proved results. The complete proof of correctness is
therefore (presented linearly as in math textbook),
Lemma correctness_comp: forall (code:Tree) (bytecode:Binary) (stack:Stack), exec (comp code bytecode) stack = exec bytecode (eval code :: stack). Proof. intros code. induction code. - simpl. reflexivity. - unfold comp. fold comp. intros bytecode stack. rewrite IHcode1. rewrite IHcode2. unfold eval. fold eval. unfold exec. fold exec. rewrite Nat.add_comm. reflexivity. Qed. Corollary correctness_comp': forall (code:Tree) (stack:Stack), exec (comp code []) stack = eval code :: stack. Proof. intros code stack. rewrite correctness_comp. unfold exec. reflexivity. Qed. Theorem correctness': forall (code:Tree) (stack:Stack), exec (compiler code) stack = eval code :: stack. Proof. intros code stack. unfold compiler. rewrite correctness_comp'. reflexivity. Qed. Corollary correctness'': forall (code:Tree), exec (compiler code) [] = eval code :: []. Proof. intro code. rewrite correctness'. reflexivity. Qed. Theorem correctness: forall (code:Tree), emulator (compiler code) = eval code :: []. Proof. intro code. unfold emulator. rewrite correctness''. reflexivity. Qed.
Done! Well, almost. We could also prove that the behaviour of the
interpreter eval
is correct, i.e., the properties of the addition for
mathematical intergers are fully satisfied.
Opinionated afterword
A Coq's proof is somehow a compilation. Therefore, similarly as Thompson's Reflections on Trusting Trust, what is the seed we have to trust? How to inspect this seed? What is the size / complexity of this seed?
For sure, realistic compilers usable for critical embedded software using formal verification are around, as CompCert – incredible piece!
Last, whatever the trust we have in the compiler, it runs on the top of a kernel (operating system). What trust do we have in this binary kernel? How this binary kernel is it built? Maybe, all the bootstrap chain could starts from stage0 (Hex0) on some formally verified microkernel as seL4 or on a Linux kernel compiled by CompCert from GNU Hurd microkernel1.
Footnotes:
Assuming Coq would run Hurd. ;-)