This section describes how to bridge the Tezos world (and the Michelson language) with the formal world of Coq. For this, we are going to model a theorem representing a smart contract and its purpose (i.e. the goal of the smart contract).
This chapter is based on the Vote smart contract seen in previous modules.
The Tezos blockchain can run smart contracts using the Michelson language. Michelson is a low-level stack-based Turing-complete language that is formally proven. The proof of Michelson language is compiled in a library called Mi-Cho-Coq (check the official documentation ).
Mi-Cho-Coq is based on the Curry-Howard isomorphism (  ), which ensures the correspondence between a program and a theorem. Mi-Cho-Coq is used in the coq proof assistant to translate a Michelson script into a theorem (i.e., into its logical equivalent form). The official documentation of Coq can be found here , but we recommend easier-to-read documentation such as this introduction to Coq .
The Coq proof assistant is built upon the paradigm of calculus of constructions (as described by Thierry Coquand ). The Gallina language  (the language allowing to interact with the Coq engine) provides a syntax (Terms) for describing formal objects (like a theorem) and also provides a set of instructions (Vernacular syntax) called tactics for writing the proof of the theorem.
The formal verification of a Michelson smart contract is done by providing the proof for this theorem. Coq will perform the verification of a given proof (and its related theorem) based on the Mi-Cho-Coq proof.
The proof consists in a sequence of tactics which the Coq engine will interpret. These instructions manipulate formal expressions (following logical laws (Coq universe) and Mi-Cho-Coq definitions) to formally assert the truth of a given theorem (based on given assumptions).
Before going deeper, let's illustrate the workflow of formal verification of Tezos smart contracts in Fig. 1 below.
FIGURE 1: Proof process
Notice that the Coq language (called Gallina ) provides 2 parts:
- Terms is the specification language: for modeling logical objects (such as theorems, axioms, assumptions, logical rules, specifications)
- Vernacular is the language of commands: for writing a proof of a theorem (tactics are provided to interactively prove a theorem).
This ecosystem combines an assistant of proof (Coq) and the proof of the Michelson language (Mi-Cho-Coq) to formally verify the correctness of a theorem and its proof.
The theorem is based on
- a Michelson script representing what the smart contract does.
- post-conditions representing the rules of the smart contract in a formal form.
Formal verification of a Tezos smart contract consists in verifying formally that the execution of the Michelson script satisfies specific post-conditions.
FIGURE 2: Naive description of the theorem
This schema describes an equivalence between the execution of instructions and post-conditions (A, B, C, D). Post-conditions are rules that must be verified, but these post-conditions do not describe the whole behavior of the smart contract, only specific traits representing the intent of the smart contract.
In the following sections, we will detail how the execution of a Michelson script can be formally written and how to define post-conditions. We will then study the formal proof as a sequence of Coq tactics (cf. the Vernacular part of the Gallina language).
Tezos smart contract can be written in high-level languages (such as LIGO, SmartPy and others) but are ultimately compiled in Michelson.
A smart contract invocation requires the smart contract itself (through its address), the entrypoint that is being called (and its related arguments) and the storage state.
If all these elements are provided, the execution of the smart contract code is triggered, which will result in side-effects on the storage and optionally on the Tezos network.
FIGURE 3: Execution of an entrypoint of a smart contract triggering its code and thus side-effect on storage and Tezos network.
The entrypoint information is used to identify which portion of the code will be executed. The entrypoint arguments and the storage are used as the context of execution (i.e, the execution stack is initialized with arguments and a storage). The execution of the code produces a new storage state and operations. The operations produced by this invocation are some new invocations of other smart contracts.
Now let's see how to formulate formally the execution of the Michelson script.
As we have seen, the the execution of the Michelson script produces a new storage state (we consider there are no operations produced).
So, formally speaking:
The execution of code is done by evaluating a sequence of Michelson instruction for a given initial stack (an
eval function is provided by Coq). The execution of code also requires a context and a quantity of gas to be able to execute each instruction (requirement defined by Mi-Cho-Coq). So the execution of code can be formalized as:
fuelrepresents the quantity of gas.
envrepresents a context of evaluation for the Coq engine.
evalrepresents an evaluator which effectively executes each instruction sequentially on the provided initial stack.
argumentsrepresents the parameter (entrypoint) and its related arguments.
storagerepresents the storage state before the execution.
newstoragerepresents the resulting storage after the execution.
The theorem can be formalized as:
<=> represents an equivalence
Now let's see how to define post-conditions.
Post-conditions are logical assertions that model the intention of the smart contract. In other words, post-conditions are logical expressions defining constraints to verify on storage data.
The work is to identify rules (or constraints) that ensure the correctness of the execution (i.e., ensure that the storage cannot end up in an invalid state).
In fact, post-conditions are usually multiple assertions combined with a logical AND operator (
^ in Coq).
Since post-conditions are a generic concept formalizing the smart contract intention as logical assertions, we will use an example in order to illustrate the modeling ofpost-conditions.
Let's consider a very simple Vote smart contract that handles a voting process. The complete implementation of the theorem and its proof are available at . In this section, we explain the "Vote" reference example. The Vote smart contract allows anyone to vote for a candidate (we consider that candidates are registered and their number of votes is initialized to zero).
When someone invokes the Vote smart contract, one must indicate the candidate. If the candidate is registered then its corresponding number of votes is incremented.
When someone invokes the Vote smart contract it will only modify its storage and thus will have no impact on other smart contract storages. (i.e. the execution of the smart contract will not produce
Here is the code of the smart contract:
Notice that candidates are identified by a
string value (entrypoint argument) and the storage is a
map string int.
Notice that amount of Tez transferred must be lower than five million; otherwise the execution fails.
Notice also that the candidate must be registered; otherwise the execution fails.
This very simple script is equivalent to this pseudo-code:
candidate are given as arguments.
The parameter type and storage type can be defined in Coq as two distinct definitions:
The parameter type (
parameter_ty) can be wrap into a
SelfType definition as follow:
It will be used when defining the smart contract.
The Tezos smart contract is a Michelson script but it cannot be taken as input by the Coq engine as it is.
Mi-Cho-Coq (which is the Coq specification of the Michelson language) provides the correspondence between a Michelson instruction and an equivalent logical proposition.
There is no automated process that translates a Michelson code into a formal definition based on Mi-Cho-Coq definitions. This must be done manually.
The Vote smart contract can be formalized in a formal definition in Coq (Terms part of the Gallina language).
vote definition will be used to formalize the theorem. Notice that it takes the parameter and storage types (
storage_ty) as arguments.
PUSH instructions are annotated:
ADD (s := add_int_int)indicates it is an addition between two integers.
GET (i := get_map string int)indicates it accesses elements into a
map string int.
UPDATE (i := Mk_update string (option int) (map string int) (Update_variant_map string int))indicates it updates (
stringas key and an
option intas value.
As said previously, post-conditions are logical expressions defining constraints to verify on the storage data.
In our example Vote smart contract, the storage is a map containing the number of votes per candidate.
Let's see how we can define logical assertions on the storage data.
First, let's define some rules governing the voting process:
- "When someone votes for a candidate, its number of votes increments by 1".
- "When someone votes for a candidate, the number of votes of other candidates does not change".
- "When someone votes for a candidate, it does not change the list of candidates".
- "If the voting process is successful, then it means that the candidate is registered".
- "Invoking this smart contract does not impact the rest of the Tezos network, only the related storage".
Now, these rules can be translated into formal propositions. These propositions depend on the given parameter, the current storage state and the new storage state (and the produced operations).
FIGURE 4: Post conditions of _Vote_ smart contract.
The rule "Keys of the old storage exists in the new storage" can be written in Coq (Gallina - Terms) with the following:
This expression verifies that all keys of the old storage are defined in the new storage.
The rule "For Bob, number of votes is incremented" can be formulated as: "For each element of the mapping whose key is equal to the given parameter, the new value must be equal to the old value plus one". It can be written in Coq (Gallina - Terms) with the following:
The rule "For others, number of votes do not change" can be formulated as: "For each element of the mapping different from the given parameter, ensure that the old value is equal to the new value". It can be written in Coq (Gallina - Terms) with the following:
The rule "Only the storage is modified" can be expressed by verifying that no operations have been produced. It can be written in Coq (Gallina - Terms) with the following:
As seen previously, the smart contract can be executed only if the amount of Tez transferred is lower than 5000000; otherwise the execution fails. This constraint can be written in Coq (Gallina - Terms) with the following:
To sum up, our post-conditions are a combination of all these logical rules merged into a single object which depends on the given old storage state and parameter, and the resulting new storage state (and the returned operations).
vote_spec represents the post-conditions of the voting process:
Notice that the
vote_spec definition above express logical assertions depending on:
- the initial storage state (
- the returned storage state (
- the parameter (
- the returned operations (
To conclude, the Vote smart contract is defined by the
vote_spec definition and can be used to formalize the theorem.
As said previously, the formal verification of a Tezos smart contract consists in verifying formally that the execution of the Michelson script satisfies specific post-conditions.
Also, as said previously, the theorem can be formalized as:
Here is a schema describing graphically the theorem formalization:
FIGURE 5: Description of the theorem.
Now that we have defined the post-conditions to verify, we can define the theorem in Gallina (Terms) syntax.
Notice that the
vote object represents our smart contract (in a formal representation).
Notice also that the
vote_spec object represents the post-conditions to verify (in a formal representation).
We can represent this equivalence between the execution of the code and the verification of post-conditions by the following diagram.
FIGURE 6: Detailed description of the theorem.
Notice that the
vote_spec definition is used as post condition and requires 4 arguments (
Now that the intention of our smart contract has been modeled into post-conditions and that our smart contract has been translated into a theorem (which combines evaluation of a sequence of Michelson instruction and those logical post-conditions), we need to prove that this theorem is true.
The demonstration or proof of the theorem can be expressed with a sequence of Coq tactics.
Since the theorem is a complex logical proposition, it is suggested to decompose it into simpler propositions easily provable. This decomposition is done by separating into smaller independent propositions or applying reductions (see reductions in Gallina ).
The following proof script relies on:
- tactics (commands of the Vernacular of Gallina)
- induced types (Mi-Cho-Coq)
- proven theorem of Mi-Cho-Coq dealing with Tezos smart contract properties (e.g. gas)
- the Coq universe, which defines sets of numbers and related theorem. For example, natural integers are defined upon the Peano arithmetic.
Here is the proof of the Vote smart contract.
This chapter is not intended to be a Coq tutorial, we will not deep further into this script.
In this section, we provided explanations of the Vote example to illustrate:
- how to translate a Tezos smart contract into a formal definition based on Mi-Cho-Coq definitions
- how to design post-conditions modeling the intention of a smart contract (with the Vote smart contract example)
- how to define a theorem based on the Mi-Cho-coq evaluator and post-conditions.
- a proof (the proof of the Vote theorem).
We also recommend to check other examples provided with the Mi-Cho-Coq repository .
 Mi-cho-coq repository - https://gitlab.com/nomadic-labs/mi-cho-coq
 Introduction to Coq - http://www-sop.inria.fr/members/Yves.Bertot/courses/introcoq.pdf
 Lambda-Calculus and Isomorphism Curry-Howard - http://disi.unitn.it/~bernardi/RSISE11/Papers/curry-howard.pdf
 Isomorphism Curry-Howard for Dummies - https://www.pédrot.fr/slides/inria-junior-02-15.pdf
 Isomorphism Curry-Howard (small) - https://www.seas.harvard.edu/courses/cs152/2015sp/lectures/lec15-curryhoward.pdf
 Higher-order abstract syntax in Coq - https://web.archive.org/web/20060830033826/http://www.site.uottawa.ca/~afelty/dist/tlca95.ps
 Michelson - https://tezos.gitlab.io/michelson-reference/
 Logique formelle - https://www.irif.fr/~roziere/2ord/2ndordre.pdf
 Axioms de Peano - https://fr.wikipedia.org/wiki/Axiomes_de_Peano
 Calculus of constructions - https://fr.wikipedia.org/wiki/Calcul_des_constructions
 Mini-guide Coq - https://www.lri.fr/~paulin/MathInfo/coq-survey.pdf
 Coq’Art - https://www.labri.fr/perso/casteran/CoqArt/coqartF.pdf
 The calculus of constructions (1988) by Thierry Coquand - https://www.sciencedirect.com/science/article/pii/0890540188900053
 Lambda-calcul - https://fr.wikipedia.org/wiki/Lambda-calcul
 Calculus of Inductive Constructions - https://coq.inria.fr/distrib/current/refman/language/cic.html
 Michelson - https://www.michelson-lang.com/why-michelson.html
 Mi-Cho-Coq examples - https://gitlab.com/nomadic-labs/mi-cho-coq/-/blob/master/src/contracts_coq
 Archetype - https://completium.com/docs/verification/specification/