Basics: Functional Programming in Coq
Introduction
Enumerated Types
Days of the Week
Inductive day : Type :=
| monday : day
| tuesday : day
| wednesday : day
| thursday : day
| friday : day
| saturday : day
| sunday : day.
The type is called day, and its members are monday,
tuesday, etc. The second through eighth lines of the definition
can be read "monday is a day, tuesday is a day, etc."
Having defined day, we can write functions that operate on
days.
Definition next_weekday (d:day) : day :=
match d with
| monday ⇒ tuesday
| tuesday ⇒ wednesday
| wednesday ⇒ thursday
| thursday ⇒ friday
| friday ⇒ monday
| saturday ⇒ monday
| sunday ⇒ monday
end.
One thing to note is that the argument and return types of
this function are explicitly declared. Like most functional
programming languages, Coq can often work out these types even if
they are not given explicitly — i.e., it performs some type
inference — but we'll always include them to make reading
easier.
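As a small illustration of type inference, here is a sketch written in plain .v source syntax (=> for the arrow in match branches). The module InferenceSketch, the type light, and the function toggle are hypothetical names introduced only for this example; they are not part of the chapter's development.

```coq
Module InferenceSketch.

(* A hypothetical two-element type, used only for this illustration. *)
Inductive light : Type :=
  | red : light
  | green : light.

(* No annotation on the argument or on the result: Coq infers both
   from the patterns and the branches of the match. *)
Definition toggle l :=
  match l with
  | red => green
  | green => red
  end.

Check toggle.
(* ===> toggle : light -> light *)

End InferenceSketch.
```

The inferred type is the same one we would have written by hand, which is why including the annotations costs nothing except a few keystrokes.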
Having defined a function, we should check that it works on
some examples. There are actually three different ways to do this
in Coq. First, we can use the command Eval compute to evaluate a
compound expression involving next_weekday.
Eval compute in (next_weekday friday).
(* ==> monday : day *)
Eval compute in (next_weekday (next_weekday saturday)).
(* ==> tuesday : day *)
If you have a computer handy, now would be an excellent
moment to fire up the Coq interpreter under your favorite IDE —
either CoqIde or Proof General — and try this for yourself. Load
this file (Basics.v) from the book's accompanying Coq sources,
find the above example, submit it to Coq, and observe the
result.
The keyword compute tells Coq precisely how to
evaluate the expression we give it. For the moment, compute is
the only one we'll need; later on we'll see some alternatives that
are sometimes useful.
Second, we can record what we expect the result to be in
the form of a Coq example:
Example test_next_weekday:
(next_weekday (next_weekday saturday)) = tuesday.
This declaration does two things: it makes an
assertion (that the second weekday after saturday is tuesday),
and it gives the assertion a name that can be used to refer to it
later. Having made the assertion, we can also ask Coq to verify it,
like this:
Proof. simpl. reflexivity. Qed.
The details are not important for now (we'll come back to
them in a bit), but essentially this can be read as "The assertion
we've just made can be proved by observing that both sides of the
equality evaluate to the same thing, after some simplification."
Third, we can ask Coq to "extract," from a Definition, a
program in some other, more conventional, programming
language (OCaml, Scheme, or Haskell) with a high-performance
compiler. This facility is very interesting, since it gives us a
way to construct fully certified programs in mainstream
languages. Indeed, this is one of the main uses for which Coq was
developed. We'll come back to this topic in later chapters.
More information can also be found in the Coq'Art book by Bertot
and Casteran, as well as the Coq reference manual.
Booleans
In a similar way, we can define the standard type bool of booleans, with members true and false.
Inductive bool : Type :=
| true : bool
| false : bool.
Although we are rolling our own booleans here for the sake
of building up everything from scratch, Coq does, of course,
provide a default implementation of the booleans in its standard
library, together with a multitude of useful functions and
lemmas. (Take a look at Coq.Init.Datatypes in the Coq library
documentation if you're interested.) Whenever possible, we'll
name our own definitions and theorems so that they exactly
coincide with the ones in the standard library.
Functions over booleans can be defined in the same way as
above:
Definition negb (b:bool) : bool :=
match b with
| true ⇒ false
| false ⇒ true
end.
Definition andb (b1:bool) (b2:bool) : bool :=
match b1 with
| true ⇒ b2
| false ⇒ false
end.
Definition orb (b1:bool) (b2:bool) : bool :=
match b1 with
| true ⇒ true
| false ⇒ b2
end.
The last two illustrate the syntax for multi-argument
function definitions.
The following four "unit tests" constitute a complete
specification — a truth table — for the orb function:
Example test_orb1: (orb true false) = true.
Proof. reflexivity. Qed.
Example test_orb2: (orb false false) = false.
Proof. reflexivity. Qed.
Example test_orb3: (orb false true) = true.
Proof. reflexivity. Qed.
Example test_orb4: (orb true true) = true.
Proof. reflexivity. Qed.
(Note that we've dropped the simpl in the proofs. It's not
actually needed because reflexivity will automatically perform
simplification.)
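The same approach gives a truth table for andb. The following is a sketch in plain .v source syntax; the names test_andb1 through test_andb4 are chosen here for illustration and do not come from the original file.

```coq
(* Four unit tests that together cover every input combination of andb. *)
Example test_andb1: (andb true true) = true.
Proof. reflexivity. Qed.
Example test_andb2: (andb true false) = false.
Proof. reflexivity. Qed.
Example test_andb3: (andb false true) = false.
Proof. reflexivity. Qed.
Example test_andb4: (andb false false) = false.
Proof. reflexivity. Qed.
```

Because a two-argument boolean function has only four possible inputs, these four assertions pin down its behavior completely.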
A note on notation: We use square brackets to delimit
fragments of Coq code in comments in .v files; this convention,
also used by the coqdoc documentation tool, keeps them visually
separate from the surrounding text. In the html version of the
files, these pieces of text appear in a different font.
The values Admitted and admit can be used to fill
a hole in an incomplete definition or proof. We'll use them in the
following exercises. In general, your job in the exercises is
to replace admit or Admitted with real definitions or proofs.
Exercise: 1 star (nandb)
Complete the definition of the following function, then make sure that the Example assertions below can each be verified by Coq. This function should return true if either or both of its inputs are false.
Definition nandb (b1:bool) (b2:bool) : bool :=
(* FILL IN HERE *) admit.
Remove "Admitted." and fill in each proof with
"Proof. reflexivity. Qed."
Example test_nandb1: (nandb true false) = true.
(* FILL IN HERE *) Admitted.
Example test_nandb2: (nandb false false) = true.
(* FILL IN HERE *) Admitted.
Example test_nandb3: (nandb false true) = true.
(* FILL IN HERE *) Admitted.
Example test_nandb4: (nandb true true) = false.
(* FILL IN HERE *) Admitted.
☐
Exercise: 1 star (andb3)
Do the same for the andb3 function below. This function should return true when all of its inputs are true, and false otherwise.
Definition andb3 (b1:bool) (b2:bool) (b3:bool) : bool :=
(* FILL IN HERE *) admit.
Example test_andb31: (andb3 true true true) = true.
(* FILL IN HERE *) Admitted.
Example test_andb32: (andb3 false true true) = false.
(* FILL IN HERE *) Admitted.
Example test_andb33: (andb3 true false true) = false.
(* FILL IN HERE *) Admitted.
Example test_andb34: (andb3 true true false) = false.
(* FILL IN HERE *) Admitted.
☐
Function Types
The Check command causes Coq to print the type of an expression. For example, the type of negb true is bool.
Check true.
(* ===> true : bool *)
Check (negb true).
(* ===> negb true : bool *)
Functions like negb itself are also data values, just like
true and false. Their types are called function types, and
they are written with arrows.
Check negb.
(* ===> negb : bool -> bool *)
The type of negb, written bool → bool and pronounced
"bool arrow bool," can be read, "Given an input of type
bool, this function produces an output of type bool."
Similarly, the type of andb, written bool → bool → bool, can
be read, "Given two inputs, both of type bool, this function
produces an output of type bool."
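To see the two-argument case concretely, we can ask Coq for the types of andb and orb and of a full application. This is a small sketch; it works equally well with the definitions above or with the standard library's booleans.

```coq
Check andb.
(* ===> andb : bool -> bool -> bool *)
Check orb.
(* ===> orb : bool -> bool -> bool *)

(* Supplying both arguments leaves a plain boolean. *)
Check (andb true false).
(* ===> andb true false : bool *)
```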
Numbers
Module Playground1.
The types we have defined so far are examples of "enumerated
types": their definitions explicitly enumerate a finite set of
elements. A more interesting way of defining a type is to give a
collection of "inductive rules" describing its elements. For
example, we can define the natural numbers as follows:
Inductive nat : Type :=
| O : nat
| S : nat → nat.
The clauses of this definition can be read:
- O is a natural number (note that this is the letter "O," not the numeral "0").
- S is a "constructor" that takes a natural number and yields another one; that is, if n is a natural number, then S n is too.
Let's look at this in a little more detail.
Every inductively defined set (day, nat, bool, etc.) is
actually a set of expressions. The definition of nat says how
expressions in the set nat can be constructed:
- the expression O belongs to the set nat;
- if n is an expression belonging to the set nat, then S n is also an expression belonging to the set nat; and
- expressions formed in these two ways are the only ones belonging to the set nat.
These three conditions are the precise force of the
Inductive declaration. They imply that the expression O, the
expression S O, the expression S (S O), the expression
S (S (S O)), and so on all belong to the set nat, while other
expressions like true, andb true false, and S (S false) do
not.
We can write simple functions that pattern match on natural
numbers just as we did above. For example, here is the predecessor
function:
Definition pred (n : nat) : nat :=
match n with
| O ⇒ O
| S n' ⇒ n'
end.
The second branch can be read: "if n has the form S n'
for some n', then return n'."
End Playground1.
Definition minustwo (n : nat) : nat :=
match n with
| O ⇒ O
| S O ⇒ O
| S (S n') ⇒ n'
end.
Because natural numbers are such a pervasive form of data,
Coq provides a tiny bit of built-in magic for parsing and printing
them: ordinary arabic numerals can be used as an alternative to
the "unary" notation defined by the constructors S and O. Coq
prints numbers in arabic form by default:
Check (S (S (S (S O)))).
Eval compute in (minustwo 4).
The constructor S has the type nat → nat, just like the
functions minustwo and pred:
Check S.
Check pred.
Check minustwo.
These are all things that can be applied to a number to yield a
number. However, there is a fundamental difference: functions
like pred and minustwo come with computation rules — e.g.,
the definition of pred says that pred 2 can be simplified to
1 — while the definition of S has no such behavior attached.
Although it is like a function in the sense that it can be applied
to an argument, it does not do anything at all!
For most function definitions over numbers, pure pattern
matching is not enough: we also need recursion. For example, to
check that a number n is even, we may need to recursively check
whether n-2 is even. To write such functions, we use the
keyword Fixpoint.
Fixpoint evenb (n:nat) : bool :=
match n with
| O ⇒ true
| S O ⇒ false
| S (S n') ⇒ evenb n'
end.
We can define oddb by a similar Fixpoint declaration, but here
is a simpler definition that will be a bit easier to work with:
Definition oddb (n:nat) : bool := negb (evenb n).
Example test_oddb1: (oddb (S O)) = true.
Proof. reflexivity. Qed.
Example test_oddb2: (oddb (S (S (S (S O))))) = false.
Proof. reflexivity. Qed.
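For comparison, here is a sketch of the direct recursive version mentioned above, written in plain .v source syntax. The name oddb_fix and the test names are hypothetical, chosen so as not to clash with oddb.

```coq
(* Mirror image of evenb: the two base cases are swapped. *)
Fixpoint oddb_fix (n:nat) : bool :=
  match n with
  | O => false
  | S O => true
  | S (S n') => oddb_fix n'
  end.

Example test_oddb_fix1: (oddb_fix (S O)) = true.
Proof. reflexivity. Qed.
Example test_oddb_fix2: (oddb_fix (S (S (S (S O))))) = false.
Proof. reflexivity. Qed.
```

The one-line definition via negb and evenb is usually more convenient, since facts proved about evenb can be reused for it.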
Naturally, we can also define multi-argument functions by
recursion. (Once again, we use a module to avoid polluting the
namespace.)
Module Playground2.
Fixpoint plus (n : nat) (m : nat) : nat :=
match n with
| O ⇒ m
| S n' ⇒ S (plus n' m)
end.
Adding three to two now gives us five, as we'd expect.
Eval compute in (plus (S (S (S O))) (S (S O))).
The simplification that Coq performs to reach this conclusion can
be visualized as follows:
(* plus (S (S (S O))) (S (S O))
==> S (plus (S (S O)) (S (S O))) by the second clause of the match
==> S (S (plus (S O) (S (S O)))) by the second clause of the match
==> S (S (S (plus O (S (S O))))) by the second clause of the match
==> S (S (S (S (S O)))) by the first clause of the match
*)
As a notational convenience, if two or more arguments have
the same type, they can be written together. In the following
definition, (n m : nat) means just the same as if we had written
(n : nat) (m : nat).
Fixpoint mult (n m : nat) : nat :=
match n with
| O ⇒ O
| S n' ⇒ plus m (mult n' m)
end.
Example test_mult1: (mult 3 3) = 9.
Proof. reflexivity. Qed.
You can match two expressions at once by putting a comma
between them:
Fixpoint minus (n m:nat) : nat :=
match n, m with
| O , _ ⇒ O
| S _ , O ⇒ n
| S n', S m' ⇒ minus n' m'
end.
The _ in the first line is a wildcard pattern. Writing _ in a
pattern is the same as writing some variable that doesn't get used
on the right-hand side. This avoids the need to invent a bogus
variable name.
End Playground2.
Fixpoint exp (base power : nat) : nat :=
match power with
| O ⇒ S O
| S p ⇒ mult base (exp base p)
end.
Exercise: 1 star (factorial)
Recall the standard factorial function:
    factorial(0) = 1
    factorial(n) = n * factorial(n-1)     (if n>0)
Translate this into Coq.
Fixpoint factorial (n:nat) : nat :=
(* FILL IN HERE *) admit.
Example test_factorial1: (factorial 3) = 6.
(* FILL IN HERE *) Admitted.
Example test_factorial2: (factorial 5) = (mult 10 12).
(* FILL IN HERE *) Admitted.
☐
We can make numerical expressions a little easier to read and
write by introducing "notations" for addition, multiplication, and
subtraction.
Notation "x + y" := (plus x y)
(at level 50, left associativity)
: nat_scope.
Notation "x - y" := (minus x y)
(at level 50, left associativity)
: nat_scope.
Notation "x × y" := (mult x y)
(at level 40, left associativity)
: nat_scope.
Check ((0 + 1) + 1).
(The level, associativity, and nat_scope annotations
control how these notations are treated by Coq's parser. The
details are not important, but interested readers can refer to the
"More on Notation" subsection in the "Optional Material" section at
the end of this chapter.)
Note that these do not change the definitions we've already
made: they are simply instructions to the Coq parser to accept x
+ y in place of plus x y and, conversely, to the Coq
pretty-printer to display plus x y as x + y.
When we say that Coq comes with nothing built-in, we really
mean it: even equality testing for numbers is a user-defined
operation! The beq_nat function tests natural numbers for equality,
yielding a boolean. Note the use of nested matches (we could
also have used a simultaneous match, as we did in minus).
Fixpoint beq_nat (n m : nat) : bool :=
match n with
| O ⇒ match m with
| O ⇒ true
| S m' ⇒ false
end
| S n' ⇒ match m with
| O ⇒ false
| S m' ⇒ beq_nat n' m'
end
end.
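The simultaneous-match alternative mentioned above might look like the following sketch (plain .v source syntax). The name beq_nat_sim and the test names are hypothetical.

```coq
(* Equality test on naturals, matching on both arguments at once. *)
Fixpoint beq_nat_sim (n m : nat) : bool :=
  match n, m with
  | O, O => true
  | S n', S m' => beq_nat_sim n' m'
  | _, _ => false   (* one side is O and the other is a successor *)
  end.

Example test_beq_nat_sim1: (beq_nat_sim 3 3) = true.
Proof. reflexivity. Qed.
Example test_beq_nat_sim2: (beq_nat_sim 3 4) = false.
Proof. reflexivity. Qed.
Example test_beq_nat_sim3: (beq_nat_sim 0 1) = false.
Proof. reflexivity. Qed.
```

The final wildcard branch collapses the two mismatched cases into one, at the cost of making the case split a little less explicit than in the nested version.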
Similarly, the ble_nat function tests natural numbers for
less-or-equal, yielding a boolean.
Fixpoint ble_nat (n m : nat) : bool :=
match n with
| O ⇒ true
| S n' ⇒
match m with
| O ⇒ false
| S m' ⇒ ble_nat n' m'
end
end.
Example test_ble_nat1: (ble_nat 2 2) = true.
Proof. reflexivity. Qed.
Example test_ble_nat2: (ble_nat 2 4) = true.
Proof. reflexivity. Qed.
Example test_ble_nat3: (ble_nat 4 2) = false.
Proof. reflexivity. Qed.
Exercise: 2 stars (blt_nat)
The blt_nat function tests natural numbers for less-than, yielding a boolean. Instead of making up a new Fixpoint for this one, define it in terms of a previously defined function.
Definition blt_nat (n m : nat) : bool :=
(* FILL IN HERE *) admit.
Example test_blt_nat1: (blt_nat 2 2) = false.
(* FILL IN HERE *) Admitted.
Example test_blt_nat2: (blt_nat 2 4) = true.
(* FILL IN HERE *) Admitted.
Example test_blt_nat3: (blt_nat 4 2) = false.
(* FILL IN HERE *) Admitted.
☐
Proof by Simplification
Theorem plus_O_n : ∀n : nat, 0 + n = n.
Proof.
intros n. reflexivity. Qed.
(Note: You may notice that the above statement looks
different in the original source file and the final html output. In Coq
files, we write the ∀ universal quantifier using the
"forall" reserved identifier. This gets printed as an
upside-down "A", the familiar symbol used in logic.)
The form of this theorem and proof are almost exactly the
same as the examples above; there are just a few differences.
First, we've used the keyword Theorem instead of
Example. Indeed, the difference is purely a matter of
style; the keywords Example and Theorem (and a few others,
including Lemma, Fact, and Remark) mean exactly the same
thing to Coq.
Second, we've added the quantifier ∀ n:nat, so that our
theorem talks about all natural numbers n. In order to prove
theorems of this form, we need to be able to reason by
assuming the existence of an arbitrary natural number n. This
is achieved in the proof by intros n, which moves the quantifier
from the goal to a "context" of current assumptions. In effect, we
start the proof by saying "OK, suppose n is some arbitrary number."
The keywords intros, simpl, and reflexivity are examples of
tactics. A tactic is a command that is used between Proof and
Qed to tell Coq how it should check the correctness of some
claim we are making. We will see several more tactics in the rest
of this lecture, and yet more in future lectures.
Step through these proofs in Coq and notice how the goal and
context change.
Theorem plus_1_l : ∀n:nat, 1 + n = S n.
Proof.
intros n. reflexivity. Qed.
Theorem mult_0_l : ∀n:nat, 0 × n = 0.
Proof.
intros n. reflexivity. Qed.
The _l suffix in the names of these theorems is
pronounced "on the left."
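The same pattern works for booleans whenever the argument that the function matches on is a known constructor. The two theorems below are a sketch in plain .v source syntax (forall for the quantifier); the names orb_true_l_sketch and andb_false_l_sketch are hypothetical.

```coq
(* orb matches on its first argument, so orb true b reduces to true
   without our knowing anything about b. *)
Theorem orb_true_l_sketch : forall b : bool, orb true b = true.
Proof.
  intros b. reflexivity. Qed.

(* Likewise, andb false b reduces to false. *)
Theorem andb_false_l_sketch : forall b : bool, andb false b = false.
Proof.
  intros b. reflexivity. Qed.
```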
Proof by Rewriting
Here is a slightly more interesting theorem:
Theorem plus_id_example : ∀n m:nat,
n = m →
n + n = m + m.
Instead of making a completely universal claim about all numbers
n and m, this theorem talks about a more specialized property
that only holds when n = m. The arrow symbol is pronounced
"implies."
As before, we need to be able to reason by assuming the existence
of some numbers n and m. We also need to assume the hypothesis
n = m. The intros tactic will serve to move all three of these
from the goal into assumptions in the current context.
Since n and m are arbitrary numbers, we can't just use
simplification to prove this theorem. Instead, we prove it by
observing that, if we are assuming n = m, then we can replace
n with m in the goal statement and obtain an equality with the
same expression on both sides. The tactic that tells Coq to
perform this replacement is called rewrite.
Proof.
intros n m. (* move both quantifiers into the context *)
intros H. (* move the hypothesis into the context *)
rewrite → H. (* Rewrite the goal using the hypothesis *)
reflexivity. Qed.
The first line of the proof moves the universally quantified
variables n and m into the context. The second moves the
hypothesis n = m into the context and gives it the (arbitrary)
name H. The third tells Coq to rewrite the current goal (n + n
= m + m) by replacing the left side of the equality hypothesis
H with the right side.
(The arrow symbol in the rewrite has nothing to do with
implication: it tells Coq to apply the rewrite from left to right.
To rewrite from right to left, you can use rewrite ←. Try
making this change in the above proof and see what difference it
makes in Coq's behavior.)
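For reference, here is a sketch of the right-to-left variant in plain .v source syntax (forall, ->, and <- in place of the typeset symbols). The theorem name plus_id_example_rl is hypothetical.

```coq
Theorem plus_id_example_rl : forall n m : nat,
  n = m ->
  n + n = m + m.
Proof.
  intros n m.
  intros H.
  rewrite <- H.   (* replace m by n: the goal becomes n + n = n + n *)
  reflexivity. Qed.
```

Either direction closes this particular goal; the difference is only which variable survives in the goal after the rewrite.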
Exercise: 1 star (plus_id_exercise)
Remove "Admitted." and fill in the proof.
Theorem plus_id_exercise : ∀n m o : nat,
n = m → m = o → n + m = m + o.
Proof.
(* FILL IN HERE *) Admitted.
☐
As we've seen in earlier examples, the Admitted command
tells Coq that we want to skip trying to prove this theorem and
just accept it as a given. This can be useful for developing
longer proofs, since we can state subsidiary facts that we believe
will be useful for making some larger argument, use Admitted to
accept them on faith for the moment, and continue thinking about
the larger argument until we are sure it makes sense; then we can
go back and fill in the proofs we skipped. Be careful, though:
every time you say Admitted (or admit) you are leaving a door
open for total nonsense to enter Coq's nice, rigorous, formally
checked world!
We can also use the rewrite tactic with a previously proved
theorem instead of a hypothesis from the context.
Theorem mult_0_plus : ∀n m : nat,
(0 + n) × m = n × m.
Proof.
intros n m.
rewrite → plus_O_n.
reflexivity. Qed.
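Here is one more sketch of rewriting with an earlier theorem, in plain .v source syntax. To keep the example self-contained it first restates a lemma like plus_1_l under the hypothetical name plus_1_l_sketch; mult_1_plus_sketch is likewise a name invented for this illustration.

```coq
Theorem plus_1_l_sketch : forall n : nat, 1 + n = S n.
Proof.
  intros n. reflexivity. Qed.

Theorem mult_1_plus_sketch : forall n m : nat,
  (1 + n) * m = S n * m.
Proof.
  intros n m.
  rewrite -> plus_1_l_sketch.   (* the goal becomes S n * m = S n * m *)
  reflexivity. Qed.
```

When rewriting with a quantified theorem, Coq finds the instantiation of n for us by matching the left-hand side 1 + n against the goal.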
Theorem mult_S_1 : ∀n m : nat,
m = S n →
m × (1 + n) = m × m.
Proof.
(* FILL IN HERE *) Admitted.
☐
Proof by Case Analysis
Theorem plus_1_neq_0_firsttry : ∀n : nat,
beq_nat (n + 1) 0 = false.
Proof.
intros n.
simpl. (* does nothing! *)
Abort.
The reason for this is that the definitions of both
beq_nat and + begin by performing a match on their first
argument. But here, the first argument to + is the unknown
number n and the argument to beq_nat is the compound
expression n + 1; neither can be simplified.
What we need is to be able to consider the possible forms of n
separately. If n is O, then we can calculate the final result
of beq_nat (n + 1) 0 and check that it is, indeed, false.
And if n = S n' for some n', then, although we don't know
exactly what number n + 1 yields, we can calculate that, at
least, it will begin with one S, and this is enough to calculate
that, again, beq_nat (n + 1) 0 will yield false.
The tactic that tells Coq to consider, separately, the cases where
n = O and where n = S n' is called destruct.
Theorem plus_1_neq_0 : ∀n : nat,
beq_nat (n + 1) 0 = false.
Proof.
intros n. destruct n as [| n'].
reflexivity.
reflexivity. Qed.
The destruct generates two subgoals, which we must then
prove, separately, in order to get Coq to accept the theorem as
proved. (No special command is needed for moving from one subgoal
to the other. When the first subgoal has been proved, it just
disappears and we are left with the other "in focus.") In this
proof, each of the subgoals is easily proved by a single use of
reflexivity.
The annotation "as [| n']" is called an intro pattern. It
tells Coq what variable names to introduce in each subgoal. In
general, what goes between the square brackets is a list of
lists of names, separated by |. Here, the first component is
empty, since the O constructor is nullary (it doesn't carry any
data). The second component gives a single name, n', since S
is a unary constructor.
The destruct tactic can be used with any inductively defined
datatype. For example, we use it here to prove that boolean
negation is involutive — i.e., that negation is its own
inverse.
Theorem negb_involutive : ∀b : bool,
negb (negb b) = b.
Proof.
intros b. destruct b.
reflexivity.
reflexivity. Qed.
Note that the destruct here has no as clause because
none of the subcases of the destruct need to bind any variables,
so there is no need to specify any names. (We could also have
written as [|], or as [].) In fact, we can omit the as
clause from any destruct and Coq will fill in variable names
automatically. Although this is convenient, it is arguably bad
style, since Coq often makes confusing choices of names when left
to its own devices.
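When a statement involves two booleans, we can destruct one inside each case of the other. The following is a sketch in plain .v source syntax; the theorem name andb_commutative_sketch is hypothetical, and the indentation is only a visual aid for tracking the four subgoals.

```coq
Theorem andb_commutative_sketch : forall b c : bool,
  andb b c = andb c b.
Proof.
  intros b c. destruct b.
    (* b = true *)
    destruct c.
      reflexivity.
      reflexivity.
    (* b = false *)
    destruct c.
      reflexivity.
      reflexivity.
Qed.
```

Each destruct c applies only to the subgoal currently in focus, so it has to be repeated once for each case of b.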
Exercise: 1 star (zero_nbeq_plus_1)
Theorem zero_nbeq_plus_1 : ∀n : nat,
beq_nat 0 (n + 1) = false.
Proof.
(* FILL IN HERE *) Admitted.
☐
More Exercises
Exercise: 2 stars (boolean functions)
Use the tactics you have learned so far to prove the following theorem about boolean functions.
Theorem identity_fn_applied_twice :
∀(f : bool → bool),
(∀(x : bool), f x = x) →
∀(b : bool), f (f b) = b.
Proof.
(* FILL IN HERE *) Admitted.
Now state and prove a theorem negation_fn_applied_twice similar
to the previous one but where the second hypothesis says that the
function f has the property that f x = negb x.
(* FILL IN HERE *)
Exercise: 2 stars (andb_eq_orb)
Prove the following theorem. (You may want to first prove a subsidiary lemma or two. Alternatively, remember that you do not have to introduce all hypotheses at the same time.)
Theorem andb_eq_orb :
∀(b c : bool),
(andb b c = orb b c) →
b = c.
Proof.
(* FILL IN HERE *) Admitted.
Exercise: 3 stars (binary)
Consider a different, more efficient representation of natural numbers using a binary rather than unary system. That is, instead of saying that each natural number is either zero or the successor of a natural number, we can say that each binary number is either
- zero,
- twice a binary number, or
- one more than twice a binary number.
(Recall that the definition of nat,
Inductive nat : Type :=
| O : nat
| S : nat → nat.
says nothing about what O and S "mean." It just says "O is
in the set called nat, and if n is in the set then so is S
n." The interpretation of O as zero and S as successor/plus
one comes from the way that we use nat values, by writing
functions to do things with them, proving things about them, and
so on. Your definition of bin should be correspondingly simple;
it is the functions you will write next that will give it
mathematical meaning.)
(* FILL IN HERE *)
☐
Optional Material
More on Notation
Notation "x + y" := (plus x y)
(at level 50, left associativity)
: nat_scope.
Notation "x × y" := (mult x y)
(at level 40, left associativity)
: nat_scope.
For each notation-symbol in Coq we can specify its precedence level
and its associativity. The precedence level n can be specified by the
keywords at level n and it is helpful to disambiguate
expressions containing different symbols. The associativity is helpful
to disambiguate expressions containing more occurrences of the same
symbol. For example, the parameters specified above for + and ×
say that the expression 1+2×3×4 is a shorthand for the expression
(1+((2×3)×4)). Coq uses precedence levels from 0 to 100, and
left, right, or no associativity.
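We can check this reading of precedence and associativity directly. The examples below are a sketch in plain .v source syntax, where multiplication is written with an asterisk; the Example names are hypothetical.

```coq
(* Multiplication (level 40) binds tighter than addition (level 50),
   and repeated multiplications group to the left. *)
Example precedence_sketch1 : 1 + 2 * 3 * 4 = 1 + ((2 * 3) * 4).
Proof. reflexivity. Qed.

Example precedence_sketch2 : 1 + 2 * 3 * 4 = 25.
Proof. reflexivity. Qed.

(* Left associativity matters for subtraction: (10 - 3) - 2 is 5,
   whereas 10 - (3 - 2) would be 9. *)
Example assoc_sketch : 10 - 3 - 2 = 5.
Proof. reflexivity. Qed.
```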
Each notation-symbol in Coq is also active in a notation scope.
Coq tries to guess what scope you mean, so when you write S(O×O)
it guesses nat_scope, but when you write the cartesian
product (tuple) type bool×bool it guesses type_scope.
Occasionally you have to help it out with percent-notation by
writing (x×y)%nat, and sometimes in Coq's feedback to you it
will use %nat to indicate what scope a notation is in.
Notation scopes also apply to numeral notation (3,4,5, etc.), so you
may sometimes see 0%nat, which means O, or 0%Z, which means the
integer zero.
Fixpoints and Structural Recursion
Fixpoint plus' (n : nat) (m : nat) : nat :=
match n with
| O ⇒ m
| S n' ⇒ S (plus' n' m)
end.
When Coq checks this definition, it notes that plus' is
"decreasing on 1st argument." What this means is that we are
performing a structural recursion over the argument n — i.e.,
that we make recursive calls only on strictly smaller values of
n. This implies that all calls to plus' will eventually
terminate. Coq demands that some argument of every Fixpoint
definition is "decreasing".
This requirement is a fundamental feature of Coq's design: In
particular, it guarantees that every function that can be defined
in Coq will terminate on all inputs. However, because Coq's
"decreasing analysis" is not very sophisticated, it is sometimes
necessary to write functions in slightly unnatural ways.
Exercise: 2 stars, optional (decreasing)
To get a concrete sense of this, find a way to write a sensible Fixpoint definition (of a simple function on numbers, say) that does terminate on all inputs, but that Coq will not accept because of this restriction.
(* FILL IN HERE *)
☐
(* $Date: 2013-09-26 14:40:26 -0400 (Thu, 26 Sep 2013) $ *)