This paper appears in the proceedings of the 18th International Conference on Theorem Proving in Higher Order Logics (TPHOLs 2005).
A constructive proof of the Gödel-Rosser incompleteness theorem has been completed using the Coq proof assistant. Some theory of classical first-order logic over an arbitrary language is formalized. A development of primitive recursive functions is given, and all primitive recursive functions are proved to be representable in a weak axiom system. Formulas and proofs are encoded as natural numbers, and functions operating on these codes are proved to be primitive recursive. The weak axiom system is proved to be essentially incomplete. In particular, Peano arithmetic is proved to be consistent in Coq’s type theory and therefore is incomplete.
This work is hereby released into the Public Domain. To view a copy of the public domain dedication, visit 〈http://creativecommons.org/licenses/publicdomain/〉 or send a letter to Creative Commons, 559 Nathan Abbott Way, Stanford, California 94305, USA.
The Gödel-Rosser incompleteness theorem for arithmetic states that any complete first-order theory of a nice axiom system, using only the symbols +, ×, 0, S, and < is inconsistent. A nice axiom system must contain the nine specific axioms of a system called NN. These nine axioms serve to define the previous symbols. A nice axiom system must also be expressible in itself. This last restriction prevents the incompleteness theorem from applying to axioms systems such as the true first order statements about ℕ.
A computer verified proof of Gödel’s incompleteness theorem is not new. In 1986 Shankar created a proof of the incompleteness of Z2, hereditarily finite set theory, in the Boyer-Moore theorem prover. My work is the first computer verified proof of the essential incompleteness of arithmetic. Harrison recently completed a proof in HOL Light of the essential incompleteness of Σ1-complete theories, but has not shown that any particular theory is Σ1-complete. His work will be included in the next release of HOL Light.
My proof was developed and checked in Coq 7.3.1 using Proof General under XEmacs. It is part of the user contributions to Coq and can be checked in Coq 8.0. Examples of source code in this document use the new Coq 8.0 notation.
Coq is an implementation of the calculus of (co)inductive constructions.
This dependent type theory has intensional equality and is constructive, so
my proof is constructive.
Actually the proof depends on the Ensembles
library which
declares an axiom of extensionality for Ensembles
, but this axiom
is never used.
This document points out some of the more interesting problems I encountered when formalizing the incompleteness theorem. My proof mostly follows the presentation of incompleteness given in An Introduction to Mathematical Logic. I referred to the supplementary text for the book Logic for Mathematics and Computer Science to construct Gödel’s β-function. I also use part of Caprotti and Oostdijk’s contribution of Pocklington’s criterion to prove the Chinese remainder theorem.
This document is organized as follows. First I discuss the difficulties I had when formalizing classical first-order logic over an arbitrary language. This is followed by the definition of a language LNN and an axiom system called NN. Next I give the statement of the essential incompleteness of NN. Then I briefly discuss coding formulas and proofs as natural numbers. Next I discuss primitive recursive functions and the problems I encountered when trying to prove that substitution can be computed by a primitive recursive function. Finally I briefly discuss the fixed-point theorem, Rosser’s incompleteness theorem, and the incompleteness of PA. At the end I give some remarks about how to extend my work in order to formalize Gödel’s second incompleteness theorem.
For those not familiar with Coq syntax, here is a short list of notation
->
, /\
, \/
, and ~
are the logical
connectives ⇒, ∧, ∨, and ¬.A -> B
, A * B
, and A + B
form function types,
Cartesian product types, and disjoint union types.*
, +
, and S
are the arithmetic operations of multiplication, addition, and successor.inl
and inr
are the left and right
injection functions of types A -> A + B
and B -> A + B
.::
, and ++
are the list operations cons, and append._
is an omitted parameter that Coq can infer itself.For more details see the Coq 8.0 reference manual.
I began by developing the theory of first order classical logic inside Coq. In essence Coq’s logic is a formal metalogic to reason about this internal logic.
Language
I immediately took advantage of Coq’s dependent type system
by defining Language
to be a dependent record of
types for symbols
and an arity function from symbols to ℕ.
The Coq code is:
Record Language : Type := language {Relations : Set; Functions : Set; arity : Relations + Functions -> nat}.
In retrospect it would have been slightly more convenient to use two arity functions instead of using the disjoint union type.
This approach differs from Harrison’s definition of first order terms and formulas in HOL Light because HOL Light does not have dependent types. Dependent types allow the type system to enforce that all terms and formulas of a given language are well formed.
Term
For any given language, a
Term
is either a variable indexed by a natural number or a
function symbol plus a list of n terms where n is the arity of
the function symbol.
My first attempt at writing this in Coq failed.
Variable L : Language. (* Invalid definition *) Inductive Term0 : Set := | var0 : nat -> Term0 | apply0 : forall (f : Functions L) (l : List Term0), (arity L (inr _ f))=(length l) -> Term0.
The type (arity L (inr _ f))=(length l)
fails to meet Coq’s positivity
requirement for inductive types. Expanding the definition of
length
reveals a hidden occurrence of Term0
which is passed
as an implicit argument to length
.
It is this occurrence that violates the
positivity
requirement.
My second attempt met the
positivity
requirement,
but it had other
difficulties. A common way to create a
polymorphic lists of length n
is:
Inductive Vector (A : Set) : nat -> Set := | Vnil : Vector A 0 | Vcons : forall (a : A) (n : nat), Vector A n -> Vector A (S n).
Using this I could have defined Term
like:
Variable L : Language. Inductive Term1 : Set := | var1 : nat -> Term1 | apply1 : forall f : Functions L, (Vector Term1 (arity L (inr _ f))) -> Term1.
My difficulty with this definition was that the induction principle generated by Coq is too weak to work with.
Instead I created two mutually
inductive types: Term
and Terms
.
Variable L : Language. Inductive Term : Set := | var : nat -> Term | apply : forall f : Functions L, Terms (arity L (inr _ f)) -> Term with Terms : nat -> Set := | Tnil : Terms 0 | Tcons : forall n : nat, Term -> Terms n -> Terms (S n).
Again the automatically
generated induction principle
is too weak, so I used the Scheme
command
to generate suitable mutual-inductive principles.
The disadvantage of this approach is that useful lemmas about
Vector
s must be reproved for Term
s. Some of these lemmas
are quite tricky to prove because of the dependent type. For example,
proving forall x : Terms 0, Tnil = x
is not easy.
Recently,
Marche has shown me that the Term1
definition
would be adequate.
One can explicitly make a sufficient induction principle by using
nested Fixpoint
functions.
Formula
The definition of Formula
was straightforward.
Inductive Formula : Set := | equal : Term -> Term -> Formula | atomic : forall r : Relations L, Terms (arity L (inl _ r)) -> Formula | impH : Formula -> Formula -> Formula | notH : Formula -> Formula | forallH : nat -> Formula -> Formula.
I defined the other logical connectives in terms
of impH
, notH
, and forallH
.
The H
at the end of the logic connectives (such as impH
) stands for
“Hilbert” and is used to distinguish them from Coq’s
connectives.
For example, the formula ¬∀x0. ∀x1. x0 = x1 would be represented by:
notH (forallH 0 (forallH 1 (equal (var 0) (var 1))))
It would be nice to use higher order abstract syntax
to handle bound variables by giving forallH
the type
(Term -> Formula) -> Formula
.
I would represent the above example as:
notH (forallH (fun x : Term => (forallH (fun y : Term => (equal x y)))))
This technique would require addition work to disallow “exotic
terms” that are created by passing a function into forallH
that
does a case analysis
on the term and returning entirely different formulas in different cases.
Despeyroux et al.
address this problem by creating
a complicated predicate that only valid formulas satisfy.
Another choice would have been to use de Bruijn indexes to eliminate named variables. However dealing with free and bound variables with de Bruijn indexes can be difficult.
Using named variables allowed me to closely follow Hodel’s work. Also, in order to help persuade people that the statement of the incompleteness theorem is correct, it is helpful to make the underlying definitions as familiar as possible.
Renaming bound variables turned out to be a constant source of work during development because variable names and terms were almost always abstract. In principle the variable names could conflict, so it was constantly necessary to consider this case and deal with it by renaming a bound variable to a fresh one. Perhaps it would have been better to use de Bruijn indexes and a deduction system that only deduced closed formulas.
substituteFormula
I defined
the function
substituteFormula
to substitute a term for all occurrences
of a free variable inside a given formula. While the definition of
substituteTerm
is simple structural recursion,
substitution for formulas is complicated by quantifiers.
Suppose we want to substitute the term
s for xi in the formula ∀xj. φ
and i ≠ j.
Suppose xj is a free variable of s.
If we naïvely perform the substitution then the occurrences of
xj in s get captured by the quantifier.
One common solution
to this problem is to disallow substitution for a term s when
s is not substitutable for xi in φ.
The solution I take is to rename the bound variable in this case.
Unfortunately this definition is not structurally recursive. The second substitution operates on the result of the first substitution, which is not structurally smaller than the original formula. Coq will not accept this recursive definition as is; it is necessary to prove the recursion will terminate. I proved that substitution preserves the depth of a formula, and that each recursive call operates on a formula of smaller depth.
One of McBride’s mantras says,
If my recursion is
not structural, I am using the wrong structure
(241).
In this case, my recursion is not structural because I am using the wrong
recursion.
Stoughton shows that
it is easier to define substitution that
substitutes all variables simultaneously because the recursion
is structural.
If I had made
this definition first, I could have defined
substitution of one variable in terms of it and
many of my difficulties would have disappeared.
Prf
I defined the inductive type (Prf Gamma phi)
to be
the type of proofs of phi
, from the list of assumptions Gamma
.
Inductive Prf : Formulas -> Formula -> Set := | AXM : forall A : Formula, Prf (A :: nil) A | MP : forall (Axm1 Axm2 : Formulas) (A B : Formula), Prf Axm1 (impH A B) -> Prf Axm2 A -> Prf (Axm1 ++ Axm2) B | GEN : forall (Axm : Formulas) (A : Formula) (v : nat), ~ In v (freeVarListFormula L Axm) -> Prf Axm A -> Prf Axm (forallH v A) | IMP1 : forall A B : Formula, Prf nil (impH A (impH B A)) | IMP2 : forall A B C : Formula, Prf nil (impH (impH A (impH B C)) (impH (impH A B) (impH A C))) | CP : forall A B : Formula, Prf nil (impH (impH (notH A) (notH B)) (impH B A)) | FA1 : forall (A : Formula) (v : nat) (t : Term), Prf nil (impH (forallH v A) (substituteFormula L A v t)) | FA2 : forall (A : Formula) (v : nat), ~ In v (freeVarFormula L A) -> Prf nil (impH A (forallH v A)) | FA3 : forall (A B : Formula) (v : nat), Prf nil (impH (forallH v (impH A B)) (impH (forallH v A) (forallH v B))) | EQ1 : Prf nil (equal (var 0) (var 0)) | EQ2 : Prf nil (impH (equal (var 0) (var 1)) (equal (var 1) (var 0))) | EQ3 : Prf nil (impH (equal (var 0) (var 1)) (impH (equal (var 1) (var 2)) (equal (var 0) (var 2)))) | EQ4 : forall R : Relations L, Prf nil (AxmEq4 R) | EQ5 : forall f : Functions L, Prf nil (AxmEq5 f).
AxmEq4
and AxmEq5
are recursive functions that
generate the equality axioms for relations and
functions. AxmEq4 R
generates
and AxmEq5 f
generates
I found that replacing ellipses from informal proofs with recursive functions was one of the most difficult tasks. The informal proof does not contain information on what inductive hypothesis should be used when reasoning about these recursive definitions. Figuring out the correct inductive hypotheses was not easy.
SysPrf
There are some problems with the definition of Prf
given. It
requires the list of axioms to be in the correct order for the proof.
For example, if we have Prf Gamma1 (impH phi psi)
and
Prf Gamma2 phi
then we can conclude only
Prf Gamma1++Gamma2 psi
. We cannot conclude
Prf Gamma2++Gamma1 psi
or any other permutation of psi
.
If an axiom is used more than once, it must appear in the list more
than once. If an axiom is never used, it must not appear. Also, the
number of axioms must be finite because they form a list.
To solve this problem, I defined a System
to be
Ensemble Formula
, and
(SysPrf T phi)
to be the proposition that the system
T
proves phi
.
Definition System := Ensemble Formula. Definition mem := Ensembles.In. Definition SysPrf (T : System) (f : Formula) := exists Axm : Formulas, (exists prf : Prf Axm f, (forall g : Formula, In g Axm -> mem _ T g)).
Ensemble A
represents subsets of
A
by the functions A -> Prop
. a : A
is considered to be
a member of T : Ensemble A
if and only if the type T a
is
inhabited.
I also defined mem
to be Ensembles.In
so that it does not conflict with List.In
.
The deduction theorem states that if Γ∪{φ} ⊢ ψ then Γ ⊢ φ ⇒ ψ.
There is a choice of whether the side condition
for the ∀-generalization rule,
~ In v (freeVarListFormula L Axm)
,
should be required or not.
If this side condition is removed then the deduction theorem
requires a side condition on it.
Usually
all the formulas in an axiom system are closed, so the side condition on
the ∀-generalization is easy to show.
So I decided to keep the side condition on the ∀-generalization
rule.
At one point the proof of the deduction theorem requires proving that
if
Γ ∪ {φ} ⊢ ψ because
ψ ∈ Γ ∪ {φ}, then
Γ ⊢ φ ⇒ ψ. There are two cases to
consider. If ψ = φ then the result easily follows from the
reflexivity of ⇒. Otherwise ψ ∈ Γ, and
therefore
Γ ⊢ ψ. The result then follows.
In order to constructively make this choice it is necessary to decide whether
ψ = φ or not. This requires Formula
to be a decidable
type, and that requires the language L
to be decidable. Since
L
could be anything, I needed to add hypotheses that the function
and relation symbols are decidable types.
forall x y : Functions L, {x=y} + {x<>y}
forall x y : Relations L, {x=y} + {x<>y}
.I used the deduction theorem without restriction and ended up using the hypotheses in many lemmas. I expect that many of these lemmas could be proved without assuming the decidability of the language. It is hard to imagine a useful language that is not decidable, so I do not feel too bad about using these hypotheses in unnecessary places.
I created two languages.
The first language, LNT
, is the
language of number theory and just has
the function symbols Plus
, Times
, Succ
, and
Zero
with appropriate arities.
The second language, LNN
, is the language of NN and has
the same function symbols as LNT
plus
one relation symbol for less than, LT
.
I define two axiom systems: NN
and PA
.
NN
and PA
share six axioms.
NN
has three additional axioms about less than.
PA
has an infinite number of induction axioms that follow one
schema.
The xi1, …, xin are the free variables of ∀xj. φ. The quantifiers ensure that all the axioms of PA are closed.
Because NN is in a different language than PA, a proof in NN is not
a proof in PA. In order to reuse the work done in NN, I created a
function called LNN2LNT_formula
to convert formulas in LNN
into formulas
in LNT
by replacing occurrences of t0 < t1 with
(∃x2. x0+(Sx2)=x1)[x0/t0,
x1/t1]—φ[x0/t0, x1/t1] is the simultaneous
substitution of t0 for x0 and t1 for x1. Then I
proved that if NN ⊢ φ then
PA ⊢ LNN2LNT_formula
(φ).
I also created the function natToTerm : nat -> Term
to return the
closed term representing a given natural number. In this document I will
refer to this function as ⌜·⌝, so ⌜0⌝ = 0,
⌜1⌝ = S0, etc.
To prove the incompleteness theorem, it is necessary for the inner logic to reason about proofs and formulas, but the inner logic can only reason about natural numbers. It is therefore necessary to code proofs and formulas as natural numbers.
Gödel’s original approach was to code a formula as a list
of numbers and then code that list using properties from the prime decomposition
theorem. I avoided needing theorems about prime decomposition by
using the Cantor pairing function instead.
The Cantor pairing function,
cPair
, is a
commonly used bijection between ℕ × ℕ and ℕ.
cPair
(a, b)
≝
a + ∑i = 1, …, a + b i All my inductive structures were easy to recursively encode.
I gave each constructor a unique number and paired that number with the
encoding of all its parameters. For example,
I defined codeFormula
as:
Fixpoint codeFormula (f : Formula) : nat := match f with | fol.equal t1 t2 => cPair 0 (cPair (codeTerm t1) (codeTerm t2)) | fol.impH f1 f2 => cPair 1 (cPair (codeFormula f1) (codeFormula f2)) | fol.notH f1 => cPair 2 (codeFormula f1) | fol.forallH n f1 => cPair 3 (cPair n (codeFormula f1)) | fol.atomic R ts => cPair (4+(codeR R)) (codeTerms _ ts) end.
where codeR
is a coding of the relation symbols for the
language.
I will use ⌜φ⌝ for
⌜codeFormula
φ⌝
and ⌜t⌝ for ⌜codeTerm
t⌝.
The incompleteness theorem states the essential incompleteness of NN, meaning that for every axiom system T such that
then there exists a sentence φ such that if T ⊢ φ or T ⊢ ¬φ then T is inconsistent.
The theorem is only about proofs in LNN
, the
language of NN. This statement does not show the incompleteness of
theories that extend the language.
In Coq the theorem is stated as as:
Theorem Incompleteness : forall T : System, Included Formula NN T -> RepresentsInSelf T -> DecidableSet Formula T -> exists f : Formula, Sentence f /\ (SysPrf T f \/ SysPrf T (notH f) -> Inconsistent LNN T).
A System
is Inconsistent
if it proves
all formulas.
Definition Inconsistent (T : System) := forall f : Formula, SysPrf T f.
A Sentence
is a Formula
without any free variables.
Definition Sentence (f : Formula) := forall v : nat, ~ In v (freeVarFormula LNN f).
A DecidableSet
is an Ensemble
such that every item either
belongs to the Ensemble
or does not belong to the Ensemble
.
This hypothesis is trivially true in classical logic, but in constructive
logic I
needed it to prove strong constructive existential quantifier in the
statement of incompleteness.
Definition DecidableSet (A : Type)(s : Ensemble A) := forall x : A, mem A s x \/ ~ mem A s x.
The RepresentsInSelf
hypothesis restricts what the
System
T can be. The statement of essential incompleteness normally requires T
be a recursive set. Instead I use the weaker hypothesis that the set T
is expressible in the system T.
Given a system T extending NN and another system U along with a formula φU with at most one free variable xi, we say φU expresses the axiom system U in T if the following hold for all formulas ψ.
U is expressible in T if there exists a formula φU such that φU expresses the axiom system U in T.
In Coq I write the statement T is expressible in T as
Definition RepresentsInSelf (T : System) := exists rep : Formula, exists v : nat, (forall x : nat, In x (freeVarFormula LNN rep) -> x = v) /\ (forall f : Formula, mem Formula T f -> SysPrf T (substituteFormula LNN rep v (natToTerm (codeFormula f)))) /\ (forall f : Formula, ~ mem Formula T f -> SysPrf T (notH (substituteFormula LNN rep v (natToTerm (codeFormula f))))).
This is weaker than requiring that T be a recursive set because any recursive set of axioms T is expressible in NN. Since T is an extension of NN, any recursive set of axioms T is expressible in T.
By using this weaker hypothesis I avoid defining what a recursive set is. Also, in this form the theorem could be used to prove that any complete and consistent theory of arithmetic cannot define its own axioms. In particular, this could be used to prove Tarski’s theorem that the truth predicate is not definable.
A common approach to proving the incompleteness theorem is to prove that every primitive recursive function is representable. Informally an n-ary function f is representable in NN if there exists a formula φ such that
I defined the type PrimRec n
as:
Inductive PrimRec : nat -> Set := | succFunc : PrimRec 1 | zeroFunc : PrimRec 0 | projFunc : forall n m : nat, m < n -> PrimRec n | composeFunc : forall (n m : nat) (g : PrimRecs n m) (h : PrimRec m), PrimRec n | primRecFunc : forall (n : nat) (g : PrimRec n) (h : PrimRec (S (S n))), PrimRec (S n) with PrimRecs : nat -> nat -> Set := | PRnil : forall n : nat, PrimRecs n 0 | PRcons : forall n m : nat, PrimRec n -> PrimRecs n m -> PrimRecs n (S m).
PrimRec n
is the expression of an n-ary primitive recursive
function, but it is not itself a function. I defined
evalPrimRec
:
forall n : nat, PrimRec n -> naryFunc n
to convert the expression into a
function.
Rather than working directly with primitive recursive expressions, I
worked with particular Coq functions and proved they were extensionally
equivalent to the evaluation of primitive recursive expressions.
I proved that every primitive recursive function is representable in NN. This required using Gödel’s β-function along with the Chinese remainder theorem. The β-function is a function that codes array indexing. A finite list of numbers a0, …, an is coded as a pair of numbers (x, y) and β(x, y, i) = ai. The β-function is special because it is defined in terms of plus and times and is non-recursive. The Chinese remainder theorem is used to prove that the β-function works.
I took care to make the formulas representing the primitive recursive functions clearly Σ1 by ensuring that only the unbounded quantifiers are existential; however, I did not prove that the formulas are Σ1 because it is not needed for the first incompleteness theorem. Such a proof could be used for the second incompleteness theorem (Shoenfield §8.2).
codeSubFormula
is Primitive RecursiveI proved
that substitution is
primitive recursive. Since substitution is defined in terms of
Formula
and Term
, it itself cannot be primitive
recursive. Instead I proved that the corresponding function operating on codes
is primitive recursive. This function is called codeSubFormula
and I proved it is correct in the following sense.
codeSubFormula
(⌜φ⌝, i, ⌜s⌝)
=
⌜φ[xi/s]⌝Next I proved that it is primitive recursive. This proof is very difficult. The problem is again with the need to rebind bound variables. Normally one would attempt to create this primitive recursive function by using course-of-values recursion. Course-of-values recursion requires all recursive calls have a smaller code than the original call. Renaming a bound variable requires two recursive calls. Recall the definition of substitution in this case:
If one is lucky one might be able to make the inner recursive call. But there is no reason to suspect the input to the second recursive call, φ[xj/xk], is going to have a smaller code than the original input, ∀xj. φ.
If I had used the alternative definition of substitution, where all variables are substituted simultaneously, there would still be problems. The input would include a list of variable and term pairs. In this case a new pair would be added to the list when making the recursive call, so the input to the recursive call could still have a larger code than the input to the original call.
It seems that using course-of-values recursion is difficult or impossible. Instead I introduce the notion of the trace of the computation of substitution. Think of the trace of computation as a finite tree where the nodes contain the input and output of each recursive call. The subtrees of a node are the traces of the computation of the recursive calls. This tree can be coded as a number. I proved that there is a primitive recursive function that can check to see if a number represents a trace of the computation of substitution.
The key to solving this problem is to create a primitive recursive function that computes a bound on how large the code of the trace of computation can be for a given input. With this I created another primitive recursive function that searches for the trace of computation up to this bound. Once the trace is found—I proved that it must be found—the function extracts the result from the trace and returns it.
checkPrf
is Primitive RecursiveGiven a code for a formula and a code for a proof,
the function checkPrf
returns 0 if the proof does not prove the formula, otherwise it returns
one plus the code of the list of axioms used in the proof.
I proved this function is primitive recursive, as well as proving that it is
correct in the sense that for every proof p of φ from a
list of axioms Γ,
checkPrf
(⌜φ⌝, ⌜p⌝) = 1 + ⌜Γ⌝;
and for all n, m : ℕ
if
checkPrf
(n, m) ≠ 0
then there exists φ, Γ, and some proof p of φ from
Γ such that ⌜φ⌝ = n and ⌜p⌝ = m.
For any axiom system U expressible in T, I created the
formulas codeSysPrf
and codeSysPf
.
codeSysPrf
[x0/⌜n⌝, x1/⌜m⌝]
is provable in T
if m is the code of a proof in U of a formula coded by
n.
codeSysPf
[x0/⌜n⌝] is provable in T
if there exists a proof in U of a formula coded by n.
codeSysPrf
and codeSysPf
are
not derived from a primitive recursive functions
because I wanted to prove the incompleteness of axiom systems that may not
have a primitive recursive characteristic function.
The fixed point theorem states that for every formula φ there is some formula ψ such that
and that the free variables of ψ are that of φ less xi.
The fixed point theorem allows one to create “self-referential sentences”. I used this to create Rosser’s sentence which states that for every code of a proof of itself, there is a smaller code of a proof of its negation. The proof of Rosser’s incompleteness theorem requires doing a bounded search for a proof, and this requires knowing what is and what is not a proof in the system. For this reason, I require the decidability of the axiom system. Without a decision procedure for the axiom system, I cannot constructively do the search.
To demonstrate the incompleteness theorem I used it to prove the incompleteness of PA. I created a primitive recursive predicate for the codes of the axioms of PA. Coq is sufficiently powerful to prove the consistency of PA by proving that the natural numbers model PA.
One subtle point is that Coq’s logic is constructive while the internal logic is classical. One cannot interpret a formula of the internal logic directly in Coq and expect it to be provable if it is provable in the internal logic. Instead I use a double negation translation of the formulas. The translated formula will always hold if it holds in the internal logic.
The consistency of PA along with the expressibility of its axioms and the translations of proofs from NN to PA allowed me to apply Rosser’s incompleteness theorem and prove the incompleteness of PA—there exists a sentence φ such that neither PA ⊢ φ nor PA ⊢ ¬φ.
Theorem PAIncomplete : exists f : Formula, (forall v : nat, ~ In v (freeVarFormula LNT f)) /\ ~ (SysPrf PA f \/ SysPrf PA (notH f)).
Because my proof is constructive, it is possible, in principle, to
compute this sentence that makes PA incomplete. This was not done for
two reasons. The first reason is that the existential statement lives in
Coq’s Prop
universe, and Coq’s only extracts from
its Set
universe. This was an error on my part. I should have
used Coq’s Set
existential quantifier; this problem
would be fairly easy to fix. The
second reason is that the sentence contains a closed term of
the code of most of itself. I believe this code is a very large
number and it is written in unary notation. This would likely make the
sentence far to large to be actually printed.
The proof of essential incompleteness is usually carried out for Robinson’s system Q. Instead I followed Hodel’s development and used NN. Q is PA with the induction schema replaced with ∀x0. ∃x1. (x0=0 ∨ x0=Sx1). All of NN axioms are Π1 whereas Q has the above Π2 axiom. Both axiom systems are finite.
Neither system is strictly weaker than the other, so it would not be possible to use the essential incompleteness of one to get the essential incompleteness of the other; however both NN and Q are sufficiently powerful to prove a small number of needed lemmas, and afterward only these lemmas are used. If one abstracts my proof at these lemmas, it would then be easy to prove the essential incompleteness of both Q and NN.
It is worth noting the differences between this formalization of the
incompleteness theorem and Shankar’s 1986 proof in the Boyer-Moore theorem
prover.
The most notable difference is the proof systems.
In Coq the user is expected to input
the proof, in the form of a proof script, and Coq will check the
correctness of the proof.
In the Boyer-Moore
theorem prover the user states a series of lemmas and the system
generates the proofs.
However, using the Boyer-Moore proof system
requires feeding it a well-chosen sequence
of lemmas
(Shankar xii), so it would seem the information being fed into the
two systems is similar.
There are some notable semantic differences between Shankar’s statement of incompleteness and mine. His theorem only states that finite extensions of Z2, hereditarily finite set theory, are incomplete, whereas my theorem states that even infinite extensions of NN are incomplete as long as they are self-representable. Also Shankar’s internal logic allows axioms to define new relation or function symbols as long as they come with the required proofs of admissibility. Such extensions are conservative over Z2, but no computer verified proof of this fact is given. My internal logic does not allow new symbols. Finally, I prove the essential incompleteness of NN, which is in the language of arithmetic. Without any set structures the proof is somewhat more difficult because it requires using Gödel’s β-function.
One of Shankar’s goals when creating his proof was to use a proof system without modifications. Unfortunately he was not able to meet that goal; he ended up making some improvements to the Boyer-Moore theorem prover. My proof was developed in Coq without any modifications.
The second incompleteness theorem states that if T is a recursive system
extending PA—actually a weaker system could be used here—and
T ⊢ ConT then T is inconsistent. ConT is some reasonable
formula stating the consistency of T, such as
¬PrT(⌜0 = S0⌝), where PrT is
the provability predicate codeSysPf
for T.
If I had created a formal proof in PA, I would have ⊢PA “Gödel’s first incompleteness theorem”. This could then be mechanically transformed to create another formal proof in PA that ⊢PA (PA ⊢ “Gödel’s first incompleteness theorem”). The reader can verify that the second incompleteness theorem follows from this. Unfortunately I have only shown that ⊢Coq “Gödel’s first incompleteness theorem”, so the above argument cannot be used to create a proof of the second incompleteness theorem.
Still, this work can be used as a basis for formalizing the second incompleteness theorem. The approach would be to formalize the Hilbert-Bernays-Löb derivability conditions:
The second condition is the most difficult to prove. It is usually proved by first proving that for every Σ1 sentence φ, PA ⊢ φ ⇒ PrPA(⌜φ⌝). Because I made sure that all primitive recursive functions are representable by a Σ1 formula, it would be easy to go from this theorem to the second Hilbert-Bernays-Löb condition.
My proof, excluding standard libraries and the library for Pocklington’s criterion, consists of 46 source files, 7 036 lines of specifications, 37 906 lines of proof, and 1 267 747 total characters. The size of the gzipped tarball (gzip -9) of all the source files is 146 008 bytes, which is an estimate of the information content of my proof.
I would like to thank NSERC for providing funding for this research. I thank Robert Schneck for introducing me to Coq, and helping me out at the beginning. I would like to thank Nikita Borisov for letting me use his computer when the proof became to large for my poor laptop. I would also like to thank my Berkeley advisor, Leo Harrington, for his advice on improving upon Hodel’s proof. And last, but not least, thanks to the Coq team, because without Coq there would be no proof.