Introduction
1

One of the chief triumphs of modern mathematics consists in having discovered what mathematics really is.

– Bertrand Russell

8module intro

Watson is a proof assistant. To be more technical, and more opaque, Watson is a metalanguage for discussing formal languages with context free grammars, bound variables, and Fitch-style, natural deduction proof systems. Even if you are familiar with other proof assistants, that description might sound foreign. This is not surprising; Watson is quite different from other proof assistants. To understand exactly what Watson is, we need to discuss what mathematics really is.

History 1.1

In the early 1920s the influential mathematician David Hilbert proposed what later became known as Hilbert's Program. The ambitious goal of this program was to prove the consistency of all of mathematics. (That is that no two contradictory statements could both be proven true). In order to prove mathematics consistent, logicians needed a rigorous description of what mathematics is. Specifically, if they wanted to show that for all mathematical statements it is not possible to prove that both the statement and its inverse are true, they would need definitions of mathematical statement, prove, and even true.

It was this desire for definitions of these concepts that lead Russell and Whitehead to write the expansive Principia Mathematica. The Principia was the first successful attempt to give formal descriptions of these concepts and to use those definitions to describe a small amount of the world's preexisting mathematics. Although the system described by the Principia is no longer in use, its descendants are first order logic and ZFC set theory which are the modern foundations of mathematics.

Ultimately, Hilbert's Program lead Kurt Gödel to prove his famous incompleteness theorems. These showed that the Hilbert's dream of proving all mathematics consistent could not be achieved. Despite his inability to prove mathematics consistent, Hilbert's Program still lead to the development of the system of formal languages which we now use today.

Formal Languages 1.2

Of the three fundamental concepts we need to define, mathematical statement, prove, and true, the one we need to address first is what exactly is a mathematical statement? It is only once we know what a statement is that we can ask whether it is true and how to prove it. It is the formal language that allows us to define our statements.

In the paradigm of formal languages, a mathematical statement is a sequence of symbols. These symbols might include ++, \cdot, ==, \forall, \exists, the digits 00 to 99 or any other symbol at all. For example, we might discuss the formal language of arithmetical statements. In this language we would have that 4+2=614 + 2 = 6 \cdot 1 and 1+1=31 + 1 = 3 are examples of mathematical statements. However, not every sequence of symbols is necessarily a mathematical statement. For example, +2=+ \cdot 2 = - is a sequence of symbols that is not a valid arithmetic statement. If a sequence of symbols is a mathematical statement we call it well formed. A formal language is a set of symbol sequences that are considered well formed.

At the level of formal languages we do not need to discuss the meaning of mathematical statements nor whether they are true or false. Notice, for example, that 1+1=31 + 1 = 3 is a well formed statement even though it is false. We may even define a formal language without any idea of what its statements might mean. For example, we could define a formal language where \triangle and \square are the only symbols and the well formed statements are those in which every \triangle is followed by at least two \squares. It is unclear what mathematical meaning this language might have, but it is a formal language nonetheless. We will discuss later how it is that we imbue meaning onto these languages.

In order to define which symbol sequences are well formed, we will use a tool from Computer Science: the context free grammar. A context free grammar defines the set of well formed statements using non-terminals and productions. A non-terminal is any symbol that does not appear in any of the well formed statements of the language we are defining. A production is a transformation from a non-terminal into a new sequence of symbols. What exactly this means and how this helps us define a formal language is best illustrated by an example.

Suppose we wish to define the formal language of arithmetical statements that we described earlier. If we were to describe informally the well formed statements of this language we might say that every statement consists of two numbers separated with an equals sign. If we imagine that NN denotes an arbitrary number we might write that every well formed statement takes the form N=NN = N. Then we might say that in this language a number is either a digit or a binary operator applied to two numbers. So we could say that if we see an NN then it could be replaced by a digit or by N+NN + N or by NNN \cdot N. This description is a context free grammar! NN is a non-terminal and our rules that we can replace NN with a digit or binary operation are the productions. The well formed statements are those produced by starting with N=NN = N and applying the productions until there are no more NNs remaining. For example, we could produce the statement 1+1=31 + 1 = 3 with the following steps:

N=NN+N=N1+N=N1+1=N1+1=3. \begin{align*} N = N &\Longrightarrow N + N = N \\ &\Longrightarrow 1 + N = N \\ &\Longrightarrow 1 + 1 = N \\ &\Longrightarrow 1 + 1 = 3. \end{align*}

In general, a context free grammar has non-terminals, productions, and a start sequence. Starting with the start sequence you replace non-terminals using the production rules until no non-terminals remain. The well formed statements of the formal language are exactly the statements that can be produced from this procedure.

It turns out that context free grammars are sufficient to describe the formal languages of most mathematical systems we might want to use. The main exception is statements with bound variables, such as n,n+0=n\forall n, n + 0 = n. (Preventing ill-formed statements of the form n,m+0=m\forall n, m + 0 = m requires the "context" that context free grammars are free of). There are other, more powerful ways of describing formal language but context free grammars are simple, elegant, and almost always sufficient.

Watson allows us to describe a context free grammar for our mathematical statements. Additionally, it adds an extension to the system of context free grammars that allows it to handle bound variables. This explains what is meant in the introduction by Watson being for "formal language with context free grammars and bound variables."

Truth 1.3

Having defined mathematical statements the next question to answer is what it means for a statement to be true. There are broadly two perspectives we can take:

The first is what you might expect. When we have a statement like 1+1=21 + 1 = 2 it is talking about objects that exist in some sense. There is some object out in the world that we give the name 11. There is another object we call 22. There is an operation we call ++ that will take in two object and return to us a new object. The statement 1+1=21 + 1 = 2 is true because when we hand the objects 11 and 11 to the operation ++ what we get back is exactly the object we call 22. And the statement 1+1=31 + 1 = 3 is false because the objects named 22 and 33 are distinct.

From this perspective, statements are either true or false because they ask meaningful questions about the way the world is. We call the world names like 11 and 22 refer to objects in and statements like 1+1=21 + 1 = 2 ask questions about a model. For simple mathematical systems like arithmetic or Euclidean geometry the models can be constructed from the objects of more advanced mathematical systems. But what are the models of those more advanced mathematical systems? We can continue to construct more and more complex mathematical systems but if we want this chain to ever end we need to find some other model. We cannot use the real physical world as our model because even fairly simple mathematical systems contain non-physical objects like infinities.

Sadly, at some point we must admit defeat and admit that we have no coherent way of describing the model of our mathematical system. We can discuss certain properties it must have, but we will never be able to give a complete description of it.

However there is another, far simpler, perspective. Under this perspective we do not claim that mathematical statements discuss "real" objects or ask meaningful questions. We simply declare that some of the statements are "true" and the rest are "false." Here true and false are arbitrary labels; we might as well split the statements into red and blue, cat and dog, or kiki and bouba. It might seem unsettling to make truth such an arbitrary thing. But things don't have to be entirely arbitrary. We can declare certain rules. For example, we would want that if a statement pp is true then ¬p\neg p should be false. Or we could have that if pp and qq are both true then pqp \land q should also be true. Unfortunately, Gödel proved that for a sufficiently complicated idea of truth no finite set of such rules suffices to determine the truth value of every mathematical statement. So we will have to leave some truth values unspecified.

This might still seem uncomfortable, however notice that we are in no worse a situation then we were when we were using models! Just as we had to leave some truth values unspecified we are unable to describe certain parts of our model and thus unable to determine truth values relating to those parts of the model. Ultimately, this is simply a truth of mathematics: formal systems do not allow us to know exactly what is true.

As a proof assistant, Watson takes no perspective on the truth of the mathematical statements it deals in. It is for you to decide what the statements mean and whether they are true. Describing a model or a truth function is no simple thing, and, as discussed, not even possible to fully do. So Watson does not contain any features for such activities. Instead, Watson deals only in proofs which, as will be discussed in the next section, are distinct from truth.

Proofs 1.4

We now come to the final of our three questions. What is a proof? Simply put, a proof is an argument that a statement is true, for whichever definition of true you would like to use from the previous section. Once again we run into definition difficulties. What exactly is an "argument"? The concept cannot be defined mathematically because we are attempting to use it to define mathematics! However, unlike with models, here the real world is a useful resource. A proof is an object that exists in the real world and not an abstract mathematical entity. If a proof system, i.e. a set of allowable proofs, consists only of proofs for true statements we call it sound.

With this we have finally discovered exactly what Watson is. Watson produces real world artifacts that can be interpreted as proofs for certain formal language systems. Specifically, Watson produces a form of proof known as Fitch-style proofs. In a Fitch-style proof system all the steps of the argument take the following form: since we know p1,p2,,pnp_1, p_2, \dots, p_n to be true, we can deduce that qq is true. Watson allows you to declare deductions of this form to be valid a priori. (Usually if n=0n=0 such deductions are called axioms and if n>0n>0 they are called rules of inference. But Watson makes no distinction and calls both axioms). By applying these deductions we can create new deductions. For example, the classic rule of inference modus ponens tells us that if pp and pqp \to q are known to be true then qq can be derived. By applying this inference twice we can produce the new inference that if pp, pqp \to q, and qrq \to r are all known to be true then rr can be derived.

Fitch style proofs also include one other feature: natural deduction. Returning to the same example, we would like to be able to prove the deduction that if pqp \to q and qrq \to r are true then prp \to r is true. Although there are additional axioms we could add that would allow us to prove this the "natural" way of reasoning uses only modus ponens. We would like to be able to "assume" pp, from this deduce qq and then rr. Since assuming pp allowed us to deduce rr we would then like to conclude that prp \to r. This form of reasoning is called natural deduction and Watson allows us to express this form of reasoning as an axiom.

You might be wondering at this point if the proofs Watson produces are sound. The answer is that the question is independent of Watson. It is only by knowing the intended interpretation of truth in the formal language you are using Watson to describe that the proof system can be called sound or unsound. But what Watson does guarantee is that if the deductions you mark as axioms are individually sound, then the proofs you generate with them will also be sound. It is in this sense that Watson proofs are correct.

Why Watson? 1.5

There are currently many other proof assistants. What makes Watson unique? The most fundamental difference is that Watson is not based on type theory. If you are familiar with other proof assistants you may have been surprised to see that our discussion of the background for Watson made no mention of type theory. Alternatively, if you are not familiar with other proof assistants you may be surprised to learn that most proof assistants are not at all about formal languages. Instead, most proof assistants are actually programming languages.

There is a beautiful connection between Computer Science and mathematics known as the Curry-Howard Isomorphism. Roughly speaking, it says that every program in a typed programming language is a proof. By creating programming languages with powerful type systems the proofs corresponding to programs in these languages can become quite interesting and even correspond to proofs about useful questions from mathematics. This is the system underlying most proof assistants, including some of the most popular ones such as Lean and Coq. It is no accident that the Curry-Howard isomorphism underlies so many proof assistants, it is both theoretically beautiful and immensely practical. Programming languages are a good way of describing a detailed series of steps, like a proof.

However proof assistants created in this way feature a style of mathematics that is not the same as math as it is traditionally presented. Even just at a first blush, most mathematicians would tell you that their foundations are set theory, and Curry-Howard based proof assistants are not based on set theory. That is not to say that the mathematics of CH based proof assistants is wrong or worse. But it is different. For the purposes of proving cutting edge theorems these differences are mostly unimportant. But if one wants to use proof assistants as a way to understand foundations and the low level details of proofs one would see during a typical undergraduate math education, CH proof assistants are not ideal.

Although most proof assistants are based on the Curry–Howard isomorphism, there are two widely used systems that are not: Mizar and Metamath. Mizar is a proof assistant for a fixed formal language, namely first-order logic. It provides a polished environment for writing proofs in that setting and has accumulated a substantial library of formalized mathematics. Its design choices, however, limit its applicability outside that context. Because Mizar is tailored to a single underlying logic, many of its features do not readily generalize to other formal languages. Additionally, the Mizar proof checker is not open source.

Metamath like Watson, allows the user to define proofs over arbitrary formal languages specified by context-free grammars. Unlike Watson, Metamath does not natively support natural deduction, and proofs are therefore typically written in a style that requires explicit management of assumptions and scopes. In addition, Metamath provides little in the way of proof automation, so proofs often include many low-level steps that are conceptually routine but must be written out explicitly.

The goal of Watson is to make it possible to write proofs over arbitrary formal languages in a concise, convenient, and human readable manner akin to how mathematicians write proofs in practice.

This Book 1.6

This book contains a development of as much of mathematics as we can manage. For this development we employ the commonly understood foundations of mathematics: set theory based on first order logic with the Zermelo-Fraenkel axioms. This book can serve as a reference for how set theory can be used to define mathematics as well as for a completely rigorous treatment of all the topics it discusses. As discussed previously, other proof assistants are less useful for this purpose than one might initially hope.

Additionally, this book serves as a guide to the Watson language. As we use the various features of Watson we discuss how they work and why we have chosen to use them in the way we have. Reading this book should make it clear how Watson could be employed to write proofs for other formal languages. Given an understanding of traditional mathematics, this introduction is all the additional background you should need to understand how Watson works.

In the next chapter, let us begin.