Hacker News
In Mathematics, Mistakes Aren’t What They Used to Be (2015) (nautil.us)
105 points by dnetesn on Aug 27, 2016 | 54 comments


I work professionally as an applied mathematician and I'd love it if more people were to embrace proof assistant tools. That being said, I've worked with Coq a fair amount in the past and I find it really hard to prove things related to my primary field, optimization. Some of it is getting more practice with the tools. However, how proofs are constructed in these tools is different from how I was trained to produce proofs, and I don't find it a simple process to just pick up. If anyone is curious about these tools, I recommend the Coq'Art book (https://www.springer.com/us/book/9783540208549)

As an additional, somewhat controversial comment, I think these tools are important because there's quite a bit of corruption in publishing. In short, if you're a well-known and liked author, you often get a pass on your papers and they're not well checked for errors. I've been yelled at by editors for questioning proofs of famous people. I'm not the only one. No. I will not name names or publications. Also, although it's not universal, when some journals say communicated by so-and-so, that means that so-and-so vouched for the paper, so they published it immediately without other peer review. Is the paper right? Maybe. However, I find the practice unsettling.

Bringing the comment back to proof assistant tools, I believe they would remedy some, but not all, of the problems described above. We don't have to worry whether a paper got a pass because the author was well liked. We can just run the proof through the tool and check that it's correct.

Anyway, if someone has some references or tutorials for people using proof assistant tools for applied math kinds of results, I'd love to read them. Things like stability regions for ODE solvers or convergence results for stuff like Newton's method. Or, honestly, anything that helps understand the tools and methodology more easily.


> if you're a well-known and liked author, you often get a pass on your papers and they're not well checked for errors.

A friend who was advised by a "famous" (i.e., famous within his/her field) mathematician told me basically the same thing. This person has unquestionably had brilliant insights over the course of their career, most of which turned out to be correct. But (s)he has also published several false results, including one "major" false result. Some mathematicians are known to be sloppy and the field puts up with it, which seems weird to me.

Regarding Coq, never tried it, but here is my guess at what's going on. Coq seems to be mainly embraced by algebraists and people doing discrete math, less so in fields which are more analytic (optimization, probability, functional analysis, PDE, etc.) Proofs in the latter branches of math seem to more commonly invoke geometric or "intuitive" arguments which, while completely rigorous, are harder to reduce to a sequence of logical statements. Take for example "convexity implies continuity". You can basically prove this in 1d by drawing a picture, whereas the full, gory, rigorous version, though elementary, requires a little effort. Proofs in modern algebra, otoh, involve a lot of symbol pushing, which computers excel at. Just a guess.
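To make the example concrete, here is my paraphrase of the standard 1d slope argument (textbook material, not part of the comment above): for x < y < z in the domain of a convex f, convexity gives

    \frac{f(y) - f(x)}{y - x} \le \frac{f(z) - f(x)}{z - x} \le \frac{f(z) - f(y)}{z - y}

so on any closed subinterval the difference quotients of f are bounded, i.e. f is locally Lipschitz and hence continuous. Each step is "obvious from the picture", and that's exactly the kind of step that takes real work to formalize.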


When I worked as an RA while getting my BSCS (and for a time during my first post-college job), I spent quite a bit of time with ACL2 (a language, logic, and proof assistant using a first-order variant of common lisp). I spent a lot of time applying ACL2 to my research area (information flow analysis for secure systems), and I feel your pain re: proof strategies. It was almost like learning how to prove things again (e.g. learning new induction strategies that matched the logical framework used by the tool).
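For readers who haven't used these tools, a toy example of that mismatch, in Coq rather than ACL2 (my own illustration, but the flavor is similar): n + 0 = n looks immediate, yet because + recurses on its first argument, the tool only accepts it once you phrase the induction it wants:

    (* In Coq, + is defined by recursion on its first argument, so
       0 + n = n holds by computation but n + 0 = n needs induction. *)
    Lemma add_zero_r : forall n : nat, n + 0 = n.
    Proof.
      induction n as [| n IH].
      - reflexivity.                      (* base case: 0 + 0 = 0 *)
      - simpl. rewrite IH. reflexivity.   (* S n + 0 = S (n + 0) = S n *)
    Qed.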

That said, there was an incredible amount of freedom in knowing that if the tool accepted a proof, it was (probably) correct. Occasionally I would stumble onto a strategy without knowing exactly why it worked, and the interactive nature of the tool helped me gain understanding by letting me apply transformations to terms interactively without having to worry about whether I had screwed up some minor detail.

I'd love to see one of these tools (though probably not ACL2 at this point; its first-order logic frequently felt limiting, e.g. when proving things about parameterized equivalence relations) used in a book covering some math topic I'd like to learn, as interactive formalism seems well-suited to self-exploration (you don't need an instructor to point out your trivial errors).


I really think that proof assistants like Coq are going to eventually be the future of math. Once they are less painful to use :(

I spent a term working as a research assistant on Coq, proving a new induction scheme for Featherweight Java classes. I came into the term quite hopeful, and already with some Coq experience, but I really don't feel like these tools are good enough to expect regular people to use. Even mathematicians.

I'm hopeful that in the distant future, we can get some AI assistance going on, so that the proof assistant can actually assist, not just check and impede.

I would like my month spent on transitivity of subclassing back :( It's really hard to use these tools as they are now.


>I would like my month spent on transitivity of subclassing back :(

Oh God, that gave me flashbacks. I've done at least one transitivity of subtyping proof where I actually got the right theorem, then found later in the module that I had to rewrite the whole thing with a graph-theory library supplying some theorems about DAGs/finite posets. It's horrible, and that piece of work still isn't done!


I've heard from some people that it's almost like we're still at the 'assembly language' level when it comes to theorem proving. We still have a great deal to build before we can be productive at higher levels of abstraction.


Analysis and the real numbers tend to be very hard to encode in constructive foundations. We still haven't got the right balance between checking for computability of the proof and all objects it hypothesizes, versus letting some of the types we hypothesize over be uncomputable and uncountably large, while requiring that propositional types (squash types) have continuous elements.


I was very interested in this as well for learning purposes. Without an instructor it's fairly easy to accidentally make logical jumps and come up with wrong proofs. I figured that math is effectively a somewhat formal language for communicating concepts, so a compiler that checks errors (or even suggests corrections) could work well.

Tools like Coq are great, but difficult to use (even for a programmer used to languages like Haskell), and they don't have a large set of existing proofs to work off of.

My opinion is that we need a good intermediate representation for proof tools (to share results; each new tool now needs a huge base built up to start being useful) and simpler tools to work with. It would be an enormous benefit to mathematicians and the world at large.


That'd be great, but I think the hard part about sharing representations is that many of the proof tools are based on different underlying type theories, which make it hard to go back and forth. For example Coq is based on the calculus of inductive constructions whereas Isabelle is based on HOL. Though, I'm guessing it's possible once a type scheme is fixed. Just poking around briefly, here's an SO question and answer on some of the differences between CIC and HOL (https://stackoverflow.com/questions/30152139/what-are-the-st...)


This should be possible when the underlying logics are similar (e.g. Isabelle/HOL and HOLight) but can be tricky if the logics are far apart (e.g. Isabelle/HOL and Coq).


Fair. I need to spend a few weeks with Coq before I really form an opinion of how these things could be improved.


Newbie question: if one is not interested in extracting programs, but only in proving stuff, is a proof assistant like Isabelle/HOL easier to use than one based on constructive logic like Coq?


The short answer is yes. That's because (1) a classical system like HOL can use external automatic theorem provers ("sledgehammer") that are based on classical logic (most of them are), in a way that constructive systems like Coq cannot (at least currently). Moreover (2) classical proofs can be shorter and simpler, sometimes considerably so.

In practice many other dimensions also play an important role, in particular available tactics, tactic definition languages, proof editors, available decision procedures for the specific axioms one is interested in, libraries and so on.


Although Coq uses constructive logic by default, it is not limited to it. One can use classical logic results just by importing the "Classical" module. Of course one then loses the benefits of having a constructive proof (like code extraction), but if the interest is only in checking a mathematical proof, that can be acceptable. The choice and trade-off is up to the user.
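For example, a minimal sketch (this particular lemma already ships in the standard library as NNPP; shown here just for the mechanics):

    Require Import Classical.

    (* Importing Classical adds the axiom
       classic : forall P : Prop, P \/ ~ P *)
    Lemma double_negation (P : Prop) : ~ ~ P -> P.
    Proof.
      intro H.
      destruct (classic P) as [Hp | Hnp].
      - exact Hp.        (* P holds outright *)
      - contradiction.   (* Hnp : ~ P contradicts H : ~ ~ P *)
    Qed.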


That is true, but does proof automation work as well for classical reasoning in Coq as it does in Isabelle/HOL where classical reasoning is 'baked in' from the start?

As far as I'm aware, the question of moving the "sledgehammer" approach to a constructive setting is still an open problem.


Are there downsides to the HOL approach when compared to constructive systems?


Depends what you want to do.

If you are interested in program verification, then not all that much, because everything is about automation. Coq/Agda have a small edge if you want to prove something about Lambda calculus since Coq/Agda are themselves Lambda calculi, and not surprisingly most Coq/Agda work is about Lambda calculus ... The HOL school has probably a small edge in other areas.

In practice the underlying approach (LCF vs Curry-Howard, dependent types vs HOL) makes much less of a difference than automation and the availability of proof libraries.


It's hard to take something that's been proven true and turn it into something that's been proven useful without employing some constructive means.


I'm not sure I agree. For example CakeML [1] is the first fully verified self-hosting compiler for a substantial programming language, and seL4 [2] is the first fully verified substantial operating system kernel. Both were verified in LCF provers using HOL as a logic. Are you saying that the guarantees of correctness are not in themselves useful?

[1] https://cakeml.org/

[2] https://sel4.systems


If I write a program X and tell you it was proven correct, do you know that X is useful?

To know that a correct program is useful, you have to know what it does. And if you know what it does, you have constructed an implementation.

Which is to say that a nonconstructive verifier is really just deferring the 'constructive' part of its operation to the compiler and programmer. (And this is the 'hard part' I was referring to.)


I'm not sure what you mean.

We are discussing the (dis)advantages of HOL vs constructive reasoning (e.g. Coq). The main difference is that HOL-based provers like Isabelle/HOL have classical logic built in from the start -- all proof automation supports classical reasoning. Constructive reasoning eschews classical methods.

In order to prove the correctness of a program X, I need a specification S, and then I construct a proof P that X |= S, i.e. that X satisfies the specification S. I use an interactive proof assistant to check the proof P. There is no disadvantage that I can see if the proof P of X |= S is classical, at least from the point of view of program correctness.
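As a toy Coq-flavored sketch of that shape (program, spec and lemma names all invented for illustration):

    Require Import Lia.

    (* X : the program *)
    Definition double (n : nat) : nat := n + n.

    (* S : the specification *)
    Definition spec (f : nat -> nat) : Prop := forall n, f n = 2 * n.

    (* P : a machine-checked proof that X |= S *)
    Lemma double_meets_spec : spec double.
    Proof.
      unfold spec, double. intro n. lia.
    Qed.

The proof of the lemma could freely use classical reasoning without weakening the correctness guarantee about double.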

Maybe you mean something else?


I'm no mathematician and this might be way out of my league, but I've wondered how Gödel's incompleteness theorems come into play if mathematicians are using proof assistants to verify their mathematics.


The incompleteness theorem says that there exist some (paradoxical and self-referencing) true theorems that you can't prove (e.g. "this sentence is unprovable"). Neither humans nor proof assistants should be able to prove them.


> there exist some true theorems that you can't prove.

Emphasis mine.

Are all unprovable theorems true? I googled "are all Gödel sentences true", but came up with nothing. Also: Are all unprovable theorems Gödel sentences or are there other kinds?

EDIT: Lots of good answers to my first question, thanks a lot to everyone. For the second one I think I also found an answer:

According to the Wikipedia article about the incompleteness theorems [1]: "The Gödel sentence is designed to refer, indirectly, to itself." So the question really is if there are non self-referential unprovable statements, and the answer seems to be yes[2].

[1] https://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_...

[2] https://www.quora.com/Are-there-true-but-unprovable-statemen...


I'm not a logician, but I think the question (a good one) is not quite well-posed. As I understand, in mathematical logic, a statement being provable means one can deduce it from a set of axioms. If you have a set of axioms A sufficiently powerful for Godel's construction to go through, the resulting Godel sentence s is (by construction) not provable from A, but one can always make it provable by appending s to the axioms A.

For a statement to be true, on the other hand, has a technical meaning: truth is always relative to a model (in the sense of model theory, see https://en.wikipedia.org/wiki/Model_theory ), i.e. a structure in which the axioms A hold and in which the sentence s can be evaluated. Godel's completeness theorem (https://en.wikipedia.org/wiki/G%C3%B6del%27s_completeness_th...) asserts that s is provable from A if and only if s is true in every model in which A holds.
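In symbols (standard notation): writing A ⊢ s for "s is provable from A" and M ⊨ A for "A holds in the model M", the completeness theorem says

    A \vdash s \iff \forall M\, (M \models A \to M \models s)

i.e. provability from A coincides with truth in every model of A.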


It's a bit confusing to say that about the Completeness theorem without noting that it only applies to first order logic. It's not generally true.


Yet first order logic encompasses 99.9% of the mathematics anyone ever does, even professional mathematicians. The only people who consider anything but first order logic are people who have gone out of their way to study it for its own sake. When the general subject of discussion is proof assistants and proofs in normal mathematics, I don't think it's necessary to constantly mention "only in first order logic". It's like discussing the structural integrity of a building and constantly mentioning that all the calculations only hold in our reality, and that if we were living in an alternate dimension where the physical constants of the universe were different then the results wouldn't be applicable.


>Yet first order logic encompasses 99.9% of the mathematics anyone ever does, even professional mathematicians.

I'm way out of my depth here, but I'm curious about this statement. In first order logic you can't even assert that the natural numbers cannot be put into a one-to-one correspondence with the real numbers, which seems like a fairly basic mathematical fact.


You can, as long as your first-order statement is in the language of set theory, by translating "there does not exist a bijective function f: N->R".
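Spelled out, one way to write it (Func here abbreviates the usual set-theoretic formula "f is a function from N to R"):

    \neg\exists f\, [\, \mathrm{Func}(f, \mathbb{N}, \mathbb{R})
      \wedge \forall y \in \mathbb{R}\ \exists x \in \mathbb{N}\ (f(x) = y)
      \wedge \forall x_1, x_2 \in \mathbb{N}\ (f(x_1) = f(x_2) \to x_1 = x_2) \,]

where N, R and f(x) = y all unfold further into pure formulas about membership over ZF.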


Right, but you can't ensure that the only models of 'bijective functions' are actually bijective functions. To switch to a slightly simpler example, you can translate statements such as "x is finite" into a first order language (for example, set theory), but you can't ensure that x actually is finite in the models of this statement.


To your first point: yeah, some discomfort around it is valid, but I see it as a non-issue. Informally/philosophically, because a model's job is to capture everything a first-order theory can see about your object of study - so it's fine for (say) an "actually" countable set to model the reals, and not be up to the task of "actually being" the reals. Formally, you differentiate between internal and external statements and don't expect them to be the same (a rigorous version of my pretentious use of scare quotes above :).


It might be helpful to be a bit more precise here.

The expression "first-order logic" (FOL) is a bit vague. Do you mean pure FOL (i.e. all axioms are about equality) or FOL plus additional axioms? Pure FOL (without additional axioms) is indeed rather weak and cannot be used for much. That's why mathematicians build a foundation of mathematics on top of FOL by assuming the truth of additional axioms.

The most well known foundation of mathematics built on top of FOL is ZF-set theory which adds about a dozen additional axioms about how to construct new sets from old, and when two sets are equal.

There are other FOL-based axiomatisations of mathematics such as Quine's NF, or category theory (which can be expressed as a theory on top of FOL). There are also foundations of mathematics that are not using FOL (Martin-Loef type theory and successors such as HoTT).

Goedel's first incompleteness theorem forces all those axiomatisations of mathematics to be incomplete, meaning there are formulae A such that neither A nor the negation of A can be derived.

A statement like "99.9% of the mathematics anyone ever does" is about FOL extended with foundations of mathematics like ZF, or a type-theoretic equivalent.

However, mathematicians are getting ever better at producing interesting formulae whose truth (or otherwise) cannot be settled in ZF (or similar) [1]. In order to deal with such expressivity gaps, mathematicians investigate more powerful axioms, called "large cardinals", which claim the existence of certain extremely large sets. You can also think of them as more powerful induction principles.

The problem with inventing new large cardinals (i.e. more powerful induction principles) is to navigate between the Scylla of expressive weakness (the axiom is a consequence of other axioms) and the Charybdis of inconsistency (adding the axiom allows us to derive "false").

[1] https://u.osu.edu/friedman.8/


Yes, I was being sloppy. Thanks for the clarification.


In consistent systems, false theorems are always unprovable.

So it's only interesting when unprovable theorems are true.

(If you're looking for a false theorem that isn't disprovable, simply negate an unprovable true theorem.)


>So the question really is if there are non self-referential unprovable statements, and the answer seems to be yes

Some mathematicians believe that many popular conjectures are true but unprovable. Like Goldbach's conjecture, the Riemann hypothesis, or perhaps even the Collatz conjecture.

We know these things are probably true. If the prime numbers are distributed randomly, there are probabilistic arguments showing that the probability of Goldbach's conjecture holding approaches 1, and there is a similar argument for the Riemann hypothesis. However, if it's just a "mathematical coincidence" (there is no reason there can't be a counterexample at 10^500+1, it's just really improbable), then it may be impossible to prove.
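For reference, the standard heuristic, stated loosely and with constants ignored: by the prime number theorem a number near n is prime with probability about 1/ln n, so for even n the expected number of prime pairs (p, n - p) is on the order of

    \frac{n}{(\ln n)^2}

which grows without bound. Treating primality as independent coin flips, the chance that some large even number has no representation at all then decays extremely fast. A heuristic, of course, not a proof.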


"Very difficult to find a counter example" =/= "unprovable"


No, the point is that there are (probably) no counterexamples. The probability of any counterexample existing decreases exponentially, but is nonzero. So there is a 99.999...% chance that no counterexamples exist and the conjecture is true. But an argument like that can never be turned into a proof.


> Are all unprovable theorems true?

"True" in this context means "conforms to my intuitive understanding of what true means".

Gödel sentences are an elaborate construction that expresses the concept "This sentence is unprovable from $AXIOMS" without using direct self-reference, but the construction only means what we think it means if the natural numbers are the "real" natural numbers.

Because no axiom system we can write down can rule out all nonstandard models, you can't be sure that there isn't some "proof" corresponding to a nonstandard number.

If such a thing exists, the Gödel sentence is still unprovable (because no proof exists) but false (because it asserts that no "proof" exists, although there is a nonstandard one).


No. The definition of "undecidable" is that you can't prove it true or false. This means that if P is undecidable then so is not-P. But exactly one of these is true and the other false.


This seems a little glib wrt the meaning of something being "true". We have undecidability results: the Axiom of Choice and its negation are both consistent with the Zermelo-Fraenkel axioms. The continuum hypothesis and its negation are both consistent with the ZFC axioms. The Parallel Postulate and its negation are both consistent with the other Euclidean axioms.

Are we justified in saying that for each of these, one must be true and the other false? Of course not. To resolve them one way or the other, you need additional axioms; none of those six propositions is true platonically. What you choose depends on the problem you're trying to describe.


Well, for one thing, you have to define 'true'. The incompleteness theorems don't have anything to do with what's 'true', they have to do with what's 'derivable'.

It's a consequence of the completeness theorem that every consistent theory has a model. Furthermore, in any incomplete system in which a Godel sentence is useful, there are necessarily models that differ on the satisfaction of the Godel sentence.


Slightly off topic, but your comment reminded me of Tarski's paper The Semantic Conception of Truth and the Foundations of Semantics (http://www.jfsowa.com/logic/tarski.htm). Basically, it's actually kind of hard to define what "true" means, and Tarski goes through a long discussion trying to pin it down.


If all unprovable theorems were true and you could prove that statement, then you'd have proven the unprovable. So no.


from the tutorial(o):

> IT IS VITALLY IMPORTANT THAT YOU DO NOT THINK OF A Prop AS BEING EITHER "TRUE" OR "FALSE". A Prop either has a proof or it does not have a proof. Godel shattered mathematics by showing that some true propositions can never be proven. Tarski went further and showed that some propositions cannot even be said to be true or false!!! Coq deals with these obstacles in modern mathematics by restricting Prop to being either proven or unproven, rather than true or false.

(o) https://coq.inria.fr/tutorial-nahas#lab9
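To make that concrete, a small Coq sketch (definitions are my own, purely for illustration): both of the following are perfectly good Props, but only the first has a known proof.

    (* A Prop with a (trivial) proof: *)
    Lemma has_proof : 1 + 1 = 2.
    Proof. reflexivity. Qed.

    (* Goldbach's conjecture as a Prop: statable, but to date
       neither proven nor disproven. *)
    Definition divides (d n : nat) : Prop := exists k, n = k * d.
    Definition prime (p : nat) : Prop :=
      1 < p /\ (forall d, divides d p -> d = 1 \/ d = p).
    Definition goldbach : Prop :=
      forall n, 1 < n ->
      exists p q, prime p /\ prime q /\ p + q = 2 * n.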


i'll use the first incompleteness theorem itself to explain this distinction between truth and proof

..the first because the second is essentially a stronger form of the first..

> First Incompleteness Theorem: "Any consistent formal system F within which a certain amount of elementary arithmetic can be carried out is incomplete; i.e., there are statements of the language of F which can neither be proved nor disproved in F." (o)

what's important is that the statements are incapable of being proved or disproved; to me this is confusing language, so let me try to lay out the possibilities

a statement can be:

    o.   proved  : there exists at least one complete collection of validating evidence 
    i.   disproved : there exists at least one complete collection of confuting evidence
    ii.  incapable of being proved  : there only exists incomplete collections of validating evidence
    iii. incapable of being disproved : there only exists incomplete collections of confuting evidence
..think of complete and incomplete in terms of russell's paradox(i), a set containing all sets: any proof would be incomplete, because the evidence for the set containing all sets must also contain itself (it is a set), creating a further requirement for evidence that this new set is also contained in the set of all sets, ad infinitum, perpetually incomplete..

the first incompleteness theorem is making a claim for the existence of ii and iii

if this statement is true how could you prove it?

if the statement is true, it means you would need to prove that specific statements in the language of F are incapable of being proved or disproved; but if those statements are incapable of being proved or disproved, how can you show through evidence that your proof even applies to them?

a theorem is proved or disproved when one can express a proof through finite steps, or statements of evidence

if a theorem has only an incomplete proof or disproof, how could i prove said incomplete proof?

but we know the statement to be true because of how it essentially self-references

stating that there are statements incapable of being proved or disproved is itself a statement incapable of being proved or disproved

making the statement true, but without proof or evidence

..except some strange brain itching intuition :P ..

but now we have left the domain of arithmetic and entered the domain of semantics(ii)

and here is where tarski comes in with his undefinability theorem(iii)

..is it irony that undefinability is underlined by this spell checker?..

coq is stating here that it is itself a semantic domain dealing only with arithmetic statements that can be proved or disproved

implying, imperatively, that you must avoid the incomplete and the undefinable, because by design they exist outside this proof assistant's restricted domain of interest and so will be inexpressible

naively: if you wrote a program restricted to computing the addition of positive integers, then to add a negative integer you would have to write a new program

which going back to your original question:

> I've wondered how Gödel's incompleteness theorems come into play if Mathematicians are using proof assistants to verify their mathematics

they come into play only by being the very things the spec disallows from coming into play

in coq at least, you are required to work without them because it is written to only play with propositions that can be proved or disproved

and we showed above that they are unable to be proved or disproved

(o) https://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_...

(i) https://en.wikipedia.org/wiki/Russell%27s_paradox

(ii) https://en.wikipedia.org/wiki/Semantics

(iii) https://en.wikipedia.org/wiki/Tarski%27s_undefinability_theo...


There's a relationship between the incompleteness theorems and the halting problem: no Turing machine can decide whether an arbitrary statement, in a logical system where the incompleteness theorems apply, is derivable or not. Equivalently, there's no Turing machine that takes a statement in the logical system as input and outputs either a proof or "unprovable".

That said, once you already have a proof, checking it is much easier (at the formal level). Modern mathematical proofs, however, are not generally written in anything like the formal system's own terms, so applying this in practice is non-trivial.


This problem actually goes back to Descartes' proofs of God: in order to solve some problems Descartes needed infinite computational power. Guess who has infinite computational power?


I suppose you could assert that anyone you like has infinite computational power.


Here's one interpretation of Gödel's first incompleteness theorem that may help.

First, let's look at computability. The intuitive idea that some functions can be computed by an algorithm is very old (think Euclid's algorithm). The idea was formalized in the 20th century by the notion of a computable function. Several different definitions were given, but surprisingly, they turned out to be equivalent. Today, most programming languages are Turing complete. As far as pure computation is concerned, it is possible to translate between different programming languages.

The idea that a mathematical statement can be conclusively proven is also very old. Again, different formal definitions emerged: provable in PA, provable in ZFC, provable in ZFC + some large cardinal axiom, etc. These are not equivalent. And this, in a sense, is one way of framing Gödel's first incompleteness theorem. There is no absolute notion of provability: no equivalent of Turing completeness; no precisely definable concept that would deserve the name "provable".

In particular, this means that it is not always possible to translate proofs from one system A to system B (e.g. ZFC proves the consistency of PA, which PA itself cannot), at least unless you are willing to add new axioms to system B.


Proof assistants are frequently used to prove the correctness of computer programs. And since a proof assistant is a computer program, and often quite a complex one that could contain bugs that render everything it does useless, you might want to use your proof assistant to prove that its code does what you hope it does.

Goedel's second incompleteness theorem says you can't do that. Ever. And more generally, it says that you will hit a roadblock as you try to prove what it is that your proof assistant does. This is really annoying.

But you can get pretty damn close. HOL Light has something of a self-consistency proof that is so close to the real thing that it's worth taking seriously.

But incompleteness, by itself, isn't a big deal for theorem provers. Russell himself worked as if the existence of infinities wasn't provable, which is why Goedel had little relevance to Russell. Hilbert on the other hand....


Seems like you can formalise it using a proof assistant. See http://www.academia.edu/4271278/A_Mechanised_Proof_of_Gödel_...


Nice article. Two incidental notes: (1) Siobhan Roberts is an engaging speaker, if you get the chance to hear her speak, it's worthwhile. (2) This reminds me of http://calteches.library.caltech.edu/51/2/CargoCult.htm .


>> Acknowledging something as true because the computer says so is not the same as knowing why it is true. One might reckon it’s analogous to relying on an Internet mash-up of reviews about the mysteries of Venice, rather than going there and splurging on the water taxi and experiencing the magic for oneself.

Water taxis are not a very efficient way to explore Venice. You need to get off at some point and walk around the streets. There are a thousand little nooks and crannies to investigate that a water taxi can't reach. When you do need a boat you can get by most of the time with the vaporetto, which can be quite a bit cheaper, especially if you get a day ticket. There are lines for most of the islands around the city (Murano, Burano, San Michele, Torcello, etc.).

Er. That's all to be taken entirely literally, from my experience as a frequent visitor to Venice. I'm not sure how it applies as a metaphor to mathematics research.


I like this quote:



