There is a series of posts at the Timothy Gowers’s blog about software that can “solve mathematical problems” which he has been working on with Mohan Ganesalingam for the last three years. The word “problems” is used here in the way a mathematician would use it – it’s really about software that can support a mathematician trying to write a proof of a theorem. The goal is for the program to write proofs that look like ones produced by a human. The experiment Gowers sets up on his blog presents several theorems with proofs written by two humans (an undergraduate and a graduate student) and one created by the program and asks the readers to guess which proofs are written by whom. The results are truly remarkable – while I, and most of the audience, could have a good guess which proof is written by the software, I believe I could guess it only because I had known one of the three proofs was such. If a student submitted such proof as homework to me, I would have no suspicion at all.
There is not much information on how the software works (it’s written in Haskell btw.), but one stands out for me:
One thing that humans do not do is rewrite all their statements in low-level language and work directly from the axioms of set theory. So our program doesn’t do that either.
So this is not a tool for formalized mathematics. I have always considered such attempts of doing mathematics on a computer that try to emulate the way humans think and are not based on solid formal foundations as a dead end. However after some thinking I do see that the software may have useful applications the same way Computer Algebra Systems have. In its ultimate form the program would be equipped with a large data base of mathematical facts (the CYC style) and perhaps a modular library of tactics (for the lack of better word) that emulate various approaches humans apply to solving problems. Using that, the program will be able to rapidly search for proofs that may or may not be entirely correct, but still very convincing for humans. There will be two sources of correctness errors. First of all without formal foundation the meaning of the facts in the database will be fuzzy. Second, if the facts in the database are input by humans they may be syntactically OK (that can be checked by the machine), but semantically wrong. People just make mistakes and the number of such mistakes can be reduced only by a an extensive human review process which would be very costly to set up on a larger scale. Most of the errors will be easy to spot, because the program can write proofs in a clear style that is easier for a reviewer to check. However, some errors can be very difficult to notice since generally the proofs will sound convincing (a design goal) and almost all of them will be correct.
Of course the hope is that the program can be used not only by undergraduate math majors cheating with their homework, but also by serious researches trying to come up with new theorems. I still think that the chances for that are rather small.