Quasi-empirical formalism and rationality

Note: I wrote this in May 2003. It's unfinished but still of some interest I think. 

Formalist philosophies of mathematics isolate themselves from foundational problems whilst retaining something like certainty. Unfortunately, the side effect of this is that they cannot account for the meaning or practice of mathematics. They cannot account for the meaning of mathematics because for formalism the symbols and rules of inference are purely syntactic and both the choice of axioms and the choice of which propositions to prove are a matter of convention alone. They cannot account for the practice of mathematics because, with a tiny number of exceptions, no field of mathematics lives up to its standards of rigour. The function of formalist philosophies of mathematics is to make mathematics “anarchist” (as Feyerabend would have put it) or “elitist” (as Lakatos would have put it). That is, to bar non-mathematicians from judging mathematics. This probably accounts for its popularity among mathematicians.

A more charitable view is that formalism is a regulative ideal for mathematics. It is an ideal because if we had the capacity to work with mathematics purely formally, and we had the capacity to manipulate symbols at great speed without error, then the only uncertainty in mathematics would be about whether the axioms led to contradiction or not. Certainly this would still leave an empirical question open, and would not constitute a theory of meaning, but it would provide a perfect court of arbitration for proofs. This view is expressed by David Hilbert in his essay On the Infinite: “Mathematics in a certain sense develops into a tribunal of arbitration, a supreme court that will decide questions of principle – and on such a concrete basis that universal agreement must be attainable and all assertions can be verified.” This ideal is unattainable, but it is also regulative in the following sense. In practice we can only work with informal proofs which we believe to be formalisable. If there is a dispute over a theorem, the ideal of formalism suggests a relatively unambiguous way to resolve the dispute, or at least to find where the disagreement lies if it cannot be resolved. Suppose I dispute an informal step in your proof. You can respond by further formalising that step (assuming you didn’t make a mistake), getting closer to the axioms of whatever formal system you’re working in. I can then read over your formalisation of that step and either agree that you were right after all or criticise a step in your more formal version. This process either ends with agreement (either you made a mistake or I did) or it ends in a fundamental disagreement about the axioms. Either way, we know, in principle, where the problem lies. So, if we consider formalism as a regulative ideal for the resolution of mathematical disputes, it has methodological significance for mathematics and is unobjectionable in the sense that it doesn’t claim to be a theory of meaning.

The concept of a regulative ideal has a precedent in epistemology which suggests its usefulness. In his paper Truth, Rationality and the Growth of Knowledge, Popper uses a modification of Tarski’s theory of truth, and his own notion of closeness to truth, as a regulative ideal in defining science. He defines closeness to truth in the following way: a theory S is closer to the truth than a theory T if the set of true consequences of S is a superset of the set of true consequences of T and the set of false consequences of S is a subset of the set of false consequences of T. This defines a partial order on theories. However, he recognises the ideal nature of his definition: “… we have no criterion of truth, but are nevertheless guided by the idea of truth as a regulative principle (as Kant or Peirce might have said); and that … there are criteria of progress towards the truth…”. He later goes on to say “It is only the idea of truth which allows us to speak sensibly of mistakes and of rational criticism, and which makes rational discussion possible… Thus the very idea of error – and of fallibility – involves the idea of an objective truth as the standard of which we may fall short. (It is in this sense that the idea of truth is a regulative idea.)”. And, although he has said that there are “criteria of progress towards the truth” he later goes on to say that “… approximation to truth … has the same objective character and the same ideal or regulative character as the idea of objective or absolute truth.” In other words, there are no such criteria, as he recognises when he says “It always remains possible, of course, that we shall make mistakes in our relative appraisal of two theories, and the appraisal will often be a controversial matter.” He cannot directly deduce methodological rules from these ideal definitions, but these definitions suggest (regulate) them. This can be more clearly seen in Lakatos’ definition of what it means to falsify a theory.
For Lakatos, a theory T is falsified if there is another theory S which predicts (explains) all the corroborated content of T and correctly predicts at least one fact that T fails to predict. Since closeness to truth is only a partial order on theories, there remains a logical possibility that two theories may not be comparable (if S correctly explains a fact that T fails to explain, and T correctly explains a fact that S fails to explain). Kuhn and Feyerabend showed that this sort of thing happens all the time. This undermines the idea that Popperian or Lakatosian science constitutes a complete philosophical theory of science, but it doesn’t undermine their status as regulative ideals.
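Popper’s ordering, and the possibility of incomparable theories, can be made concrete in a few lines. In this toy sketch (the theory names and their sets of consequences are entirely invented for illustration), a theory is represented by the pair of its true and false consequences:

```python
# Invented illustration of Popper's closeness-to-truth relation: a theory
# is a pair (true consequences, false consequences), each a Python set.

def closer_to_truth(s, t):
    """S is closer to the truth than T if S's true consequences include
    T's, S's false consequences are included in T's, and at least one of
    the two inclusions is strict."""
    true_s, false_s = s
    true_t, false_t = t
    return (true_s >= true_t and false_s <= false_t
            and (true_s > true_t or false_s < false_t))

# T explains facts a and b but has a false consequence e;
# S keeps a and b, adds c, and drops the error.
T = ({"a", "b"}, {"e"})
S = ({"a", "b", "c"}, set())
assert closer_to_truth(S, T)

# An incomparable pair: each theory explains a fact the other misses,
# so the partial order says nothing about which is closer to the truth.
U = ({"a", "c"}, set())
V = ({"a", "b"}, set())
assert not closer_to_truth(U, V) and not closer_to_truth(V, U)
```

The last pair is exactly the logical possibility mentioned above: a partial order leaves room for theories that simply cannot be ranked.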

In fact, there is a sense in which philosophical theories can never be more than regulative. The written word obviously has no mystical power that can, in itself, cause things to happen (except for magic spells of course). An individual is free to interpret a philosophical theory however he likes, especially if it involves such vague notions as truth.

But how can we discuss the relative merits of different regulative ideals if philosophical theories can never be more than regulative? This is a difficult question and in general it is probably impossible to do so abstractly. However, if we think about the function of rational discussion and discuss regulative ideals pragmatically we might be able to get somewhere. As an example, consider the regulative ideal of objective truth. I believe that the reason it is so fruitful in practice can be understood as something (very roughly) like a generalised dialectical process.

Consider the following model scenario. There are two individuals in a shared environment. Each individual has his own heuristic approach to dealing with the world. That is, he has rules of thumb that help him to interact and deal with the environment he finds himself in. The two individuals can communicate with each other; in particular, they can, to the best of their abilities, describe their heuristic approaches to one another. Through the process of arguing together about which heuristic approach is better, and by trying out and testing each other’s approaches on the environment, there emerges an improved approach which better deals with the environment than either of the two individuals’ initial approaches. From the first individual’s point of view, his initial approach is something like a thesis, the second individual’s is like an antithesis, and the better approach they reach together is like a synthesis.

This scenario is simplistic, but it suggests a more sophisticated and general one. Firstly, the two individuals will probably not be able to explain their own heuristic approaches perfectly. Secondly, they probably will not come to an agreement about a third approach, although both of their own approaches will probably be modified by the exchange and there will be an overlap between their modified approaches. Now, increase the number of individuals in the scenario and, over time, the clash of heuristic approaches will tend to make the individuals’ approaches more and more effective. In this account so far there is in one sense an objective fact, the shared environment. However, it may easily be that no individual can describe this shared environment. No statement about the world which an individual in that environment could think of would be true. [Suggest a variety of reasons that this might be the case.] Despite this, by using such statements about the environment as the basic mechanism of communicating their heuristic approaches, they improve, in some sense, these approaches. This scenario suggests why a belief in the possibility of statements about the world expressing objective truths can be so fruitful in practice, even if it is totally mistaken (or even meaningless). [Make some remarks about the apparent meaninglessness of the idea of a proposition expressing a truth about the world. At the very least, it’s implausibility.]

Before I go on to explain my view of mathematics, I want to quickly describe Lakatos’ quasi-empiricism since my view is related to it. Lakatos likened mathematics to science, in that it is empirical in a certain sense. The first sense in which it is empirical is that we can never know if a formal mathematical system is consistent or not. A logical falsifier of a formal system is a proof of the statement “p and not-p” (i.e. a demonstration of the inconsistency of the system). More interesting is his notion of a heuristic falsifier. If a formal mathematical system is a formalisation of a previously existing informal mathematical system, then a heuristic falsifier consists of an informal proof of the statement p in the informal system, together with a formal proof of the statement not-p in the formal system. This is a falsifier because it says that the formal mathematical system is not an accurate model of the informal system. For example, if in our formal mathematical system we could prove that 2+2=5 this would be a heuristic falsifier (even if the model of arithmetic which proved it was perfectly consistent). Finally, he encourages formalisations which are, in some sense, testable. For example, he mentions that Gödel says that there are consequences of certain infinity axioms in the field of Diophantine equations.

For Lakatos, that is all that quasi-empiricism amounts to, formal mathematics is just a model of informal mathematics: “… we should speak only of formal mathematical theories if they are formalisations of established informal mathematical theories.” However, I think that more can be said about the way in which we choose propositions to try to prove from among the vast array of propositions which can be proved. The definitions and theorems of mathematics are chosen with exquisite care. Indeed, if they were not so chosen mathematics would just be a rather peculiar pastime, a bit like fitting together oddly shaped coloured bricks without any particular scheme. The difficult question is – why are they chosen the way they are? Lakatos provides a first approximation of an answer to this question – formal mathematics is a model of informal mathematics. But this leaves the question, why is informal mathematics the way it is?

First of all, mathematics is a model of our intuitions. For example, arithmetic is a model of an intuitive process which everyone is perfectly familiar with (but it is a learned concept). Our direct intuition of whole numbers, that is our experience of counting objects and comparing collections of objects, only extends to very small whole numbers, probably less than 20 or 30. Our indirect intuition of whole numbers, that is our experience of calculating with numbers, extends much further. Our direct intuition of whole numbers suggests the model we use (say, Peano’s axioms of arithmetic), and our indirect intuition, our experience with calculating using this model, doesn’t give cause for dissatisfaction.
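As a toy illustration of arithmetic as a model of counting (the encoding below is an invented sketch, not a serious formalisation of Peano’s axioms), whole numbers can be represented as iterated successors of zero, with addition defined recursively in the Peano style:

```python
# Invented sketch: numerals as nested tuples, in the Peano style.
ZERO = ()

def succ(n):
    return (n,)          # the successor wraps the previous numeral

def add(m, n):
    # m + 0 = m ;  m + succ(n) = succ(m + n)
    return m if n == ZERO else succ(add(m, n[0]))

def numeral(k):
    """Build the numeral for the ordinary integer k."""
    n = ZERO
    for _ in range(k):
        n = succ(n)
    return n

def value(n):
    """Read a numeral back as an ordinary integer."""
    k = 0
    while n != ZERO:
        n, k = n[0], k + 1
    return k

assert value(add(numeral(2), numeral(2))) == 4   # 2 + 2 = 4 in the model
```

The point of the toy is that our direct intuition only checks the model for small numbers like these; for large numbers we trust the recursion, i.e. the model itself.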

As well as being a model of our intuition about arithmetic, mathematics provides the language and basic concepts used to define physics – real numbers, functions, differentiability, etc. The definition of real numbers in some sense captures our intuitive notions of space. Think of the square root of 2. The discovery that 2 doesn’t have a rational square root led the Greeks to abandon their arithmetical notions of number in favour of geometrical ones. The following two thought experiments suggest that there “ought to be” a square root of 2. First, it seems as though we can approximate the square root of 2 as closely as we like with rational numbers. That is, we can find rational numbers p/q for which the difference between (p/q)² and 2 is as small as we like. This sort of thinking, although not directly, underlies the definition of real numbers as equivalence classes of Cauchy sequences of rational numbers. Second, for any rational number p/q you can decide if it is less than the square root of two or greater than it (just decide whether (p/q)² < 2 or (p/q)² > 2). If you think of the number line, we have a way of breaking it into two pieces, one to the left of the square root of two and one to the right. It seems as though if we can break the number line at this point, there must be a point on the number line here. This thought experiment underlies the definition of real numbers as Dedekind cuts. However reasonable these thought experiments seem, they make intuitive hypotheses about numbers, and these hypotheses determine a formal model of the concept of a number. In the first thought experiment, the hypothesis is about limits, in the second thought experiment the hypothesis is about the continuity of the number line. As the example of hyper-real numbers indicates (the hyper-reals are an extension of the reals which allow for two hyper-reals to differ by an infinitesimal quantity, amongst other things), these intuitive hypotheses can be modelled in more than one way.
The definition of real numbers as equivalence classes of Cauchy sequences of rational numbers takes no account of the exact way in which a sequence converges, but the analogous definition underlying the hyper-real numbers does. For the real numbers, there is a single number that cuts the rational numbers into two classes, but for the hyper-real numbers there are infinitely many numbers that cut the rationals in the same way. Thus our definition of number is based on an intuition about the number line, an intuition that future physical theories may cause us to reassess.
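Both thought experiments can be made concrete with exact rational arithmetic. In this sketch (the interval-halving scheme is just one invented way of generating the approximations), the Dedekind-cut test decides which side of the square root of 2 a rational lies on, and repeated halving produces rationals whose squares are as close to 2 as we like:

```python
from fractions import Fraction

def below_sqrt2(q):
    """The Dedekind-cut test: is the positive rational q left of sqrt(2)?"""
    return q * q < 2

def approximate_sqrt2(steps):
    """Bisect [1, 2], keeping sqrt(2) inside, to approximate it by rationals."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(steps):
        mid = (lo + hi) / 2
        if below_sqrt2(mid):
            lo = mid
        else:
            hi = mid
    return lo

q = approximate_sqrt2(30)
assert abs(q * q - 2) < Fraction(1, 10**8)   # (p/q)^2 as close to 2 as we like
assert below_sqrt2(Fraction(7, 5)) and not below_sqrt2(Fraction(3, 2))
```

Note that the program only ever touches rationals: the square root of 2 itself appears nowhere, which is exactly why its existence is a hypothesis of the model rather than something the rationals force on us.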

The mathematical model of the concepts of physics doesn’t stop there. Even supposing we have a satisfactory definition of the number line, this doesn’t uniquely determine a definition of space. The first step in the modern definition of space is to consider the Cartesian product of three copies of the number line (to get three dimensions). That is, a point in space is identified with a triple of numbers on the number line (x, y, z). Again, this is a model of space. (At this point I’ve not said anything about the geometry of this Cartesian product.) The next step is to give this Cartesian product a geometric structure, by defining the notion of distance and straight lines. Initially, Euclidean geometry was the model, but later other (curved) geometries were introduced. The last step in the model is the notion of a manifold, which is in turn based on notions of continuity, topology, differentiability, and so forth. At each step, models inspired by intuition have been made. [I’d quite like to go through this in more detail highlighting the way in which these steps are modelling steps.] Although the structures formed in this way may not be false in the sense that one can derive contradictions in this system, they may fail to provide useful basic concepts for physics. It is in this sense that mathematics is a model of the basic concepts of physics.
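As a minimal illustration of these modelling steps (the code is an invented sketch), a “point” becomes a triple of numbers, and the Euclidean metric is one possible geometric structure imposed on such triples – a different choice of distance function would model a different geometry on the same Cartesian product:

```python
import math

def euclidean_distance(p, q):
    """One choice of geometric structure on triples of numbers:
    the Euclidean metric on the Cartesian product of three number lines."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

assert euclidean_distance((0, 0, 0), (3, 4, 0)) == 5.0
```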

More mysteriously, mathematics can even be a model of itself. The modern concepts of real numbers, Cartesian products and so forth didn’t exist for the ancient Greek mathematicians, and yet Euclidean geometry can be modelled in this language of modern mathematics. This reflects the fact that mathematical structures acquire a meaning independent of what they’re modelling, and become interesting in themselves. [Does this paragraph make any sense?]

In practice then, the definitions and theorems of mathematics have, until recently, been motivated by the idea that mathematical definitions can get at the essence of reality. For the same reasons that the belief in objective truth is so fruitful, the belief, among mathematicians, that their definitions capture the essence of the world has been fruitful. [Proof generated concepts suggest that this is not the last word on this subject. They are motivated by the need for mathematicians to be able to say something?]

I find the term heuristic falsifier slightly misleading because it doesn’t emphasise that we are dealing with the process of rational discussion rather than an isolated judgement of the correctness of mathematics. [Emphasise this point more throughout the essay, it is what distinguishes my view from Lakatos’.] Nonetheless, I will use it in describing a few concrete examples to illustrate the quasi-empirical view of mathematics.

The simplest sort of heuristic falsifier would be an arithmetical one, e.g. if our model implied 2+2=5 or some such. The Banach-Tarski paradox is a disputable heuristic falsifier, because our intuition does not really say anything about the sort of sets involved in it. The interaction between physics and mathematics suggests a more interesting hypothetical heuristic falsifier. Imagine that, for example, a new theory of physics was based on an alternative characterisation of the concept of number or space. For example, suppose that the hyper-real numbers described above were used as the basis of a theory of physics that empirically proved itself useful. Or suppose, as in Jorge Luis Borges’ story Blue Tigers, that in the real world numbers started to behave oddly. [ reread story and put something more in here ]. In the former situation, although our definitions and theorems about the real numbers would remain intact, a great deal of interest would go towards developing the theory of hyper-real numbers. Our model of “number” would change from “real number” to “hyper-real number”. Or, suppose that some theory in physics actually managed to use the Banach-Tarski paradox to deduce a physical consequence which was falsified. The traditional view of mathematics still couldn’t decide between the axiom of choice (AC) and its rejection, because the error could still be in the physics and not the maths. The quasi-empirical view, however, would suggest that the error lay in mathematics considered as a formal model. That is, that the real numbers (in a system with the axiom of choice) were not an adequate model of the numbers needed for that particular physical theory. [Rewrite this last bit.]

Quasi-empiricism can also throw some light on how mathematical conjectures are made. One of the Clay Mathematics Institute’s Millennium Prize Problems offers one million dollars for a proof that the Navier-Stokes equations always have a smooth solution. Looking at the Navier-Stokes equations, it might be unclear why mathematicians believe there is always a solution and why they think it so important to prove that there is one. The reason is that mathematicians believe that the Navier-Stokes equations govern fluid flow. Since we believe that the relevant mathematical concepts and equations do indeed accurately model fluid flow, and since we believe that the universe cannot cease to exist because an equation has no solution, we must believe that these equations always have a solution. A counterexample to this conjecture would be a heuristic falsifier, whereas a proof of the conjecture would justify studying the Navier-Stokes equations.
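For reference (this is the standard textbook form, not something from the draft): with the density normalised to one, the incompressible Navier-Stokes equations for a velocity field u, pressure p, viscosity ν and external force f are

```latex
\frac{\partial u}{\partial t} + (u \cdot \nabla)\, u
    = \nu \, \nabla^{2} u - \nabla p + f,
\qquad
\nabla \cdot u = 0 .
```

The Millennium Prize problem asks whether, in three dimensions, smooth solutions of these equations always exist for all time given smooth initial data.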

[Say something about the effect of other intellectual disciplines on mathematics. For example, computer science is interested in graphs and networks, economics stimulates interest in stochastic calculus, etc.]

[Say something about how people desire unity in mathematics. This has a pragmatic function. Talk about how the interdeducibility of different mathematical systems means that my take on quasi-empiricism doesn’t mean that mathematics will fragment.]

[Talk about computer proofs. If an artificial intelligence came up with a human style proof would we believe it? What if 10 differently programmed such programs checked an unreadable long, but human style, proof over and agreed with it? Would we still be inclined to disbelieve it even though many proofs in maths are only checked by a few people? If maths is a process, we need not be so concerned with worrying about this sort of thing, which is not to say we should be totally gung-ho about it. There is a high probability of error in human proof (see Davis in New Directions). What about non-constructive proofs of the existence of a proof by formalising a theory within itself? Probabilistic proofs? Problem: probabilistically it is a 100% certainty that x is irrational but every number we can work with is irrational. Analogy between maths and software engineering – see de Millo, Lipton, Perlis & Tymoczko in New Directions.]

[Talk about how the vanishingly small number of propositions which are interesting compared to uninteresting ones undermines Chaitin’s argument for axiom growth in maths. Does the same thing apply to Godel’s theorem?]

[There are typically, not exceptionally, many different formalisations of the same concept in maths. See Thurston in New Directions. Incommensurability?]

[Objection, what I have described is largely social but we can imagine someone using mathematics privately. Talk about how we can come up with personal regulative ideals, regulate our own future behaviour – e.g. writing notes to oneself that nobody else could understand. Compare with computer software not defining its run time behaviour but regulating it.]

[Why is maths so successful? We can’t answer this question, because to answer it is to assume that it is successful; we can imagine a world that any minute now will suddenly cease to be modellable. Integer sequence game, and Gardner’s update (see below). Wittgenstein quote.

Message from sci.math on integer sequence game:

Douglas Hofstadter, author of "Godel, Escher, Bach: An Eternal Golden Braid" wrote a column for Scientific American. In one of his columns he proposed a game. The object of the game was to get a group of people trying to guess what symbols the "dealer" had secretly inserted into, say, an 8x8 grid. The dealer could use any symbols he wanted, he could even make up 64 symbols at random if he wanted. The real genius was in the scoring.

You see, the dealer's maximum score came when he stumped all of the players *except one*. The minimum score came when he either stumped *all* the other players, or *none* of the other players. In other words, if the game was too easy, the dealer got a low score; if too hard, an equally low score. But if the pattern was difficult to see, but not impossible, then he/she got a high score.

I don't remember how the scoring worked - maybe somebody here can help - but it wouldn't be difficult to develop a similar scoring system.

For example - the players would start knowing nothing about the sequence - it could be natural numbers, integers, rationals, reals, complex, Gamma functions, whatever the dealer wanted. They start with a certain number of points - say, 50. Each time they ask for a hint, they lose a point. If they guess the sequence correctly, they get to keep their remaining points. If they guess incorrectly, they lose all their points. Their score for the round will be the number of points they have left, minus the average of the other players' points. (It seems obvious that some players will end up with a negative score, though I haven't checked.)

Then try basing the dealer's score on the sum of squares of the other player's scores. (Not their point counts, but their scores for the round, as described above.) If all of the other players scored the same number of points, then the dealer gets a low score - regardless of whether the problem was too easy or too hard.

That's one suggestion - it probably needs some serious tweaking to get it to work, and you might even want to write a short computer program to calculate the scores for you (to avoid slowing down the game).

But it sure beats the problem of some demented prick coming up with the "sequence" 1, 2, 3, 4, 5, 6, 7, 8, .... and then saying the next number is "(ln(17/pi)^(1/37))" :-)

This should be Martin Gardner’s card game Eleusis not Hofstadter.]
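The scoring scheme the poster describes could be prototyped in a few lines (the details below are my own guesses at a consistent reading of the message): each player’s round score is his remaining points minus the average of the other players’ points, and the dealer’s score is the sum of squares of those round scores, which is zero precisely when all players tie.

```python
# Invented prototype of the sci.math poster's scoring suggestion.

def round_scores(points):
    """points: each player's remaining point count after the round.
    A player's score is his points minus the average of the others'."""
    n = len(points)
    return [p - (sum(points) - p) / (n - 1) for p in points]

def dealer_score(points):
    """Sum of squares of the players' round scores: low when the pattern
    was too easy or too hard (everyone ties), high when outcomes spread."""
    return sum(s * s for s in round_scores(points))

# Everyone ties -> all round scores are zero -> the dealer gets nothing.
assert dealer_score([50, 50, 50]) == 0

# A spread of outcomes rewards the dealer.
assert dealer_score([50, 30, 0]) > 0
```

As the poster says, this would need serious tweaking to work in play, but it captures the requirement that patterns which stump everyone, or no one, score low.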
