Constructive reasoning

As I've discussed elsewhere, I prefer constructive reasoning over use of reductio ad absurdum, a.k.a. argument by contradiction. My discomfort about that mode of reasoning, at its sharpest, centres on doubts about its trustworthiness as a way of arriving at truth, which are intimately related to the fact that double negation isn't an identity (although it may well be a projection; indeed triple negation may be the same as single negation). It should thus be no surprise that, just as our minds are poor at handling double negation (partly because some languages use it as an emphatic negation, rather than an affirmation), the arguments that flow from reductio are harder to follow than their constructive counterparts, where the same result can be shown positively. Admittedly, the reductio argument may play out faster than the constructive one, which does aid cognition, so sometimes, particularly when short, reductio arguments are clearer than their constructive counterparts, at least until you think of a comparably terse one.

Comparing complexities or comprehensibilities

Pause to consider a scatter-plot with two dimensions; one is the complexity of a constructive proof of a result, the other the complexity of proof when reductio is permitted; each point plotted is then a theorem, marked at its values for the two parameters of it we're using as dimensions. There may be simple results that flow so directly constructively that the tersest argument open to those willing to use reductio doesn't actually exercise it; they are, after all, allowed to use constructive proofs. Such theorems thus lie on the diagonal line, in our scatter-plot, along which the two co-ordinates are equal; and all points on our plot, that don't lie on the diagonal, are on the same side of it, namely towards the constructive proof is more complex side of the diagonal. So reparameterise our space by one co-ordinate along the diagonal (roughly the average of the two originals) and another (roughly the half difference of the originals) to the more constructive is complex side of it.

We can, indeed, include each theorem at a diversity of points, one for each pair of proofs, allowing or fobidding reductio, of that result. Since we can chose independently between those, this will actually mark each theorem at a lattice of points, the Cartesian product of two sets of values, the sets of complexities of known proofs, respectively allowing or forbidding reductio, of the given result. Of these two sets, the constructive one is strictly contained in the other; so the least upper bound of the constructive set is necessarily higher than that of the other; so there is a pair of proofs that give our result a point, on the the diagonal or to its constructive-is-complex side, that all other proofs are no less complex than in either set. (I'm here quietly ignoring that our complexity measure might be expressed in terms of our conceptual vocabulary; reductio's is richer, i.e. more complex, so may be penalised by it. This would give a constructive proof a higher complexity score when considered as a potentially-reductio proof; but I simplistically suppose a reparameterisation of the axes would get you back to the picture I describe.) Our whole scatter plot is limited to the region with both original parameters positive; the region in which there are valid points representing a given theorem is the image of translating this positive quadrant to put its zero corner at one point of the lattice; as such, this point is the only interesting one on our plot, so it's the one we want in order to faithfully represent that theorem.

However, we do sometimes think up new proofs of a result, that give it a new lower bound in one of the two original parameters, so it's worth remembering that the point we're actually using to represent our theorem, in a practical realisation of the plot, might not actually be the ideal one that we can imagine we would have, if the whole space of possible proofs were trawled systematicslly; any point in the plot might get moved. The true representation of the theorem is a pair of sets of co-ordinates, one of which subsumes the other; for some theorems, the constructive sub-set is actually empty, so this representation doesn't get to put any points on the scatter plot; but, for theorems which are amenable to constructive proof, the pair of sets of complexity values gives rise to a lattice in the plot that we can thus characterise by a bottom left corner point. We have a finite sample of the set of possible proofs, among which we've tended to look for simple ones, so we reasonably hope the best we've come up with give a bottom left corner tolerably close to where the ideal point for the given theorem belongs. At the very least, it seems reasonable to look at the scatter plot as a whole – even allowing that many of the points on it may be a bit further from the positive-co-ordinate axes than they should be – and suppose it is a somewhat fair guide to the shape of the plot that the ideal point for each theorem would give us; its points may each be shifted a little in some constructive ≥ reductio > zero complexity direction, but we can assume this only slightly changes the shape of the cloud in the scatter plot.

The point we're using to represent each theorem's lattice is, in any case, the one with lowest value for the average of the two original complexity co-ordinates, within the lattice; selecting this point constrains the half-difference co-ordinate, even though the lattice has other points (with higher average co-ordinate) at half-difference values from zero up to at least the value at this point (and typically well beyond that). The set of points to which we can possibly ever revise our point for a given theorem lies in a rectangle whose top right corner is the point we arere using to represent it and whose bottom left corner is thea ideal point by which we might wish we could represent it. The hope is that, for most points, this rectangle is small, so the disturbance to the observed distribution, that we would get by correcting to use ideal representations for all theorems, should not be too major. One theorem is more constructive-is-complex than another if the former's simplest proof allowing reductio beats its simplest constructive proof by a wider margin than arises for the latter.

For some simple results, reductio provides a very quick way to convince yourself of a result, that takes significantly more labour to prove without it. Some results require a more complex argument even when reductio is used; we not uncommonly need more complex constructive arguments to prove these. Sometimes, the complexity of uses of reductio can unravel to expose a constructive proof with significantly reduced constructive-is-complex score; for example, when one proves a positive directly by a simple transformation of the argument that the reductio argument used to prove some double-negative. Otherwise, among theorems with a constructive proof (which some results provable with reductio indeed won't have; they're the ones constructivists forswear and don't appear in our scatter-plot), the saving in complexity that reductio can give (i.e. the half-difference parameter) don't grow as fast as the average; i.e. the scatter-plot lies mostly close to its diagonal bound.

Now, the high cognitive cost of double negatives and, kindred to it, of understanding reductio proofs, though possibly ignored by our complexity measure, grows with the extent to which reductio is used in the proof; so a terser proof using reductio in a complex way may carry a higher cognitive load than a carefully-chosen longer constructive argument. So if we were to change from some information-theoretic complexity measure on our proofs (as I've tacitly assumed above) to some measure of cognitive load, I am fairly confident the scatter-plot moves closer to the diagonal. Which is all to say: constructive arguments can be easier to understand and, even when using reductio, limiting use of it helps treduce cognitive load.

Illustration

I happened on a set of logic puzzles in which I see the given answers using reductio and can see ways to do without it. So let's compare, and see how much reductio saves us in brevity (a tolerable surrogate for simplicity) and think about how its use affects cognitive load. First, though, I encourage you to follow that link and enjoy some better writing than mine; I'm going to select my examples, illustrating how to avoid using reductio, rather than dealing with every example.

The puzzles involve a population partitioned into two sets; for each set, its members make true statements to one another but false statements to members of the other set. Calling the latter lying is in some sense meaningless, because they all have perfect knowledge of who belongs in which set, so what a member of the other set says to you, you can simply invert to get a true statement; and the speaker thereby conveys a truth to the listener (albeit by speaking its opposite). This isn't really lying (the speaker can't plausibly expect to deceive the listener); furthermore, neither party gains any advantage over the other by it (precisely because they never succeed in deceiving one another). True liars tell the truth often enough that you believe them even when they don't. However, the formalism here is one in which all parties have perfect knowledge of the subject matter of what any of them say, so none of their utterances serve any purpose among themselves – the only effect is upon the outsider, looking in, that we are invited to be as we study the puzzles. The description in terms of opposing groups lying to one another is merely a motivation for considering a formal system of rules and exploring its consequences.

The first two explore limits on the system; an utterance from which we can infer nothing; and a utterance that can't arise (i.e. there is no way, within the formal system, for the situation described to arise). An typical reductio argument leads, from one branch of a choice, into such an impossible situations so as to eliminate that branch of the choice; by a process of elimination, this is used to argue for believing a surviving branch, from which to arrive constructively at some conclusion. The next two examples show (constructively) how one can infer a simple fact from a simple utterance (and the inferred fact, though not the statement uttered, is derived from it in a straightforward way). Each does it by showing that both branches of a choice lead to the same conclusion, so the conclusion must be true. Example 4 shows that if one claims to be of a given party, the person to whom the claim is made is in fact of that party.

We now come to example 5: one says to another at least one of us is a democrat, from which we can infer that they're both democrats. The given proof partitions the possible affiliations of the two parties into both republican, one of each or both democrat; it shows how each of the first two choices leads to the given utterance not being spoken, so discards them and leaves us with the third; this is a classic reductio proof. So instead look at a different way of reasoning from a simplification of the same fork: either the two are of one party, in which case the utterance is true so one of them is a democrat, or the two are of different parties, in which case (without looking at all at the utterance) one of them is a democrat. This is a constructive step (both sides of a branch lead to a single conclusion); and its conclusion is the utterance, which we thus know to be true, so we infer that the two are of the same party, with one a democrat, hence both are democrats. The fact that this second step combines with one branch of the first step to produce a contradiction is ignored in this argument; we constructively infer an intermediate fact, from which we then infer our conclusion. It tells us that one of the branches we proved, in considering it, doesn't actually arise; but we don't need that fact to reason from the branch to the conclusion. So now compare the two arguments:

It takes a while (at least for one raised on reductio) to get used to ignoring that the conclusion makes part of the reasoning by which we reached it appear redundant; but it does so after the fact and needed that reasoning to reach the fact that makes that reasoning appear redundant. Exploring the fact that some false statement leads to a conclusion doesn't harm the truth of that conclusion; what matters is that some true fact does lead to the conclusion; and if, until we have the conclusion, we don't know some statement is false, we may need to reason from that statement in order to establish our conclusion. Anyway: constructive 50 words, reductio 31, saving of 19, which is 38%. Form your own opinions about the cognitive load.

Example 6 has one telling another At most one of us is a Democrat; this is equivalent to At least one of us is a Republican, so we can apply the preceding result (with the two names swapped) to conclude that both are in fact Republicans. So much for simple two-party puzzles.

In example 7 A tells B C is from my party, B tells C A is not from my party and C tells A I am a Democrat. Example 4 has shown that the last tells us that A is a Democrat. The source uses one reductio step and one constructive step to conclude that the other two are Republicans; contrast that with

In example 8, A tells B C is a Democrat, C tells A You, B and I are not all Democrats. The given reasoning (with a typo in its first inference, but a correct conclusion) uses two reductio steps: compare to

In example 9, each of A through X tells the next (in alphabetic order) Z is from my party, ending with X telling Y this; Y tells Z A is not from my party and Z tells A I am a Democrat. Now, Z is from my party is equivalent to I am from Z's party so will only be spoken to someone of Z's party (by example 4's reasoning); thus B through Z are all of the same party, Y tells the truth to Z, so A is of the other party. Z's claim to A tells us A is a Democrat, so all the others are Republicans (and, indeed, tell the truth to each other, while Z lies to A, as A lied to B).

In example ten the parties are labelled by numbers instead of letters: each labelled by an odd number less than some natural N says to the next-numbered party (with an even number) N is from my party; while every party labelled by an even number less than N says to the next-numbered party (with an odd number > 1) N is not from my party; and N tells 1 Between you and the person labelled 2, at least one of you is a Democrat. Now the first two utterances are equivalent to I'm of N's party and I'm not of N's party and (by example 4's logic) we conclude that the thing it claims about I is in fact true of whoever it's addressed to. So the even numbers up to (and possibly including) N are of N's party and the odd numbers greater than 1 (and also possibly including N) are not of N's party; since N necessarily is of N's party, it's actually not one of the latter, so N is even. N's statement to 1 is now equivalent to at least one of you and I is a Democrat, which is example 5; so the parties with odd numbers greater than 1 are Republican, while 1 and the evens, including N, are Democrats.

Now, I'm not sure how solidly constructive my arguments are, but they do at least avoid overt use of reductio. They replace it with reasoning from both branches of an or to a common result, from which we then reach our conclusions. The validity of this depends on the or being true; that is, starting from an A or B that's true. The habits of reductio take every statement A to nave a negation, ~A, for which A or ~A is presumed true (this assumes A is decidable) and A and ~A is presumed false (i.e. A is not a witness to our system being inconsistent). Thus, with those habits, one reasons from either of these assumed statements; but one may also reason from some other or statements given by one's system, for example every islander is either a Democrat or a Republican; and we may need to supplement this with some other statements given by the system, such as no islander is both a Republican and a Democrat; albeit we might restate these two as the islanders are partitioned into two disjoint sets, one of Republicans, the other of Democrats. Either way, as long as the system gives us an or statement, anything we can infer from both branches of it must be true.

… all of which lead to me thinking about how to describe logic in terms of relations; specifically, indeed, the relation implies among statements in a system of formal logic. I eventually moved that discussion to a page of its own, since it's not so much a thought of the moment as an idea I've been mulling for some time..