As I've discussed elsewhere, I prefer constructive reasoning over use of reductio ad absurdum, a.k.a. argument by contradiction. My discomfort about that mode of reasoning, at its sharpest, centres on doubts about its trustworthiness as a way of arriving at truth, which are intimately related to the fact that double negation isn't an identity (although it may well be a projection; indeed triple negation may be the same as single negation). It should thus be no surprise that, just as our minds are poor at handling double negation (partly because some languages use it as an emphatic negation, rather than an affirmation), the arguments that flow from reductio are harder to follow than their constructive counterparts, where the same result can be shown positively. Admittedly, the reductio argument may play out faster than the constructive one, which does aid cognition, so sometimes, particularly when short, reductio arguments are clearer than their constructive counterparts, at least until you think of a comparably terse one.
Pause to consider a scatter-plot with two dimensions; one is the complexity
of a constructive proof of a result, the other the complexity of proof
when reductio is permitted; each point plotted is then a theorem, marked
at its values for the two parameters of it we're using as dimensions. There may
be simple results that flow so directly constructively that the tersest argument
open to those willing to use reductio doesn't actually exercise it; they
are, after all, allowed to use constructive proofs. Such theorems thus lie on
the diagonal line, in our scatter-plot, along which the two co-ordinates are
equal; and all points on our plot, that don't lie on the diagonal, are on the
same side of it, namely towards the constructive proof is more complex
side of the diagonal. So reparameterise our space by one co-ordinate along the
diagonal (roughly the average of the two originals) and another (roughly the
half difference of the originals) to the more constructive is complex
side of it.
We can, indeed, include each theorem at a diversity of points, one for each pair of proofs, allowing or fobidding reductio, of that result. Since we can chose independently between those, this will actually mark each theorem at a lattice of points, the Cartesian product of two sets of values, the sets of complexities of known proofs, respectively allowing or forbidding reductio, of the given result. Of these two sets, the constructive one is strictly contained in the other; so the least upper bound of the constructive set is necessarily higher than that of the other; so there is a pair of proofs that give our result a point, on the the diagonal or to its constructive-is-complex side, that all other proofs are no less complex than in either set. (I'm here quietly ignoring that our complexity measure might be expressed in terms of our conceptual vocabulary; reductio's is richer, i.e. more complex, so may be penalised by it. This would give a constructive proof a higher complexity score when considered as a potentially-reductio proof; but I simplistically suppose a reparameterisation of the axes would get you back to the picture I describe.) Our whole scatter plot is limited to the region with both original parameters positive; the region in which there are valid points representing a given theorem is the image of translating this positive quadrant to put its zero corner at one point of the lattice; as such, this point is the only interesting one on our plot, so it's the one we want in order to faithfully represent that theorem.
However, we do sometimes think up new proofs of a result, that give it a new
lower bound in one of the two original parameters, so it's worth remembering
that the point we're actually using to represent our theorem, in a practical
realisation of the plot, might not actually be the ideal one that we can imagine
we would have, if the whole space of possible proofs were trawled
systematicslly; any point in the plot might get moved. The true representation
of the theorem is a pair of sets of co-ordinates, one of which subsumes the
other; for some theorems, the constructive sub-set is actually empty, so this
representation doesn't get to put any points on the scatter plot; but, for
theorems which are amenable to constructive proof, the pair of sets of
complexity values gives rise to a lattice in the plot that we can thus
characterise by a bottom left
corner point. We have a finite sample of
the set of possible proofs, among which we've tended to look for simple ones, so
we reasonably hope the best we've come up with give a bottom left corner
tolerably close to where the ideal point for the given theorem belongs. At the
very least, it seems reasonable to look at the scatter plot as a whole –
even allowing that many of the points on it may be a bit further from the
positive-co-ordinate axes than they should be – and suppose it is a
somewhat fair guide to the shape of the plot that the ideal point for each
theorem would give us; its points may each be shifted a little in
some constructive ≥ reductio > zero complexity
direction,
but we can assume this only slightly changes the shape of the cloud in the
scatter plot.
The point we're using to represent each theorem's lattice is, in any case,
the one with lowest value for the average of the two original complexity
co-ordinates, within the lattice; selecting this point constrains the
half-difference co-ordinate, even though the lattice has other points (with
higher average co-ordinate) at half-difference values from zero up to at least
the value at this point (and typically well beyond that). The set of points to
which we can possibly ever revise our point for a given theorem lies in a
rectangle whose top right corner is the point we arere using to represent it and
whose bottom left corner is thea ideal point by which we might wish we could
represent it. The hope is that, for most points, this rectangle is small, so
the disturbance to the observed distribution, that we would get by correcting to
use ideal representations for all theorems, should not be too major. One
theorem is more constructive-is-complex
than another if the former's
simplest proof allowing reductio beats its simplest constructive proof by
a wider margin than arises for the latter.
For some simple results, reductio provides a very quick way to
convince yourself of a result, that takes significantly more labour to prove
without it. Some results require a more complex argument even
when reductio is used; we not uncommonly need more complex constructive
arguments to prove these. Sometimes, the complexity of uses of reductio
can unravel to expose a constructive proof with significantly
reduced constructive-is-complex
score; for example, when one proves a
positive directly by a simple transformation of the argument that
the reductio argument used to prove some double-negative. Otherwise,
among theorems with a constructive proof (which some results provable
with reductio indeed won't have; they're the ones constructivists
forswear and don't appear in our scatter-plot), the saving in complexity
that reductio can give (i.e. the half-difference parameter) don't grow as
fast as the average; i.e. the scatter-plot lies mostly close to its diagonal
bound.
Now, the high cognitive cost of double negatives and, kindred to it, of understanding reductio proofs, though possibly ignored by our complexity measure, grows with the extent to which reductio is used in the proof; so a terser proof using reductio in a complex way may carry a higher cognitive load than a carefully-chosen longer constructive argument. So if we were to change from some information-theoretic complexity measure on our proofs (as I've tacitly assumed above) to some measure of cognitive load, I am fairly confident the scatter-plot moves closer to the diagonal. Which is all to say: constructive arguments can be easier to understand and, even when using reductio, limiting use of it helps treduce cognitive load.
I happened on a set of logic puzzles in which I see the given answers using reductio and can see ways to do without it. So let's compare, and see how much reductio saves us in brevity (a tolerable surrogate for simplicity) and think about how its use affects cognitive load. First, though, I encourage you to follow that link and enjoy some better writing than mine; I'm going to select my examples, illustrating how to avoid using reductio, rather than dealing with every example.
The puzzles involve a population partitioned into two sets;
for each set, its members make true statements to one another but false
statements to members of the other set. Calling the latter lying is in some
sense meaningless, because they all have perfect knowledge of who belongs in
which set, so what a member of the other set says to you, you can simply invert
to get a true statement; and the speaker thereby conveys a truth to the listener
(albeit by speaking its opposite). This isn't really lying
(the speaker
can't plausibly expect to deceive the listener); furthermore, neither party
gains any advantage over the other by it (precisely because they never succeed
in deceiving one another). True liars tell the truth often enough that you
believe them even when they don't. However, the formalism here is one in which
all parties have perfect knowledge of the subject matter of what any of them
say, so none of their utterances serve any purpose among themselves – the
only effect is upon the outsider, looking in, that we are invited to be as we
study the puzzles. The description in terms of opposing groups lying to one
another is merely a motivation for considering a formal system of rules and
exploring its consequences.
The first two explore limits on the system; an utterance from which we can infer nothing; and a utterance that can't arise (i.e. there is no way, within the formal system, for the situation described to arise). An typical reductio argument leads, from one branch of a choice, into such an impossible situations so as to eliminate that branch of the choice; by a process of elimination, this is used to argue for believing a surviving branch, from which to arrive constructively at some conclusion. The next two examples show (constructively) how one can infer a simple fact from a simple utterance (and the inferred fact, though not the statement uttered, is derived from it in a straightforward way). Each does it by showing that both branches of a choice lead to the same conclusion, so the conclusion must be true. Example 4 shows that if one claims to be of a given party, the person to whom the claim is made is in fact of that party.
We now come to example 5: one says to
another at least one of us is a democrat
, from which we can infer that
they're both democrats. The given proof partitions the possible affiliations of
the two parties into both republican, one of each or both democrat; it shows how
each of the first two choices leads to the given utterance not being spoken, so
discards them and leaves us with the third; this is a classic reductio
proof. So instead look at a different way of reasoning from a simplification of
the same fork: either the two are of one party, in which case the utterance is
true so one of them is a democrat, or the two are of different parties, in which
case (without looking at all at the utterance) one of them is a democrat. This
is a constructive step (both sides of a branch lead to a single conclusion); and
its conclusion is the utterance, which we thus know to be true, so we infer that
the two are of the same party, with one a democrat, hence both are democrats.
The fact that this second step combines with one branch of the first step to
produce a contradiction is ignored in this argument; we constructively infer an
intermediate fact, from which we then infer our conclusion. It tells us that
one of the branches we proved, in considering it, doesn't actually arise; but
we don't need that fact to reason from the branch to the conclusion. So now
compare the two arguments:
It takes a while (at least for one raised on reductio) to get used to ignoring that the conclusion makes part of the reasoning by which we reached it appear redundant; but it does so after the fact and needed that reasoning to reach the fact that makes that reasoning appear redundant. Exploring the fact that some false statement leads to a conclusion doesn't harm the truth of that conclusion; what matters is that some true fact does lead to the conclusion; and if, until we have the conclusion, we don't know some statement is false, we may need to reason from that statement in order to establish our conclusion. Anyway: constructive 50 words, reductio 31, saving of 19, which is 38%. Form your own opinions about the cognitive load.
Example 6 has one telling another At most one of us is a Democrat
;
this is equivalent to At least one of us is a Republican
, so we can apply
the preceding result (with the two names swapped) to conclude that both are in
fact Republicans. So much for simple two-party puzzles.
In example 7 A tells B C is from my
party
, B tells C A is not from my party
and C tells A I am a
Democrat
. Example 4 has shown that the last tells us that A is a Democrat.
The source uses one reductio step and one constructive step to conclude
that the other two are Republicans; contrast that with
In example 8, A tells B C is a
Democrat
, C tells A You, B and I are not all Democrats
. The given
reasoning (with a typo in its first inference, but a correct conclusion) uses
two reductio steps: compare to
In example 9, each of A through X tells the
next (in alphabetic order) Z is from my party
, ending with X telling Y
this; Y tells Z A is not from my party
and Z tells A I am a
Democrat
. Now, Z is from my party
is equivalent to I am from Z's
party
so will only be spoken to someone of Z's party (by example
4's reasoning); thus B through Z are all of the same party, Y tells the truth to
Z, so A is of the other party. Z's claim to A tells us A is a Democrat, so all
the others are Republicans (and, indeed, tell the truth to each other, while Z
lies to A, as A lied to B).
In example ten the parties are labelled by
numbers instead of letters: each labelled by an odd number less than some
natural N says to the next-numbered party (with an even number) N is from my
party
; while every party labelled by an even number less than N says to the
next-numbered party (with an odd number > 1) N is not from my
party
; and N tells 1 Between you and the person labelled 2, at least one
of you is a Democrat
. Now the first two utterances are equivalent to I'm
of N's party
and I'm not of N's party
and (by example 4's logic) we
conclude that the thing it claims about I
is in fact true of whoever it's
addressed to. So the even numbers up to (and possibly including) N are of N's
party and the odd numbers greater than 1 (and also possibly including N) are not
of N's party; since N necessarily is of N's party, it's actually not one of the
latter, so N is even. N's statement to 1 is now equivalent to at least one
of you and I is a Democrat
, which is example 5; so the parties with odd
numbers greater than 1 are Republican, while 1 and the evens, including N, are
Democrats.
Now, I'm not sure how solidly constructive my arguments are, but they do at
least avoid overt use of reductio. They replace it with reasoning from
both branches of an or
to a common result, from which we then reach our
conclusions. The validity of this depends on the or
being true; that is,
starting from an A or B
that's true. The habits of reductio take
every statement A to nave a negation, ~A, for which A or ~A
is presumed
true (this assumes A is decidable) and A and ~A
is presumed
false (i.e. A is not a witness to our system being inconsistent).
Thus, with those habits, one reasons from either of these assumed statements;
but one may also reason from some other or
statements given by one's
system, for example every islander is either a Democrat or a Republican
;
and we may need to supplement this with some other statements given by the
system, such as no islander is both a Republican and a Democrat
; albeit
we might restate these two as the islanders are partitioned into two disjoint
sets, one of Republicans, the other of Democrats.
Either way, as long as
the system gives us an or
statement, anything we can infer from both
branches of it must be true.
… all of which lead to me thinking about how to describe logic in
terms of relations; specifically, indeed, the relation implies
among
statements in a system of formal logic. I eventually moved that discussion
to a page of its own, since it's not
so much a thought of the moment as an idea I've been mulling
for some time..