Proving Pythagoras's Theorem

There are many known proofs of Pythagoras's theorem. My favourite proof uses tessellation; but I'm biassed, as I dreamed it up for myself (and haven't yet seen any other source for it, so I entertain the delusion that it's original). I include a few more, if only to show that there are many ways to get at the same truth.

I start with my favourite proof: it only requires one diagram and, thanks to its tessellation property, it would make a deeply satisfying floor-tile. I find the next two proofs less satisfying, though they do relate tidily to algebraic expressions (here given using h as the length of the hypotenuse, the other two sides being of lengths a and c) and only depend on moving pieces around. I then move on to two more sophisticated proofs: one relies on reasoning about equality of ratios of sides of similar triangles; the other, Euclid's, shears some triangles. All are proofs by suitably general illustration: I show each construction for a particular case (here a 3:4:5 triangle) but expect its feasibility for any other right-angle triangle to be intuitively obvious from the way the construction works for the chosen example. There are many other proofs, and the result is a special case of Ptolemy's theorem, so each proof of that (without assuming Pythagoras) is also a proof of Pythagoras.

Tessellation Proof

Since this is my favourite proof, it's the one I use on my main page about this theorem (and its consequences). Still, here's my older statement of that proof, using as background a picture that shows the essence of the proof; if you can't see the picture, you can read the text; otherwise, the picture may make it hard to read the text, but then the picture should suffice. (If you can't see the full height of a blue square, narrow your window to make the text flow over more lines.)

(Black lines.) Place the two squares of the orthogonal sides side by side, with a vertex in common. The non-coincident edges at this vertex are then collinear. Translate each square along this edge towards the other through the length of its own side, then along the coincident edge through the length of the other's side. Applying both of these translations, and their reverses, to both squares repeatedly suffices to make a tessellation by which the two squares cover the entire plane.

(Blue lines.) Draw an edge along either of the two translations given above. By construction, this is the hypotenuse of a copy of our original triangle. Complete it to a square and use this as the start-point for the obvious tessellation of the plane by a square.

It is immediately apparent that these two tessellations have the same density of repeats over the plane, which suffices to prove the result. Alternatively, examine the pieces of each cut out by a single repeat of the other. It will quickly be seen that these pieces can be reassembled into the complete repeat figure thus cut. In case you wonder whether this depends on the particular right-angle triangle used, this animation shows the variation among the full range of relative lengths of the perpendicular sides.
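The density claim can be checked numerically. The following is a minimal Python sketch, not part of the original argument: it assumes coordinates in which the perpendicular sides (called a and c here) lie along the axes, so that the two translations are (a, c) and (−c, a). The names repeat_area, step_one and step_two are purely illustrative. Since coordinate lengths already build in the theorem, this illustrates the construction rather than proving anything.

```python
def repeat_area(v1, v2):
    """Area of the parallelogram spanned by two translation vectors."""
    return abs(v1[0] * v2[1] - v1[1] * v2[0])

a, c = 4, 3                           # perpendicular sides of the example triangle
step_one, step_two = (a, c), (-c, a)  # assumed coordinates for the two translations

two_squares_area = a * a + c * c      # one repeat of the black tessellation
lattice_area = repeat_area(step_one, step_two)
# The blue square has step_one as a side, and step_two is step_one turned
# through a right angle, so one blue square is also one repeat of the lattice.
blue_square_area = step_one[0] ** 2 + step_one[1] ** 2

print(two_squares_area, lattice_area, blue_square_area)
```

Both tessellations repeat under the same pair of translations, which is why one repeat of each must have the same area.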

Small rearrangement Proof

Construct a square on the hypotenuse with the triangle inside the square. Make three further copies of the triangle, each rotated from the original about the square's centre so as to make its hypotenuse be a side of the square. These do not over-lap, though they abut one another. They enclose a (blue) central square whose side is the difference between the two perpendicular sides of the triangle.

How to move two triangles to convert hypotenuse square to two side-squares

Cut out two adjacent copies (magenta and green) of the triangle: translate each to the other side of the square, attaching its hypotenuse to that of the copy originally opposite to it. The resulting figure has one concave corner where the central square on the difference meets the longer perpendicular edge of one of the (blue) un-moved copies of the triangle: continue the edge of the central square through this corner, extending it until it meets the far edge of the figure (dashed line). The resulting line cuts the rearranged figure into two squares, whose sides are the two perpendicular sides of the triangle. With the sides named as mentioned above, we can write this algebraically as h.h = 2.a.c +(a−c).(a−c) = a.a +c.c
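The algebra can be spot-checked for a few triangles. This is a small Python sketch, with rearranged_area a hypothetical helper name; it adds up the four triangles and the central square, then compares the total against the two side-squares.

```python
from fractions import Fraction

def rearranged_area(a, c):
    """Area of four triangles (2.a.c in total) plus the square on the difference."""
    four_triangles = 2 * a * c
    central_square = (a - c) * (a - c)
    return four_triangles + central_square

# Pairs of perpendicular sides to try; exact arithmetic avoids rounding.
samples = [(4, 3), (12, 5), (7, 7), (Fraction(5, 2), 6)]
checks = [rearranged_area(a, c) == a * a + c * c for a, c in samples]
print(checks)
```

The (7, 7) case is the isosceles triangle, where the central square shrinks to nothing.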

Details: The two non-right angles of the triangle sum to a right angle. Thus, at each corner of the hypotenuse-square, two copies of the triangle abut each other, rather than over-lapping (or leaving a gap), along their respective non-hypotenuse edges into the corner; for one its short edge, for the other its long edge. The latter over-runs the former by the difference to form one side of the small square enclosed by the four triangles.

Notice, for the two pairs of moved triangles, that the translation by which we move one triangle to the other is along the hypotenuses of the other pair of triangles. This rearrangement proof is thus equivalent to my preferred tessellation proof.

This proof dates back at least as far as 500–200 BCE in the Zhoubi Suanjing, a Chinese book of astronomical and mathematical problems and solutions.

It's also notable that (ignoring the square on the hypotenuse) the intermediate step, 2.a.c + (a−c).(a−c) = a.a +c.c, is the heart of the proof of Apollonius's theorem: given lengths a and c, the squares of their sum and difference add up to twice the sum of their squares, since 2.a.c +a.a +c.c is (a+c).(a+c). The blue rectangles plus the black square (on the difference) form the image, under a half-turn about the centre of the outer square (on the sum), of the combination of the two squares on the lengths whose sum and difference are used. Thus the two copies of a.a +c.c, one half-turned from the other, cover that a+c square while overlapping in (exactly) the a−c square.
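The covering argument can be tested on small whole numbers. The sketch below is only an illustration, and its helper names (covered_area, sum_and_difference_squares) are made up for the purpose.

```python
def covered_area(a, c):
    """Area covered by two copies of a.a + c.c that overlap in the (a - c) square."""
    two_copies = 2 * (a * a + c * c)
    overlap = (a - c) * (a - c)
    return two_copies - overlap

def sum_and_difference_squares(a, c):
    """Square on the sum plus square on the difference."""
    return (a + c) ** 2 + (a - c) ** 2

pairs = [(a, c) for a in range(1, 8) for c in range(1, a + 1)]
# The two copies exactly cover the square on the sum ...
cover_ok = all(covered_area(a, c) == (a + c) ** 2 for a, c in pairs)
# ... which is the same as saying the two squares total twice a.a + c.c.
law_ok = all(sum_and_difference_squares(a, c) == 2 * (a * a + c * c) for a, c in pairs)
print(cover_ok, law_ok)
```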

Traditional rearrangement Proof

Construct a square on the hypotenuse with the triangle outside the square. Make three further copies of the triangle, each rotated from the original about the square's centre so as to make its hypotenuse be a side of the square. The square is thus surrounded by copies of the triangle to form a big square whose side is equal to the sum of the perpendicular sides of the triangle.

Choose two adjacent copies of the triangle: they have a vertex in common, at one end of the hypotenuse of each. Rotate each through a right angle, about the other end of its hypotenuse, in the sense which moves it into the original square on the hypotenuse. Their final positions do not overlap, though they abut along an edge: extend that edge back towards the edge of the big square on which the moved triangles' shared vertex originally lay. The resulting line cuts the portion of the big square not now covered by the triangles into two squares, whose sides are the perpendicular sides of the triangle. The close similarity between this proof and the previous is no accident; indeed, this proof embeds the previous in the diagram that proves Apollonius's theorem.

The first of this pair of diagrams is, if I remember correctly, the figure which has been found on a shard of pottery from one of the ancient civilisations of India (far predating the earliest possible date for a Greek Pythagoras), with the word for Aha! in Sanskrit written under it. The ancient Indians clearly took the second diagram for granted. This version is also considered to be the one Pythagoras used; which fits with various other aspects of his philosophy (e.g. vegetarianism, belief in reincarnation) sounding suspiciously like imports from India. (Thanks to Piers Bursill-Hall for teaching me about, and to Marc Pienaar for reminding me of, this proof ;^)

Proof by similar triangles

Drop a perpendicular from the right-angle vertex onto the hypotenuse. This splits the hypotenuse in two. Let the lengths of the other two sides be a and b; let the part of the hypotenuse on a's side of the cut have length e and that on b's side length d, with the whole hypotenuse having length h = d + e. On a's side of the perpendicular, we have a triangle with a right angle where the perpendicular meets the hypotenuse and sharing one angle with the main triangle. The other angle, this triangle's part of the original right angle between a and b, must then be equal to the angle at the other end of the hypotenuse, opposite a. This a-side triangle is similar to the main triangle, albeit via scaling and a reflection. Its hypotenuse has length a and the e-side is opposite the angle just shown equal to the one opposite a; so the e-side of the smaller triangle corresponds to the a-side of the original. Ratios of corresponding sides of similar figures are equal, so this tells us a/h (in the main triangle) is equal to e/a (in the small one), from which we can infer a.a = e.h. A similar analysis on the other side of the perpendicular gives us b.b = d.h; adding, we get a.a +b.b = e.h +d.h = (e +d).h = h.h as required.
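For the 3:4:5 example the two parts of the hypotenuse can be worked out exactly. This is a minimal Python sketch, with hypotenuse_parts a hypothetical helper; it takes h as given and applies the two ratio equalities a/h = e/a and b/h = d/b.

```python
from fractions import Fraction

def hypotenuse_parts(a, b, h):
    """Return (e, d): the parts of the hypotenuse on a's side and on b's side."""
    e = Fraction(a) * a / h   # from a/h = e/a
    d = Fraction(b) * b / h   # from b/h = d/b
    return e, d

a, b, h = 3, 4, 5
e, d = hypotenuse_parts(a, b, h)
print(e, d, e + d)
```

Here e and d come out as 9/5 and 16/5, which add up to the whole hypotenuse, 5, as the argument requires.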

Notice that the intermediate results, a.a = e.h and b.b = d.h, show that – as in Euclid's proof, below – extending the perpendicular through the hypotenuse to cut its square into two rectangles, the rectangle on each side of this perpendicular has the same area as the square on the associated other side.

An alternate way of presenting this proof is: take a copy of your right-angle triangle, rotated through a right angle about its right-angle corner – so that one perpendicular side of the copy overlaps with the other perpendicular side of the original – and scale it to make the overlapping edges the same length. This places the far end of the now-coincident edges together; each had one of the non-right angle corners at that end. The angle between hypotenuses of the original and copy is the sum of these two non-right angles; and that sum is itself a right angle, so the hypotenuses meet in a right angle. Back at the original and copy's right-angle corners, the other perpendicular edges of the two triangles come out of the shared right-angle corner perpendicular to the coincident edge in opposite directions, so form a single straight line when considered together. These lines thus combine to form the hypotenuse of a right-angle triangle, whose other sides (the hypotenuses of the original and copy) meet it in angles that are corners of the original and copy, so that this large right-angle triangle is in fact similar to the two of which we made it up.

Now, if the side of the original that the copy scaled to match has length a times some given unit, with the other perpendicular side of length b times that unit and the hypotenuse of length h times the unit, we can pick a unit u that's the original unit divided by b. The coincident edge's length is now a.b.u; the scaling we applied to the copy is a/b so the original and copy have hypotenuses b.h.u and a.h.u; while their sides perpendicular to the coincident one are b.b.u and a.a.u, respectively; these last two combine to make a hypotenuse (a.a +b.b).u for a triangle whose other two sides are b.h.u and a.h.u; because this is similar to the original triangles, its hypotenuse must equally be h.h.u and we infer a.a +b.b = h.h.
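The bookkeeping in units of u can be laid out for the 3:4:5 case. The sketch below is an illustration only; the function and dictionary keys are names invented here.

```python
from fractions import Fraction

def scaled_lengths(a, b, h):
    """Lengths, as multiples of the unit u, once the copy is scaled by a/b."""
    return {
        "coincident": a * b,
        "original_hypotenuse": b * h,
        "copy_hypotenuse": a * h,
        "big_hypotenuse": a * a + b * b,
    }

a, b, h = 3, 4, 5
lengths = scaled_lengths(a, b, h)
# The big triangle is similar to the original (sides b, a, h), so each of its
# sides should be the same multiple of the corresponding original side.
factors = (
    Fraction(lengths["original_hypotenuse"], b),
    Fraction(lengths["copy_hypotenuse"], a),
    Fraction(lengths["big_hypotenuse"], h),
)
print(lengths, factors)
```

All three factors equal h exactly when a.a +b.b = h.h, which is the inference drawn above.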

Notice that this also shows that the areas of similar triangles on the perpendicular edges of a right triangle add up to the area of an again similar triangle on the hypotenuse: indeed, although we normally state Pythagoras's theorem in terms of squares on the three sides, it's equally true of the areas of any figure, of which similar copies are attached to the sides of the right triangle (provided the attachment is always done at a corresponding side of the similar figures, of course). Note also that we can half-turn the big triangle about its hypotenuse's centre to form a rectangle out of three copies of a right-angle triangle, scaled in proportion to the three sides of the triangles.

Euclid's proof

When I read Neal Stephenson's novel Anathem, I instantly recognised that the diagram shown here was some kind of a proof of Pythagoras's theorem. I spent some time trying to work out how it proves that result, without success. I was able to show that the crossing of three lines near the middle of the triangle is no co-incidence (given that the line, of those three, out of the right angle of the triangle is perpendicular to the hypotenuse) and worked out the co-ordinates of all vertices, but I failed to see how it actually works as a proof.

Naturally, I looked on Wikipedia and, sure enough, found it there. The critical detail I'd failed to notice is that it depicts some triangles that can be strategically sheared to obtain triangles that do the important piece of proof. So first I think I should show you something important about shearing and areas.

Shears preserve area

A shear is a linear transformation of a linear space that can be specified in terms of a vector v in that space and a covector w in its dual, for which w·v = 0; the shear maps (: u +v.w·u ←u :), so its matrix is 1 +v×w. What this looks like in practice is that there's some subspace (w's kernel) that doesn't move and everything else moves, in proportion to how far it is from that subspace, in the direction v on one side and −v on the other side of the subspace. As w·v = 0, v is in w's kernel, so all movement is parallel to the unchanged subspace. In two dimensions, that means there's a line that stays still and everything else moves parallel to that line, one way on one side, the opposite way on the other, by amounts proportional to distance from the line. Here's what one does to simple rectangles, with v horizontal and w zero on the red line:

(Black outlines show the images of solid regions; dashed diagonals are the images of solid diagonals.) Now, because the shear is linear and non-singular (it doesn't change the value of w, as w(u +v.w·u) = w(u) because w(v) is zero, so you can invert it by replacing v with −v), it maps straight lines to straight lines and parallel lines to parallel lines; as the picture shows, the vertical edges all get slanted to the same extent, so end up parallel. It thus maps figures that are scaled translates of one another to figures that are again scaled translates of one another (though it does not, in general, preserve similarity between figures that are turned relative to one another).
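In two dimensions the shear can be written out directly. This is a minimal Python sketch under the assumption that both the vector v and the covector w are represented by pairs of components, with w·u computed as a sum of products; the names dot, shear, forward and backward are illustrative.

```python
def dot(w, u):
    """Value of the covector w on the vector u, in components."""
    return w[0] * u[0] + w[1] * u[1]

def shear(v, w):
    """Return the map u -> u + v.(w·u); v must lie in w's kernel."""
    if dot(w, v) != 0:
        raise ValueError("w·v must be zero for a shear")

    def apply(u):
        k = dot(w, u)  # signed distance from the unmoved line, up to scale
        return (u[0] + v[0] * k, u[1] + v[1] * k)

    return apply

v, w = (1, 0), (0, 1)                 # v horizontal; w is zero on the x-axis
forward = shear(v, w)
backward = shear((-v[0], -v[1]), w)   # replacing v with -v inverts the shear

print(forward((2, 0)), forward((0, 3)), backward(forward((5, -2))))
```

Points on the x-axis stay put, points above it move right, points below it move left, and the shear with −v undoes the shear with v.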

Consider a rectangle with one edge along the unmoved (red) line: the opposite edge slides along itself, so remains along the same extended line parallel to the unmoved line. Its length doesn't change, so the portion of the original line not covered by its image has the same length as the portion of the image that extends clear of the original. The perpendicular sides start out parallel so end up parallel. At one end, the image of a perpendicular side lies entirely outside the rectangle; at the other, it lies at least initially inside (if we shear far enough, it shall cross the other perpendicular edge and emerge from the rectangle; but even so it would start inside). The perpendicular edge, its image and part of the edge that moved along itself form a triangle; the one at one end of the rectangle covers the portion of the rectangle that's not in its image; the one at the other end covers the portion of the image that's not in the rectangle. (Each covers the relevant portion; it may just be the portion, but shearing far enough would give it a part outside what it covers.) These two triangles are just images of each other under translations through the unchanged side of the rectangle; so they have the same area. Adding one to the rectangle and the other to its image gives us the same figure (the union of rectangle and image, possibly plus a triangular chunk at their tops, if we sheared the top clear past itself), so subtracting them from that figure shows that the rectangle and its image have the same area as one another.

Because the shear is linear, the mid-point of the rectangle (the average of the position vectors of its four vertices; averaging just scales and adds, so is respected by linear maps) is mapped to that of its image; for each, this is the mid-point of its diagonal. In the rectangle, the difference between any vertex and this centre is the negation of the corresponding difference for the opposite vertex; this relation also is preserved, so the rectangle's half-turn symmetry about its mid-point leads to its image also having half-turn symmetry, about its mid-point. A diagonal of the image passes through the centre so is mapped to itself by this symmetry; and it cuts the region into two pieces, which are interchanged by the symmetry. Consequently, these two pieces are congruent, so have equal area, adding up to the area of the whole, which is equal to the area of the rectangle, which is likewise split in half by its diagonal. So each half of the image, split by its diagonal, has the same area as a matching half of the rectangle. Consequently, the area of a triangle is invariant under a shear – at least when one edge of the triangle lies on the invariant line of the shear, which is the particular case we need for Euclid's proof.
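The area claims can be checked on a concrete rectangle. The sketch below assumes the unmoved line is the x-axis and uses the standard shoelace formula for area; polygon_area and shear_x are names chosen for this illustration.

```python
def polygon_area(points):
    """Shoelace formula: area of a simple polygon given its vertices in order."""
    total = 0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2

def shear_x(point, k):
    """Shear parallel to the x-axis: move each point sideways by k times its height."""
    x, y = point
    return (x + k * y, y)

rectangle = [(0, 0), (4, 0), (4, 2), (0, 2)]   # one edge on the unmoved line
triangle = [(0, 0), (4, 0), (4, 2)]            # half of it, cut by a diagonal

for k in (1, 3, -2):
    sheared_rectangle = [shear_x(p, k) for p in rectangle]
    sheared_triangle = [shear_x(p, k) for p in triangle]
    print(k, polygon_area(sheared_rectangle), polygon_area(sheared_triangle))
```

With k = 3 the top edge is sheared clear past itself, the case the argument above treats separately; the areas still come out unchanged.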

Back to the proof

So now let's get back to that diagram from Anathem, albeit now with some colouring to help me refer to parts of it. It shows the right-angle triangle surrounded by the squares on its three sides; each such square shares one side with the triangle and has an outer side, opposite it and parallel to it; I'll refer to its other two sides as outward edges of the square. Parallel to the outward edges of the square on the hypotenuse, we also have a construction line (in cyan) through the right angle of our triangle, from its vertex to the outer side of the square on the hypotenuse. This cuts the square into two pieces; the nice thing about Euclid's proof is that it, in fact, shows that this line cuts the square on the hypotenuse into two rectangles whose areas are equal to the areas of the squares on the other two sides (its blue part is equal to the blue square; its green part is equal to the green square). Let's see how it does that.

There are two yellow triangles; one shares an edge of our original right triangle with the blue square outside that edge; it also shares an outward edge of the square on the hypotenuse with the blue rectangle within that square. Its third edge then rejoins the far end of that side to the right angle we started in. If we rotate this triangle through a right angle, about its vertex opposite this last edge, we get the second yellow triangle, mapping the first edge to an outward edge of the blue square, the second to the hypotenuse and the third to a line from the triangle's vertex opposite the blue square to the blue square's outer vertex on its outward edge not through the original triangle's right angle. Since a rotation maps one to the other, these two yellow triangles have equal area.

The right angle vertex of the original triangle, that it shares with the first yellow triangle, lies on the cyan line that splits the square on the hypotenuse into two rectangles; this line is parallel to the side that joins the other two vertices of the first yellow triangle, so we can shear with that side as invariant to move the vertex along the cyan line onto the hypotenuse. This doesn't change the area of the triangle and turns it into one half of the blue rectangle within the square on the hypotenuse. For the other yellow triangle, use the edge it shares with the blue square as invariant line; the opposite side of the blue square meets the blue side of our right angle triangle in a right angle, at the right angle vertex of that triangle, so is a continuation of the other side of that triangle at this vertex – which ends at the third vertex of our second yellow triangle. So we can shear this yellow triangle to move this third vertex to the right-angle vertex of the original triangle, at which point it becomes half of the blue square. Thus the blue square and the blue rectangle within the square on the hypotenuse have equal area; the halves into which their diagonals split them can be sheared onto the two yellow triangles, which have equal area.

In exactly the same way, we can cut the green rectangle in two with a diagonal (it doesn't matter which), shear the half that has the outward edge of the square as a side so as to move its other vertex along the cyan line to the right-angle vertex of our original triangle, turning the half rectangle into one of the magenta triangles, with the unchanged edge and a green edge of the original triangle as sides; a quarter turn about their common vertex turns these two edges into the hypotenuse and an outward edge of the green square, giving us the second magenta triangle. Shearing this, with its side of the green square fixed, moves the blue end of the hypotenuse to the right-angle vertex of the original triangle and we get half of the green square. Thus the green square and green rectangle have equal area. The sum of the areas of the green and blue squares is thus equal to the sum of the areas of the green and blue rectangles; and these partition the square on the hypotenuse, so the sum is indeed this square's area.
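The chain of equal areas for the blue side can be followed in coordinates. This is a minimal Python sketch under assumptions of its own: the right angle sits at the origin R, the blue side runs to P along the x-axis, the other side to Q along the y-axis, and the point names (P_out, S, F) are invented here. Coordinates already build in the theorem, so this checks the construction rather than proving anything.

```python
from fractions import Fraction

def triangle_area(p, q, r):
    """Area of triangle pqr from the cross product of two of its edges."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])) / 2

a, b = Fraction(3), Fraction(4)          # the two perpendicular sides
R, P, Q = (0, 0), (a, 0), (0, b)         # right-angle vertex and the other two
P_out = (a + b, a)                       # far end of the hypotenuse square's outward edge at P
S = (a, -a)                              # outer vertex of the blue square below RP, next to P
t = a * a / (a * a + b * b)              # where the cyan line meets the hypotenuse
F = (P[0] + t * (Q[0] - P[0]), P[1] + t * (Q[1] - P[1]))

first_yellow = triangle_area(R, P, P_out)    # shares RP and the outward edge
second_yellow = triangle_area(P, Q, S)       # the first, turned about P
half_rectangle = triangle_area(F, P, P_out)  # first yellow sheared along the cyan line
half_square = triangle_area(R, P, S)         # second yellow sheared onto the blue square

print(first_yellow, second_yellow, half_rectangle, half_square)
```

All four areas agree, each being half of a.a, so the blue rectangle matches the blue square.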

Transforming a whole rectangle

We can in fact apply the shear, rotation and shear – that we used in Euclid's proof to transform half of the rectangle onto half of the square, without changing area – to the whole rectangle, thereby showing that we can actually transform the whole rectangle directly onto the whole square:

This can be used to take the square root of any rectangle's area, i.e. to construct a square with the same area.
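One way to carry this out numerically is sketched below. It is an illustration under stated assumptions, not the construction in the diagram: it treats the rectangle's long side as a hypotenuse h and its short side as the part e cut off by the perpendicular, then places the right-angle vertex on the circle that has the hypotenuse as diameter. The function name is hypothetical.

```python
import math

def square_side_from_rectangle(long_side, short_side):
    """Side of a square with the same area as a long_side by short_side rectangle."""
    if not 0 < short_side <= long_side:
        raise ValueError("need 0 < short_side <= long_side")
    # Height of the right-angle vertex above the hypotenuse, at distance
    # short_side from one end: it lies on the circle with the hypotenuse as diameter.
    altitude = math.sqrt(short_side * (long_side - short_side))
    # The triangle's side from that end of the hypotenuse to the right-angle vertex.
    return math.hypot(short_side, altitude)

side = square_side_from_rectangle(9, 4)
print(side, side * side)
```

For a 9 by 4 rectangle this gives a side of 6, up to floating-point rounding, matching the area of 36.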