]> Binary Operators

Binary Operators

A binary operator is a denotational form used when combining two values, particularly when the mode of combination is applicable to many more values. The form interposes a symbol, representing the binary operator, between two expressions denoting values; the combined text then denotes a value that results from combining the given values. As such, it is an expression denoting a value, so it can be combined by another binary operator with another expression denoting a value, and so on. It is binary in the sense that it combines two values at a time; but, by repeated application, it can thus combine many. The values combined by a binary operator are known as its operands.

Context, in giving meaning to a symbol as a binary operator, must indicate what values it is capable of combining and how it combines them. Where more than two operands are so combined, context may specify the order of evaluation or establish that it makes no difference, in order to resolve the ambiguity that might otherwise arise. I allow that a binary operator may be ambiguous (for some a, b there may be several values that a*b can denote) but most are in fact unambiguous, at least modulo context's relevant equivalence construed as equality. Any unresolved ambiguity in evaluation order may also contribute ambiguity, independent of any inherent in the binary operators themselves, when many values are combined.


Two primary classes of binary operator shall emerge: those I'll term categoric (e.g. proper composition of homomorphisms) may be fussy about their operands, restricting which they can combine in what order; while the more classical binary operator (e.g. arithmetic) has a set of operands and freely combines any operand with any operand, in either order (albeit the result of combining them may depend on order); I call such binary operators flat to distinguish them from categoric ones. In either case, it is usual for the composite values that arise from combining two operands to be, themselves, operands; and for the binary operator to introduce no ambiguity when combining several values. I'll now formalise these two properties, before going into the details of the two classes of binary operators.

Closure and Restriction

Given a binary operator * and a collection V of its operands, I'll describe V as closed under * precisely if V subsumes {a*b: a, b in V}. I'll describe a binary operator as, simply, closed precisely if its {operands} is closed under it.

An intersection of collections, each of which is closed under *, is itself necessarily closed under * since any a, b in it, that * knows how to combine, are necessarily in each of the collections intersected, each of which is closed so also contains a*b, which is thus a member of the intersection. For any collection V of operands of *, V's closure under * is the intersection of all collections, closed under *, that subsume V; this is a minimal collection of *'s operands that does subsume V, provided * itself is closed (to avoid intersect(empty) as the closure).

One of the ways we can represent a binary operator * is as the embedding (: (: b*a ←a :) ←b :) which represents each left operand by a relation between values and right operands. When this embedding is iso (it is always a mapping; so it is enough to show it monic) we can use it to reason about the action of our binary operator in terms of the composition on relations. When * is closed on V, the embedding is a mapping ({relations (V: :V)}: |V); given that {relations (V: :V)} is closed under composition, it is not unreasonable to hope that our embedding's collection of left values is likewise closed under composition. When it is, if the embedding is also monic and the binary operator associative (see below), the embedding serves as a faithful representation of our binary operator.

For any collection U of operands of a binary operator *, we can restrict the associated embedding and its left values on the right with U to get (: (: a*b ←b :U) ←a |U) and thereby obtain the binary operator on U whose associated embedding this is; I shall refer to this as *'s restriction to U. (In contrast, the normal restriction of relations applied to *'s associated embedding gives (: (: a*b ←b :) ←a |U), which doesn't restrict its left values; the binary operator associated with this mapping combines a in U with an arbitrary right operand b of *, rather than also restricting to right operands b in U. For this reason, I do not treat a binary operator as definitively synonymous with its associated embedding; they support distinct meanings of restrict.) Note that this does not restrict the left values of the embedding's left values, so every a*b with a, b in U is valid for this restricted binary operator, even if a*b isn't in U; thus *'s restriction to U is closed (as a binary operator) precisely if U is closed under * (hence also under its restriction to U). When U is not closed under *, restricting (: (U: a*b ←b :U) ←a :U) would give a closed binary operator; but some a*b with a, b in U would be valid for * but (due to a*b not being in U) not valid for this restriction.

Distinguishing the binary operator from the mapping that encodes it also allows me to define a distinct meaning of subsume for a binary operator. One binary operator, *, subsumes another, @, precisely if: whenever @ knows how to combine a with b, so does *; and every value @ admits for a@b is a value * admits for a*b. For unambiguous *, @ is necessarily unambiguous also; if @ is flat, in this case, it is then simply a restriction of *. However, in general, @ may be pickier than * about which operands it choses to combine. When * is unambiguous, restriction and such pickiness are the only options; but for ambiguous * we can also simply have @ combine a@b to produce only some of the values a*b can take.


I'll describe an expression that combines values using binary operators as valid precisely if each use of each binary operator has, as its operands, values the given binary operator can combine; the binary operator template actually makes the expression assert this anyway, but overtly saying it's valid makes the sentences below read more clearly – a*b is valid is only slightly clearer than * does know how to combine a and b but I encourage the interested reader to think about how the specification of associativity, that follows, would end up without the validity short-hand.

As already discussed, combining several values using binary operators has the potential to produce different answers depending on the order of evaluation. When combining values using only a single binary operator, that operator is described as associative precisely if the order of evaluation makes no difference. For a binary operator * this requires that, for any a, b, c for which either (a*b)*c or a*(b*c) is valid, both of the given expressions are valid and they agree. This is sufficient to ensure associativity (in the sense just specified), as every order of evaluation, of a given sequence of operands alternating with * as operator, can be transformed to every other by such re-arrangements of such three-way combinations.

Given a sequence of n operands alternating with repeats of a single operator, let the number of possible orders of evaluation be p(n). We clearly have p(1) = 1 = p(2). We can infer p(1+n) from (:p:1+n) by considering the outermost parenthesisation (i.e. the top-most node in the parse tree): this necessarily splits the list into two non-empty parts, whose lengths sum to 1+n, and we can parenthesise each part in any way we like; taking the length of the (non-empty) left part as 1+i for some natural i, we must have 1+i < 1+n so that the right part is non-empty, hence: p(1+n) = sum(: p(1+i).p(n−i) ←i :n), at least for n > 1. The first few are then: 1, 1, 2, 5, 14, 42, 132, 429, … and, generally, p(1+n) = (2.n).(2.n −1)…(n+2)/n/…/2/1 = (2.n)!/n!/(n+1)!; but I'm not about to try to prove that here !

I'll also say that a binary operator * associates with a binary operator @ precisely if, whenever both (a*b)@c and a*(b@c) are valid they agree and likewise for (a@b)*c and a@(b*c). Note that this is a weaker condition than associativity, as it only applies where the expressions are given to both be valid; so a binary operator * can associate with itself, in this sense, even if it isn't associative due to some (a*b)*c being valid where a*(b*c) isn't (or vice versa). When * and @ associate with one another, we can freely write a*b@c or a@b*c without parentheses as long as at least one of the given readings is valid – although including parentheses may be a kindness to the reader when only one reading is valid, to make clear which that is.

Associativity allows us to combine arbitrarily long lists of operands without introducing any ambiguity (aside from any inherent in our binary operator itself), which makes it natural to consider the binary operator in terms of its bulk action.


Closure ensures that the result of combining operands is an operand, so can be combined with other values, raising exactly the question that associativity answers; and associativity is only relevant in so far as a result of combining operands is an operand, so is at its most relevant when the binary operator is indeed closed. So I'll refer to a closed associative binary operator as a combiner; most binary operators of interest either are combiners or complement them in some way (as subtraction and division complement addition or multiplication, undoing their respective effects) that enables us to infer their action on operands from actions of the combiner from which they're derived.

Note that formal logic and functional programming also use the word combiner in a highly technical sense, for some specific operators that may or may not combine values; here, however, I'm using the word in a sense closer to its natural language meaning (that I note some Haskell programmers also use, when talking about things that combine values). A combiner combines things while abiding by certain rules that are commonly taken for granted; the above merely formalises those assumptions to facilitate precise reasoning about their consequences and their interaction with the effects of other assumptions that we shall, at least in some cases, find it natural to make.

The Great Divide

In a ski resort it is usual to have some lifts providing access to maintained runs. Sections of lift start at the ends of runs or of other lifts; each run typically starts at the end of some lift and ends at the start of another. If a run passes a start or end of a lift, we can split the run in two so that one part of it ends here and the other starts here; if a section of lift ends at a place from which there are no maintained runs, but from which just one other lift takes off, we can interpret the two as a single lift. A guide to the resort can then describe the start and end points (some of which may be the sites of other ameneties, such as restaurants) and characterise the routes, both runs and lifts, between them. The skier can then compose a route out of the runs and lifts thus described, by combining runs and lifts subject only to the constraint that each route fragment starts where the previous left off. The composition of routes around the resort can thus be construed as a combiner, and then, which only allows combination of a left operand ending in a given place with a right operand starting in the same place; the composite starts where the left operand started and ends where the right operand ends, which constrains what we can combine it with. This is in stark contrast with the addition and multiplication of everyday arithmetic on numbers, which combine any number with any number freely.

I'll describe a binary operator * as flat on some collection V precisely if a*b is valid for every a, b in V. Aside from category theory, orthodoxy mostly focuses on flat combiners. On {relations} composition, &on;, is indeed a flat combiner; but it has restrictions that aren't flat. These last are combiners on {relations}; each also has flat restrictions, e.g. clean composition is a flat combiner on {relations (V:|V)} or on {mappings (V:|V)} for any collection V. The familiar addition and multiplication of numbers are both flat. For the most part, classical binary operators are flat combiners.

I'll describe a combiner * as categoric precisely if:

The first two conditions generalise requiring an identity at each start-point and each end-point, that would make each b = e*b = b*i for some e and i, each of which serves as a surrogate for an object at which b starts or ends. The last, albeit painfully technical, expresses the idea that there are end-points and the only constraint on whether * can combine values is the need to have a common end-point in between the operands to be combined.

The composition of homomorphisms, based on clean composition of relations, serves as the archetype for category theory, which is the study of categoric binary operators.

In particular, the last categoric condition ensures that associativity does something useful for us. Suppose we have u, v, w for which u*v and v*w are both valid; the second condition above lets us write v = r*s for some r, s; this makes u*(r*s) and (r*s)*w valid, hence so are (u*r)*s and r*(s*w); applying the third condition to either of these and r*s then implies (u*r)*(s*w) is valid, for example b = u*r and a = r with c = s, d = (s*w) have a*c and b*c valid, so validity of a*d implies that of b*d; associativity then gives us u*r*s*w = u*v*w. Thus, whenever u*v is valid and v*w is valid, we also have u*(v*w) and (u*v)*w valid. This makes it meaningful to combine (arbitrarily long) lists of values, subject only to * knowing how to combine each pair of adjacent entries in the given order; this suffices to ensure * can combine the whole list; and associativity ensures the result of this bulk action doesn't depend on order of evaluation.

Further Details

Although flat combiners commonly are categoric (only the second condition above has any possibility of failing), in the given sense, it is usually clearer to describe them in terms that ignore this; the study of categoric combiners is burdened with all the constraints on which values the operator can combine, which merely gets in the way of the study of a flat combiners, where these constraints are trivial.


A homomorphism between two structures is a relation between the values involved in each which expresses the structure of one within the other. Thus a homomorphism from one binary operator, * on V, to another, @ on U, is a relation (U: f |V) for which, in so far as * knows how to combine v, w in V to obtain some value x as v*w (allowing that * may be ambiguous, so there may be more than one such x): for every u that f relates to v, t that f relates to w and s that f relates to x, @ does know how to combine u with t to get s as u@t. When f is a mapping and the binary operators are unambiguous, this simply says f(v*w) = f(v)@f(w).

When * is ambiguous, a mapping f might collapse out that ambiguity, by mapping all the values each v*w can take to the same member of U; if it does not, @ must be ambiguous to match the ambiguity in * enough to let f(v)@f(w) take all of the values that f(v*w) can take. In contrast, @ may be ambiguous even if * isn't, or may be more ambiguous than the foregoing requires when * is, provided each f(v)@f(w) can take all the values f(v*w) can; in the presence of ambiguity, but with f still a mapping, we thus require {f(v)@f(w)} to subsume {f(v*w)}, but the expressions f(v)@f(w) and f(v*w) need not entirely agree.

Equally, when f is a relation but not a mapping, what's required is still just that {f(v)@f(w)} subsumes {f(v*w)}, i.e. every member of U realisable as f(v*w), via the ambiguities in * and in f for given v and w, must also be realisable as f(v)@f(w), via the ambiguities in @ and in f.

When * and @ are unambiguous, every mapping (U: |V) that a homomorphism f subsumes is constrained by the above to be a homomorphism, which makes it more interesting to simply study the homomorphic mappings. Building up homomorphisms that are relations from these isn't necessarily trivial, but the study of mappings is far more straightforward. Consequently, it is most usual to study homomorphic mappings rather than general relations as homomorphisms.

Because binary operators potentially have a left↔right asymmetry, that's swapped around by transposing the binary operator, we can also have homomorphisms from one binary operator, *, to the transpose of another, @; I'll describe these as cohomomorphisms. Just as homomorphisms from a binary operator to itself are called automorphisms, a coautomorphism is a cohomomorphism from a binary operator to itself – i.e. a homomorphism from a binary operator to its transpose. The homomorphisms of categoric binary operators get to be known as functors between their categories; their cohomomorphisms are known as cofunctors. The identity on operands isn't a coautomorphism unless the binary operator is commutative; which, of course, makes the co- distinction largely meaningless; and allows us to specify commuting in terms of the identity being a coautomorphism.

Composing two cohomomorphisms gives a homomorphism as composite; f(g(x^y)) = f(g(y)*g(x)) = f(g(x))@f(g(y)) swaps the order then swaps it back again. In particular, coautomorphisms can be self-inverse; and just one such suffices to represent all other cohomomorphisms to or from a given binary operator as homomorphisms by composition before or after the given self-inverse coautomorphism. This tends to mean that, when a self-inverse cohomomorphism is natural to a structure-type, the theory gives a special rôle to this one cohomomorphism and thereby avoids talking about other cohomomorphisms as such.

Interaction with properties

Suppose we have a homomorphic map (U: f |V) from binary operator * on V to binary operator @ on U. The action of * in V induces a binary operator in (| f :V) that consumes outputs of f to give f's image of what * gets by combining the inputs that produced them. For the present section, I'll refer to * as the source binary operator, this image of it as the image binary operator and to @ as the host binary operator. As f is a homomorphism, the host subsumes the image (in the special sense of subsume, specific to binary operators, above). Typically, the image has all properties of the source, regardless of whether the host shares those properties; some properties of the host naturally impose themselves on the image; in some cases, this may only be possible if the source shares the property; in other cases, (: f |) can reduce the source to a binary operator with the property.

When the source is flat, so is its image; if the host isn't, this restricts what mappings can be homomorphisms. When the source is closed, so is its image; but the host might be willing (either because the source isn't flat or because the host has more ambiguity) to combine some operands of this image to produce values not in the image of f; thus (| f :V) might not be closed under the host, even though it is closed under the image. When the source is associative, so is its image, again without constraining the host, save that it at least has some operands that it combines associatively. Thus any homomorphic image of a combiner is a combiner, even if its host isn't.

It is entirely possible for the source (hence the image) to be fussy about which values it combines yet be homomorphically mapped to a host which isn't, or is less so; indeed, the composition on T-homomorphisms, for any structure-type T, forms a categoric binary operator and there's a forgetful functor that embeds it in the flat composition on general relations; the image is as fussy as the source but the host combines the homomorphism without regard to whether they morph between T-structures for which this makes sense; the host's composites then typically aren't T-homomorphisms of anything, although many of them are apt to be empty, which may be a T-homomorphism of an empty T-structure, if there is one, regardless of how wildly inappropriate this is for the homomorphisms being combined to make it. All of which is why homomorphic composition has to be rather more tightly policed than the general composition of relations.

When the host is associative, so is the image; for the homomorphism to be monic, the source must also be associative; otherwise, the homomorphism's right-relation, (U: f |), can be used to induce an associative relation from the source. Allowing that the source might not be closed, it's necessary to specify that V is its collection of operands and combinations, so that f does map each combination to something in U and duly induces an image binary operator. Bearing that in mind, closure of the host does little to constrain the image, much less the source, unless the homomorphism is onto (U| f |V), in which case closure of the host does indeed impose closure on both image and source.

When the host's restriction to the image's operands isn't the image (although it necessarily subsumes it), we can naturally pull back a pre-image of the host on the source's operands, that necessarily subsumes the source. Indeed, if we have a mapping from some collection to the operands of a host binary operator, we can use it to induce a binary operator on the mapping's inputs, for which the mapping is trivially a homomorphism to the host, although the result may be rife with ambiguities if the mapping doesn't play nicely with the host.

Common properties

As I mentioned while describing closure and associativity, we can also consider a combiner in terms of its bulk action or of the way it embeds its operands in {relations}; these model the combiner in terms of the concatenation of lists and the composition of relations, respectively. Each expresses any given property of a combiner in its own way; each thereby sheds light on different facets of the property. It proves useful to consider combiners from all three perspectives; this page and the two just linked to each cover one of these perspectives separately; but the full picture is clearest when these three perspectives are combined. Below, I define some properties applicable to binary operators generally and sketch others that, though always applicable, take on sufficiently different forms for flat and categoric combiners that I prefer to leave their formal definitions to the respective specialised discussions, rather than getting bogged down in an attempt at sufficient generality to cover both (when, even if I did, each branch would still need to explain what that meant for itself).

If we have two binary operators, * and @, we typically need to formally parenthesise sub-expressions using either, with (a*b)@c and a*(b@c) distinct, as mentioned above when discussing the possibility that they associate with one another. An alternative to associating with one another is for one to distribute over the other, in the following sense. I'll say: that * left-distributes over @ precicely if: for any a, b, c for which either a*(b@c) or (a*b)@(a*c) is valid, so is the other and they agree; that * right-distributes over @ precisely if: for any a, b, c for which (a@b)*c or (a*c)@(b*c) is valid, so is the other and they agree; and that * distributes over @ precisely if it right-distributes over @ and left-distributes over @. The familiar case of this is that multiplication distributes over addition: a.(b+c) = (a.b) +(a.c). In boolean logic, where combiners representing and and or act on truth values (taking value true or false), each distributes over the other.

One can equally characterise * distributing over @ in terms of the mappings (: k*v ←v :) and (: v*k ←v :), for fixed k, being autoprhisms of @; when * is associative, this makes (: (: u*v ←v :) ←u :) and its transpose a homomorphism and a cohomomorphism from * to automorphisms of @.

When a binary operator has one operand, e, which, when combined with another, v, produces v as value, i.e. the binary operator swallows uses of e, then e is described as an identity. When two operands combine to make an identity, those two operands are described as inverse to one another. The precise details, in each case, are complicated by questions of which other values a given e can be combined with; which is a non-issue in the flat case but crucial in the categoric case.


In general, a binary operator may care which operand is on its left and which is on its right; a*b need not be the same as b*a, nor need the validity of either even imply that of the other. For a given binary operator *, operands a and b are said to commute (with one another) precisely if a*b = b*a (for which, of course, both must be valid). In such a case, each may be said to commute with the other (under the given binary operator).

The binary operator is described as commutative or (synonymously) abelian (after Niels Henrik Abel, who did important early work on group theory and died tragically young) precisely if, whenever it knows how to combine two values, either way round, it also knows how to combine them the other way round and they commute. So * is abelian precisely if, for any a, b for which either a*b or b*a is valid, so is the other and they agree, i.e. a*b = b*a whenever either is valid.

The word commute means change with; you can exchange the operands. The term commuter, for a worker who travels daily to and from a distant work-place, originates with train companies (back when trains were a new and disruptive technology) selling return tickets to folk travelling from out of town to look for work in big cities (e.g. Manchester), on a deal that let the traveller commute (i.e. exchange) the return half for a discount on a season ticket if (typically because they found work) they chose to do so before returning home. So, originally, the verb commute meant the act of exchanging the return half of the train ticket for a discount on a season ticket, that the worker did once at the start of a period of employment; 'though it has now come to mean what workers do every day, to get to and from work, regardless of their mode of transport.

Even when a binary operator isn't abelian, some of its operands may commute with all others. If the binary operator is a combiner, the collection of such operands is closed under the binary operator (for if a*x = x*a for all x and b*x = x*b for all x, then (a*b)*x = a*(b*x) = a*(x*b) = (a*x)*b = (x*a)*b = x*(a*b) also for all x), so the restriction of the combiner to such operands is an abelian sub-combiner. Its collection of values is orthodoxly known as the centre of the combiner; I shall also use that term to refer to the sub-combiner.

If a categoric combiner is abelian, its values can necessarily be decomposed into a disjoint union of collections of values, on each of which the combiner is flat, for which the combiner does not know how to combine values from disjoint collections, just as addition can combine lengths, or combine masses, or combine pure numbers, but cannot combine values of distinct kinds. (We may think of such binary operators as piecewise flat, even when they are not abelian or even combiners.) This may be shown by considering the relation r that relates a to b precisely if the combiner knows how to combine a with b; as the combiner is abelian it, in that case, also knows how to combine b with a, so r is symmetric. Furthermore, if r relates a to b and b to c, then the combiner knows how to combine (a*b)*c and a*(b*c) which, furthermore, are equal; but it is commutative, so a*b = b*a and, as the combiner can combine this with c, combining (b*a)*c implies it can also combine b*(a*c) whence, in particular, it can combine a with c; thus r relates a to c and is transitive. Transitive symmetric relations are equivalences (each on its collection of values, on which it is necessarily reflexive), so our combiner is flat on each equivalence class of r.

Valid CSSValid XHTML 1.1 Written by Eddy.