It seems to us […] that mathematics has now reached the stage where formalisation within some particular axiomatic set theory is irrelevant even for foundational studies. It should be possible to specify conditions on a mathematical theory which would suffice for embeddability within ZF (supplemented by additional axioms of infinity if necessary), but which do not otherwise restrict the possible constructions in that theory. Of course the conditions would apply to ZF itself, and to other possible theories that have been proposed as suitable foundations for mathematics (certain theories of categories, etc.), but would not restrict us to any particular theory. This appendix is in fact a cry for a Mathematicians' Liberation Movement !
Among the permissible kinds of construction we should have:
- Objects may be created from earlier objects in any reasonably constructive fashion.
- Equality among the created objects can be any desired equivalence relation.
[…] I hope it is clear that this proposal is not of any particular theory as an alternative to ZF (such as a theory of categories, or of numbers or games considered in this book). What is proposed is instead that we give ourselves the freedom to create arbitrary mathematical theories of these kinds, but prove a metatheorem which ensures once and for all that any such theory could be formalised in terms of any of the standard foundational theories.
The situation is analogous to the theory of vector spaces. Once upon a time these were collections of n-tuples of numbers, and the interesting theorems were those that remained invariant under linear transformations of these numbers. Now even the initial definitions are invariant, and vector spaces are defined by axioms rather than as particular objects. However, it is proved that every vector space has a base, so that the new theory is much the same as the old. But now no particular base is distinguished, and usually arguments which use particular bases are cumbrous and inelegant compared to arguments directly in terms of the axioms.
We believe that mathematics itself can be founded in an invariant way, which would be equivalent to, but would not involve, formalisation within some theory like ZF. No particular axiomatic theory like ZF would be needed, and indeed attempts to force arbitrary theories into a single formal strait-jacket will probably continue to produce unnecessarily cumbrous and inelegant contortions.
John H. Conway, in the Appendix to Part Zero of On Numbers and Games.
The following is all hideously out of date (I start from relations, not functions, and have been doing so for years now) but I haven't yet finished plundering it for fragments to re-use. Review, 1998/Nov: only the rhetoric isn't done better elsewhere – and that mostly because I now don't do it (as much) elsewhere ;^>
Rather than trying to use layout-markup to mimic the weird and wonderful things mathematicians are used to being able to write, given the joyous liberty of pencil and paper, chalk on a black-board or the like, I have chosen the unorthodox discipline of writing plaintext mathematics. This means abandoning many notational constructs which are essential to standard notation: consequently, it involves such a radical re-invention of notation that I have not been afraid of departures from orthodoxy. Where practical, I have endeavoured to retain as much as possible of the spirit of familiar notations. This has been somewhat helped by taking inspiration from computer programming languages where I have abandoned some more familiar denotations. None the less, I have not been afraid to sacrifice familiarity to maintain conceptual integrity and consistency.
In designing notation, I have drawn much inspiration from the world of
computer programming. For automatic programming
(or FORmula
TRANslation) to be invented, someone had to address the problem of expressing
mathematical formulae in a medium far less expressive than the lecturer's
writing board. Once the problem had been addressed, some good solutions were
found: more (occasionally even better) have leaked out of the woodwork at a
steady rate ever since. The astute reader may well recognise echoes of
Algol 68, Ponder, Haskell and even the Unix shell in the tools that
appear on this site; and maybe even a hint of Larry Wall's rhetoric. Thus
language designers borrowed from mathematics; now a mathematician borrows from
computer languages to design a notation for mathematics; and the cycle
continues, as I have used some of my notation in designing a query language
for an object-oriented database …
In most mathematical notation, values, entities, functions, domains and any manner of other things are named by single letters (with a few exceptions, such as sin, cos, log, exp, det, trace). In order to enlarge the space of available names, writing a letter in (sufficiently) different fonts may allow it to be used as several distinguishable names. Change of case (between upper and lower) is usually also significant. Even so, one runs short of unbound names to play with – especially given the variety of symbols which have specific meanings in contexts which usually apply: π being the most universal. Many fields conflict with one another on symbols (try considering e during discourses on group theory, differential equations and electrodynamics: stop if you get confused); some use different symbols for the same thing (ask for the square root of −1 and you can get either i or j, depending who you ask).
In the world of computer language design it was recognised, early on, that restriction to single-letter names was an unbearable burden. If nothing else, code is much more maintainable if each variable's name reflects the rôle it is to play. Lacking the huge diversity of fonts, which made the single-letter name-space large enough for the job, programmers adopted multi-letter names. This made it much easier to achieve mnemonic naming: sufficiently so that some languages have thrown away even case sensitivity (so that Null, NULL, null and even nuLl are all the same name).
I propose to work with names of arbitrary length. Please ignore however
many instances of the empty name there are between anything else: they're just
part of the zero-point background of text. I shall typically use
single-letter names for dummy
variables, though even these will
sometimes be longer. Very few glossary entries will be for single-letter names
(when, eventually, I get round to putting together a glossary).
I shall use some names of form &name;
where I think the
given name would be a good name for an HTML
character entity: if you're very lucky, it will actually be an
HTML character entity and you might even see the symbol I wanted in its
place. In particular, I shall use α, β, …, ω,
Α, …, Ω for the letters of the Greek alphabet: so π is
the ratio of a circle's circumference to its diameter (in a plane). If I type
&implies; you might see ⇒ (roughly =>, a symbol commonly used to
denote logical inference) and, if you don't, you'll see the
word "implies", albeit embedded in some funny punctuation, so hopefully
you'll understand what I meant. Similarly for × (an x-shaped symbol
commonly used for multiplication, ×), ↦ (or ↦ maps to),
∈ (is in), ∀ (for all), ∃ (there exists) and so on.
Where I define, on one page, a term which I want understood on other
pages, I shall try to use words which will make sense as the terms defined; I
hope my command of English will be both broad enough and familiar to a wide
enough audience to succeed in this. It will not be uncommon for such
definitions also to introduce some standard operations on participating
entities: I shall aim to give these descriptive names. Where I choose to use
several words to make up such a name, I shall capitalise each word (except,
sometimes, the first) and stick them all together to form the name: e.g. the
operator which takes a subset of a vector space and yields its convex hull is
called convexHull. You are welcome to think of these names as being
insensitive to case and (as in Algol) gaps: I shalln't use "convexhull"
or "convex hull" to mean anything incompatible with such a reading, and I
may well write it as "ConvexHull" if ever it appears at the start of a
sentence.
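One way to picture names that are insensitive to case and gaps is as a lookup key that discards both. The following is only a sketch, in Python (a language this page does not otherwise use); the helper name canonical_name is made up for the illustration and is not part of the notation.

```python
def canonical_name(name: str) -> str:
    """Reduce a name to a key that ignores letter case and gaps (whitespace)."""
    # split() with no argument drops every run of whitespace;
    # casefold() is Python's aggressive lower-casing for comparisons.
    return "".join(name.split()).casefold()


# The spellings mentioned in the text all collapse to one key.
spellings = ["convexHull", "convexhull", "convex hull", "ConvexHull"]
keys = {canonical_name(spelling) for spelling in spellings}
```

Under this reading all four spellings name the same operator, so keys ends up holding a single entry.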
One great advantage (which, clearly, I must give up) of single-letter naming is that it makes it possible for mathematicians to dispense with a symbol, ×, for multiplication: when two letters appear side-by-side, the quantities they represent are understood to be multiplied together or composed as functions (and this seldom causes any ambiguity). In really ancient pages here, I have done that when working with single-letter names. However, once longer names are involved, it becomes necessary to separate multiplicands from one another. This could simply be done with a space; however, I chose to use something more definite – a dot. This may be a full stop (that which North American anglophones call a period) or a ·, depending on how often I've seen browsers rendering the latter correctly. Note that mathematicians, having dispensed with the need for × to represent usual multiplication (such a commonplace operation, in any case, as to inspire a wish for some less visually intrusive symbol), have long since put × to good use in related (but weightier) contexts: I have kept it aside for those, hence the need for a dot in its stead !
This section is way out of date: I start from relations these days. [Since the following preamble is a bit long, I have chosen to provide readers who just want to cut to the definitions with short-cuts to the definitions of my: source/destination denotation and action denotation. ]
Standard mathematical notation uses f: A → B
(in which → is supposed to appear as an arrow, pointing to the right) to
denote "f is a function from A to B". I'll refer to this as the "source and
destination denotation" for f. This may be followed up by an "action
denotation" for f – i.e. a statement of what value, in B, f ascribes to a
typical member of A: either saying f(a)= some expression in a or written as
f: a ↦ the expression in a (in which ↦ is supposed to be an arrow
with a little vertical bar at its start, to distinguish it from the other
arrow). To discuss functions in plaintext, I'm going to need both an action
denotation and a source and destination denotation for functions.
I prefer the second, arrow-based, action denotation over the more common
illustrative "f(a)= whatever", for a variety of reasons: most
obviously, because (here reversing the arrows, as it matches more closely the
notation I actually use) it can be used with an anonymous function: one can
use x.x ← x, the anonymous function which squares its argument, without
having to introduce a name for it. In any case there are situations where the
arrow form is substantially necessary, so I want to be able to use it,
preference or no. However, this required an arrow distinguishable from the
first; when I embarked on inventing a notation, I only, in practice, had one
arrow at my disposal, namely ->
(which I never pretended was a very
good one). I'm pleased to now have → and ← at my disposal; but, by
the time I did, I'd already settled on a notation using only one kind of
arrow, so lost interest in exploiting a greater diversity of arrows.
To cope with having only one arrow, I used a bracketed denotation for
functions, which describes the function above, f from A to B, as
(A|f:B). This said that f gives values to all members of A, and all the
values thus given lie in B. It doesn't say that every value in B gets
produced: only that B subsumes the set of values produced. This set, {f(a): a
in A}, deserves a brief denotation. Conversely, we sometimes wish to talk
about functions defined only on a subset of some domain, as (U|f:B) for
some subset, U, of A: at least when this subset makes no other appearance,
I prefer not to need to name it. In such a case, A's relationship to f is
like that of B – only some of it takes part: likewise {f(u):u in U}'s
relationship to f is like that of U – all of it takes part.
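For finite sets, the two halves of this reading of (A|f:B) can be checked mechanically: f gives a value to all of A, and only some of B need be produced. The sketch below assumes a function stored as a Python dict; the helper names is_total_into and image are invented for the illustration.

```python
def is_total_into(f: dict, A: set, B: set) -> bool:
    """Reading of (A|f:B): every member of A gets a value, and that value lies in B."""
    return all(a in f and f[a] in B for a in A)


def image(f: dict, U: set) -> set:
    """The set {f(u): u in U} of values that f actually produces on U."""
    return {f[u] for u in U}


A = {0, 1, 2, 3}
B = set(range(10))
f = {a: a * a for a in A}      # squaring, defined on all of A

U = {1, 2}                     # a subset of A
g = {u: u * u for u in U}      # squaring, defined only on U

f_fits = is_total_into(f, A, B)          # all of A takes part
g_fits_A = is_total_into(g, A, B)        # fails: g leaves 0 and 3 without values
g_fits_U = is_total_into(g, U, B)        # all of U takes part
produced = image(f, A)                   # only some of B is produced
```

Here produced is {0, 1, 4, 9}, a proper subset of B, which is the sense in which B merely subsumes the set of values.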
See also: discussion of bulk actions of binary operators (e.g. summation as a bulk action derived from (pairwise) addition); and an alternative introduction to the following source/destination notation, generalising parts of it, as natural to the discussion of relations.
I've borrowed my use of |, with a little bending, abstraction and
generalisation, from Unix, where | denotes feeding what's come from its
left to be received by what's on its right.
I used to toy with the idea of going over to writing the value of a
function, f, given argument, a, as "af" or "(a)f": this fitted
better with the picture of a function labelling an arrow from where it starts
to where it ends – the function's argument appears at the left, the
function takes it rightwards (i.e. along the arrow) to whatever its image is,
whence it may well be taken onwards by a further function applied. I'm told
this order of function and argument is used by many algebraists and
continental European mathematicians: however, the f(a)-style notation
is nearly universal among engineers, physicists and anyone else who looks to
Newton, rather than Leibniz, as origin for the infinitesimal calculus. Since
physicists are my primarily intended audience, I resisted the urge to follow
this mirror dream; ultimately I decided, instead, to reverse the direction of
the arrows (and, consequently, the order of tokens in each of the following
denotational forms).
I define a basic bracket notation for functions by:
Wherever all three of A, f and B appear in these, any | can be
replaced with a :, losing only the "all of" information. Where
any of A, f, B is only needed in the discourse to fill its hole in the
denotation and isn't the only thing remaining between a | and a
preceding ( or following ), I'll leave it out. We can always
first demote a | to a : in the excluded case in order to permit
an elision; the exclusion ensures that (A:f|), (|f:B) et al. remain
denotations for the sets mentioned above. Thus (A|f:) means "a function, f,
from A (to wherever)", (A|:B) means an anonymous function from A to B and
(:f:) means a function, f, with unspecified domain and range.
When introducing a function to any discourse, rather than saying "given
A, B and a function f from A to B" I'll just say "given (A|f:B)",
omitting any parts redundant to the discussion and maybe qualifying the
statement with some adjectives to tell you what kind of function we're dealing
with – e.g. "given linear (V|g:W)".
I can (and will) also refer to {(A|:B)}, meaning the collection of functions from A to B (conventionally, B^{A} – B with a superscript A). Where some adjective, e.g. linear or monic, is applicable, I'll refer to the collection of such functions by thus qualifying the anonymous function inside the curly braces: e.g. {linear (A|:B)} for the collection of linear maps from A to B. One might, indeed, use {linear (|:)} as the collection of linear morphisms in some given context, and I'll use {(|:) in C} for the collection of morphisms of a category (or arrow-world) C. Likewise, {(A::B)} means the collection of functions from subsets of A to B: {smooth (M::N)} will be very important in the discussion of smooth manifolds.
For a function (A|f:B), the sets (|f:D) and (C:f|) have orthodox denotations f^{−1}(D) and f(C), respectively. It should be noted that these contain potential ambiguity: when A and B are the natural numbers and C, D are members (whence, as required, subsets) thereof, f(C) has a perfectly good meaning as the image, under f, of the (single) natural number C which might not be equal to (C:f|)= {f(i): i in C} = {f(0), …, f(C−1)}, so writing this latter also as f(C) isn't helpful ! The same reasoning applies to f^{−1}(D) when f actually has an inverse.
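The clash between the two readings of f(C) can be made concrete. The sketch below assumes the usual set-theoretic encoding in which a natural number C is the set {0, …, C−1} (members are kept as plain Python integers for readability) and picks the successor function purely as an example; neither choice comes from the text above.

```python
def as_set(n: int) -> frozenset:
    """The natural number n read as the set of all smaller naturals."""
    return frozenset(range(n))


def f(i: int) -> int:
    """An example function on the naturals: the successor."""
    return i + 1


C = 3

# Reading 1: f applied to C as a single natural number.
value_at_C = f(C)

# Reading 2: (C:f|), the image under f of C read as the set {0, 1, 2}.
image_of_C = frozenset(f(i) for i in as_set(C))
```

With these choices the first reading gives the number 4, i.e. the set {0, 1, 2, 3}, while the second gives {1, 2, 3}: two different things that orthodox notation would both write as f(C).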
Having said all that, I should point out that it dates from an old form of my denotations; I have since switched directions so that (B:f:A), when it is a mapping, maps members of A to members of B.
This eliminates one of our needs for an arrow, so I only need one; I now
use ← (because my mappings take right values to left values), to denote
the action of a function on a typical argument. Thus (:x.x ← x:)
means "the function which maps each value to its square". Where
everything in the domain of some function is uniquely expressible as k(x), for
some expression k and some argument, x, that k accepts, and I have some other
expression h which will accept any such x, I'll write (: h(x) ←k(x) :)
with the implied meaning – for example, the function ({integers}: n
← 2.n :{evens}), which halves even numbers.
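Readers who program may find it helpful to compare these two denotations with code. The following Python sketch is only an analogy: the lambda stands in for the anonymous (: x.x ← x :), and the hypothetical function halve mimics ({integers}: n ← 2.n :{evens}) by accepting only arguments expressible as 2.n and returning the matching n.

```python
# (: x.x <- x :) used without ever being given a name.
squares = [(lambda x: x * x)(value) for value in (1, 2, 3)]


def halve(m: int) -> int:
    """Mimic ({integers}: n <- 2.n :{evens}): defined only on values of the form 2.n."""
    n, remainder = divmod(m, 2)
    if remainder:
        # m is odd, so it is not expressible as 2.n and lies outside the domain.
        raise ValueError(f"{m} is not of the form 2.n")
    return n


half_of_ten = halve(10)
half_of_minus_four = halve(-4)
```

The pattern on the right of the arrow, 2.n, plays the part of the domain check here: an odd argument matches no n, so halve refuses it rather than inventing an answer.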
Now, since (B:f|A) or (B:f:A) denotes a function which is called f, there
is a sense in which (B:f:A) = f =(B:f|A) – where it's from and to are
intrinsically part of what a function is. We can use this to name a function,
for example: square = ({reals}: x.x ← x :{reals}). We can also
use it as a way of declaring the form of denotation used for a function: by
default, (:f:) is taken to be (:f(x)←x:), but situations which use
subscripts, superscripts or other esoteric denotations can declare their
denotations using this mechanism.
My motives for adopting the discipline of plaintext mathematics combine
the playful, the pedagogic and the prosaic. Prosaically: as
discussed below, HTML only really enables me to present
plaintext pages. Pedagogically: I believe that the learner gains much by the
use of an unfamiliar notation (and the teacher is restrained from certain
follies to do with what's obvious
to one who already knows it),
provided it is clear enough and consistently constructed: and I believe any
serious discourse should take pains to introduce and define its terms
clearly. Playfully: variety is the spice of life, let's try another notation,
just to see what happens when we do. It is also fair to say that I find much
widely-used notation to be needlessly clumsy or obfuscated: inventing a
replacement gives me the opportunity to cure this. If you find that I have
not, please be good enough to write a calm and reasoned explanation of the
problem (and maybe some pointers on what you think might fix it) and send it
to me, eddy@chaos.org.uk. I know I am
capable of improvement ;^>
It may be objected that, in using an unorthodox notation, I do a disservice to the reader: those already familiar with the subject matter meet the barrier of unfamiliar notation; while those meeting it for the first time may understand the subject but will subsequently have trouble making sense of orthodox notation, where they meet it. I will counter with the view that it does us good to vary our point of view: a change in notation shows us familiar subject matter in a new light. I also assert that the notation for any scientific field should evolve, both under pressure from within and in response to changes in the notations of other fields: and that this evolution properly requires occasional upheavals. For scientists to be able to evolve notation, without introducing inconsistencies or extraneous complexity, we must be familiar with variation in notation: otherwise, we shall neither learn the pitfalls (into some of which, I imagine, I fall) nor understand the scope for change open to us.
Adherence to notational orthodoxy also carries the peril of neglecting to explain notation: given that I want these pages to be intelligible to folk who know little of the subject matter (so, if you find something hard to follow, I wish to hear from you and understand what the problem was and [if we can identify it] what I can do about it), it is important that I check that I have explained all the notation I use. Of course, in line with my general problem finding time to edit these pages, there will remain gaps and infelicities for a while yet – partly, of course, because my notation is evolving as I go along. It is, after all, a new start: expect significant variation in anything on time-scales comparable with, or longer than, the age of that thing.
Finally, for this section, if what you want is to skim-read it all, you'll
have to wait a long time before I bother to edit the pages to cater to you. I
aim these pages at those who, like myself, read and think about each word
– the reading equivalent of "chewing every morsel 32 times". My
target audience, in short, is those who are prepared to take time over
reading, and thinking about, the text. If you won't listen to anything unless
it can be said in ten seconds, you're going to miss out on much that's
important in this world. On the other hand, I know that if I can't produce a
decent ten(ish)-word summary of a subject, I haven't quite got it tidy in my
mind: so sound-bites will happen, in due course – they just aren't among
my high priorities.
Mathematicians are used to working with pencil and paper (and other freely mobile line-drawing tools on two-dimensional surfaces), which gives great liberty as to character set and denotational jiggery-pokery (super- and sub-scripts, integral symbols and other squiggles, convoluted fractions, diagrams with arrows, graphs and so on): it should be no surprise that the notation constructed by folk used to such freedom depends on it. Existing notations are well-suited to the traditional medium; however, I am working in another medium. One can persuade a computer to draw everything needed for orthodox notations, but one has to fight with the medium to achieve these results. I prefer to let the medium itself guide how I express myself in it, so that my effort is principally expended in expressing my thoughts, not in fighting the medium. I want a notation which arises naturally from what it's easy to do with the new medium, just as orthodox notation arises naturally from what it's easy to do with the old medium.
When working on a computer and delivering via the internet, the
fundamental medium is what's known technically as an octet-stream –
i.e. a sequence of characters. In principle (thanks to Unicode) I can type
the whole universe of strange symbols from more cultures than I can count;
but, to do so in HTML, I need to type a numeric character entity
which,
when I see it in my text editor, is just some punctuation and numbers, giving
me no clue to which random symbol I've just typed. I can see plain ASCII text
as what it is; and I can make sense of the mnemonic character entities, so I
prefer to use only these.
When I see &ndash; in my document, I know
what it stands for (a dash character – with roughly the same width as
the letter n); and it is easy to remember what to type to obtain
a dash of this kind. I have to look up the details to know
that &#8463; means the "h-bar" symbol, ℏ, which
theoretical physicists use to denote Dirac's constant; and can't see that this
is what I've typed, when I look at my document in my plain text editor –
I have to use a web browser to see what the symbol is. Indeed, when I need to
use that symbol often, I now use XHTML rather than HTML, since XHTML lets me
give the character a name that I can remember; in the file's header, I tell
XHTML that &hbar; means &#8463; so that I
can type the former in the text and, thus, be able to read what I type.
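The header trick relies on XML's entity declarations, and its effect can be watched with any XML parser. The sketch below uses Python's standard xml.etree.ElementTree as a stand-in for a browser; the toy document, the entity name hbar and the decimal code 8463 (Unicode U+210F) are assumptions chosen for the illustration, not copied from any page on this site.

```python
import xml.etree.ElementTree as ET

# A toy document. The internal DTD subset, between [ and ], gives the
# numeric character reference a name that can be read in a text editor.
DOCUMENT = """<!DOCTYPE p [
  <!ENTITY hbar "&#8463;">
]>
<p>E = &hbar;.omega</p>"""

paragraph = ET.fromstring(DOCUMENT)

# The parser has replaced the mnemonic name by the character it was declared to mean.
text = paragraph.text
```

What the author types and reads is the mnemonic name; only the one declaration in the header has to mention the unmemorable number.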
Human ingenuity has found ways to persuade octet-streams to do all manner of fancy things, so that the new medium can be manipulated via all manner of abstractions and made to behave as if it were any other medium one wishes. However, I still sit at my key-board typing – generating a sequence of characters, just like that underlying octet-stream. I can wave a mouse around and do fancy things, but I find it far clumsier than any pen on paper; and I can write faster with a key-board than with a pen. Thus – 'though computers enable us to express many other media as octet-streams – I still, as author, work in a medium consisting fundamentally of a sequence of characters. Furthermore, it is much easier for software to manipulate what I have written (whether to check what I have written in search of errors, to extract meaning from it for use in other media, or to enable a search-engine to grasp which bits of it to index) when the meaning is directly evident in the octet-stream the program sees, rather than having to be revealed by software of far greater complexity than the manipulations I actually wanted the program to perform.
In consequence of this, I have accepted the inevitability of radically different denotational methods and, consequently, sweeping changes to notation. Anyone capable of understanding the subject matter should be able to make sense of the notation I'm creating: I aim to be intelligible to a newcomer, unfamiliar with the mathematics under discussion (who, therefore, would have been learning a new notation in any case); and those who already know the mathematics should have less trouble making sense of it all than any newcomer.
[With apologies to those who speak English (as opposed to
its close relative, USAish) for calling mathematics "math": however,
USAish is the base language for HTML, not
(whatever the authors may say) English.]
In the mid 1990s, when my web-site was new and I was devising notational
forms, W3.org managed to discuss putting
support for mathematics (and other scientific notations) into HTML: however,
in the drive towards enabling HTML authors to be able to specify the
appearance of their pages (i.e. layout) in graphic detail, this attempt at
enabling folk to express serious content (via markup) fell by the
wayside. The original draft of MATH mode in HTML faded away and didn't get to
be part of the common functionality one could rely on from browsers. After
all, you can always write it in TeX and convert the dvi output to
postscript. Pity about the lack of hypertext links, though. By the time
XML begat MathML, necessity had brought forth inventions I preferred.
In the early days of my HTML-writing, I cheerfully expected support for
MATH to be along real soon now
and began writing pages which used it,
in anticipation of the day I could get a version
of arena (or,
subsequently, Amaya) robust enough to
show me my pages. You may, consequently, find stray uses of the old MATH mode
in some of my early pages, but I began removing it even while I still
entertained the delusion that MATH-mode might go forward.
During the browser wars
of the late '90s, Microsoft
(desperate to illegally protect its application barrier to entry
against the prospect of cross-platform applications – Java's write
once run anywhere
goal) and Netscape added features
to their
browsers which were willfully incompatible (with one another and with the
W3C's specifications) and encouraged web-site authors to use
their extensions
in preference to (rather than as enhancements of) the
W3C's specifications. The resulting pages
– optimized
for one or
another of these competitors – frequently looked positively dreadful in
any other browser (whereas: good use of extensions uses the base standards to
make a page that works for all browsers, then tweaks it using extensions so as
to work better for browsers supporting the extension). To me this amounted to
the authors sneering at anyone who wasn't using the same browser as them:
which offended me, so I set out to avoid being so rude to my readers. While I
shalln't go much out of my way to cater to deficiencies of any browser (no
matter how large its market share) which fails to support the W3C's standards,
I do intend that my pages should be intelligible to any half-way decent
browser. Since MATH-mode wasn't widely supported, use of it violated that
intent, so I backed away from it.
That forced me to work in plain text: and, once I'd solved the problem of saying what I mean in this more easily-typed medium, I ceased yearning to be able to put, on my pages, something resembling the notations designed for a freely-moving hand writing on a two-dimensional continuum. Some day I might take the trouble to write versions of some of my pages using MathML – it's beginning to be supported by browsers (Opera, for example, includes support for it from version 9.50) – but it's never going to be a high priority on this site.
My attempts at using the MATH functionality I once expected are mostly
limited to one relatively intelligible piece: HTML character entities for
symbols. There might almost be some surviving uses of underscore _
and
caret ^
to delimit subscripts and superscripts respectively (though I
try to avoid subscripts and superscripts). Given that I've seen browsers
actually coping with SUB and SUP tags (for which _ and ^ were merely a
short-hand), I've abandoned the use of ^ and _ – except that I use ^ as
an antisymmetric product operator. As explained above,
under naming, I'm still using HTML character entities,
though only where I either find the entity name mnemonic or am used to
browsers decoding it correctly. Since mid-2005, I've begun using XHTML
(despite the risk that some browsers may cope poorly with this newer, although
now fairly well-established, standard) to enable pages to use mnemonic
character entities and have them displayed suitably.
In using the character entities (in HTML), I've assumed that when a
browser meets &burble; it either shows it as such or decodes it and
displays the appropriate character. Since the names are at least mnemonic,
this should be intelligible, though ugly, even on browsers which don't decode
them. So if you see something preceded by {
(left curly brace) and followed by }
(right curly brace), you'll soon learn to recognise it as
enclosed in curly braces {i.e., it's a denotation for a collection or
set}. Likewise, if you meet something saying &isin;, read it as though the
words "is in" appeared in its place. You'll find a fairly
full list of
the valid symbols in the W3.org tour of HTML 3 and a
full list of the ones I know in
my list of HTML Character Entities.
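The behaviour assumed here (decode the names you know, show the rest as typed) is also what Python's standard html.unescape function does, which makes for a quick demonstration. This is only an analogy for what a browser might do; the sample strings are made up, and Python's table of known names is the HTML5 one rather than whatever any particular browser supports.

```python
import html

# A name in the standard table is decoded to its character.
known = html.unescape("x &isin; S")

# Names outside the table are left exactly as typed, so the reader
# still sees the mnemonic, wrapped in its funny punctuation.
invented = html.unescape("P &implies; Q")
unknown = html.unescape("&burble;")
```

So a reader of the undecoded text loses the pretty symbol but not the meaning, which is all that the mnemonic names were meant to guarantee.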
Post-script (2004/March): I note that XML+CSS can do a very nice job of rendering mathematics in classical notation. The MathML markup system has now been defined but I'm unimpressed with it, though there's lively debate on its merits as compared to XML+CSS and some folk are toying with the idea of an XHTML+CSS solution. The very fact that they see a need to do that speaks volumes about MathML.
Where I couldn't find an HTML MATH mode character denotation for a symbol
I wanted, I borrowed from TeX or made one up in the same spirit. I still do
this to some extent. Thus if you meet &csname;
and
recognise \csname
from TeX, please read it as the latter. In
general, the names used are fairly clear as to what they mean – so
ignorance of TeX shouldn't present much trouble; and I'll strive to always
introduce them properly. Such bogus character entities are known to be probably
bad HTML and I list them below, without concerning myself over which are
borrowed from TeX and which I've invented.
implies (roughly =>, a double-shafted rightwards-pointing arrow) or ⇒
if and only if (roughly <=>, a double-shafted double-headed arrow) or ⇔, meaning the statements to either side imply one another.
maps to (roughly |--> i.e. a right-arrow with a short vertical bar at the left end of its shaft) or ↦
o) or ∘
I'm hanging some bits off here pending plundering them for anything for which I don't have a better home, possibly ditching the rest. The first two mainly just need notational conversions, I think.