The many worlds reading of quantum mechanics entertains an extremely liberal
notion of what possibilities the universe could try and gets away with it by
describing reality in terms of a probability distribution on those possible
universes, which attributes negligible probability to huge swathes of the
possibilities, notably to the ones that a less liberal notion could declare
impossible while still matching up with our experiences.

For example, consider a piece of string (in the mundane world's sense, a twisted bundle of fibres) tied between two end-points and hanging in a gravitational field. Classically, the string forms a catenary (that particular shape of curve string typically hangs in); but quantum mechanics, like at least one of the classical approaches (the Lagrangian or Hamiltonian formalism), describes this in terms of the collection of all possible arrangements of the piece of string between the end-points. Each arrangement implies a certain total energy – which will depend on how tension (or compression) varies along the string and how much of its mass is at what altitude. Classical versions of this approach then identify the arrangement with least energy; quantum mechanics, along with thermodynamically enhanced classical models, identifies a probability distribution on the (mostly lower-energy) arrangements.
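The thermodynamic variant of this can be sketched with a toy model. What follows is only an illustration under loud assumptions, none of which come from the text above: the string is a chain of six equal links on a lattice, each link stepping one unit down or up; the energy counts only the heights of the joints (tension is ignored); and arrangements are weighted by a Boltzmann factor at a made-up temperature. The names STEPS, TEMPERATURE, arrangements, heights and energy are hypothetical.

```python
import math
from itertools import product

STEPS = 6          # number of links in the toy string (assumed; must be even)
TEMPERATURE = 1.0  # kT in units of joint-weight times lattice spacing (assumed)

def arrangements(steps):
    """Yield every arrangement: each link goes down (-1) or up (+1) one unit,
    and the far end-point is level with the near one."""
    for moves in product((-1, 1), repeat=steps):
        if sum(moves) == 0:
            yield moves

def heights(moves):
    """Heights of the interior joints of the string for one arrangement."""
    level, joints = 0, []
    for move in moves[:-1]:
        level += move
        joints.append(level)
    return joints

def energy(moves):
    """Potential energy only: unit weight at each joint times its height."""
    return sum(heights(moves))

all_arrangements = list(arrangements(STEPS))

# Classical answer: the single arrangement of least energy.
lowest = min(all_arrangements, key=energy)

# Thermodynamic answer: a probability distribution favouring low energy.
weights = {a: math.exp(-energy(a) / TEMPERATURE) for a in all_arrangements}
normaliser = sum(weights.values())
distribution = {a: w / normaliser for a, w in weights.items()}
```

Here lowest is the V-shaped arrangement (three links down, then three up), the lattice's stand-in for the catenary; distribution gives it the largest single share of probability while still leaving some for the nineteen higher-energy arrangements.
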
This approach, though originally devised for small well-understood isolated
systems, is applicable to the universe as a whole, for which each possibility
is a history of the entire universe.  A probability distribution on histories
induces one, for each moment of time, on the possible states of the universe at
that time; I'll call this the instantaneous distribution.  The instantaneous
distribution, seen as a function of time, tells us how the universe unfolds its
story.

The probability distribution's dependence on our experiences (the equivalent of
the end-points of the string, its slack length and the distribution of its
weight along its length) amounts to inferring a likelihood for each possible
history, given the constraint that it agrees with our experience.  Fortunately,
this saves us not only any need to describe universes radically different from
our own but also any need to choose the terms in which they are to be expressed
and compared against our reality: the experiences from which we infer the
distribution come with a language in which to describe the different
possibilities that are compatible with experience.  A different choice of
descriptive terms that still mean the same thing (in the sense that that which
we call a rose, by any other name, would smell as sweet) would give the same
answers; we prefer terms of description which simplify the process of relating
the physics of the system to the description's expression of our experience.

Each possibility is a history of the entire universe; picture it as a thread,
with time varying along the thread (and all spatial dimensions set aside).  The
threads representing two histories lie close together in so far as the
histories resemble one another.  Those histories consistent with our data (our
experience of reality) all run close together for those portions of their
length which tell their accounts of what we've experienced, differing from one
another only in details outside our experience.  These real threads thus form a
bundle, at least for a portion of their length and, though they may spread out
subsequently, their initial similarities will tend to keep them together for
the most part.  Likewise, the instantaneous distribution is closely constrained
where histories are describing our past, but disperses thereafter.

Within the real bundle, there are possibilities which differ in matters which
we shall, as time brings us to experience the relevant portion of our history,
be able to distinguish (archetypically, which face of the die was uppermost
when it stopped rolling, after we cast it?).  The non-deterministic details of
quantum mechanics allow things the deterministic models of classical physics
wouldn't allow, but only in ways that look like some threads within the bundle
reinforcing one another; and, in so far as we can distinguish two portions of
the bundle, threads in one don't reinforce those in the other in this way
(yielding the collapse of the wave-function, splitting one bundle into
several).

When I get to the moment in the universe's history when the die comes to rest
and I read the number on its face, I know which portion of the bundle I'm in;
meanwhile, the probability distribution I stop using at that moment (because
I've just replaced it with that of my portion of the bundle) speaks of an
alternate me in each of the other portions of the bundle who experiences a
different refined bundle.  Before that moment, I regarded no future alternate
me as real, only potential; after that moment, I regard one of them as real,
namely the one I find myself to be, and none of the others.

If we're to discuss the universe from a quantum-mechanical point of view, we
have to discuss the state of the universe and we have to discuss matters
relating to probability densities over collections of states of the universe.
While some may find this daunting, it is actually not that hard.  So here goes.

We start by describing probability theory on a classical universe. In principle, we consider the collection of all possible states of the universe. Each of these is a complete story of the whole universe: if it describes a universe with a beginning and an end, we can say that it tells the story from start to end. We have a probability measure on this collection of universe-histories. The probability of any event happening is the measure of the subset of universe-histories in which that event happens. I can look at my observations of the world about me and ask for the probability of those happening: more usefully, I can ask for the probability that other things will happen given that those observations have happened. (Formally, the probability of A given B is the probability of A and B divided by the probability of B: so we're really looking at the collection of universe-histories which fit my observations, asking for the probability of some other things happening within that collection and normalising sensibly.)
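A minimal sketch of that formula, assuming a deliberately tiny stand-in for the collection of universe-histories: two rolls of a fair die, with every history equally likely. The names histories, measure, prob and prob_given, and the two example events, are made up for the illustration.

```python
from fractions import Fraction
from itertools import product

# Toy "universe-histories": every way two rolls of a die can come out,
# each given equal measure (an assumption of this sketch).
histories = list(product(range(1, 7), repeat=2))
measure = {h: Fraction(1, len(histories)) for h in histories}

def prob(event):
    """Measure of the subset of histories in which the event happens."""
    return sum(p for h, p in measure.items() if event(h))

def prob_given(a, b):
    """Probability of A given B: P(A and B) divided by P(B)."""
    return prob(lambda h: a(h) and b(h)) / prob(b)

def observed(history):
    """My observation so far: the first roll showed a six."""
    return history[0] == 6

def total_at_least_ten(history):
    """Some other thing that may or may not happen."""
    return sum(history) >= 10

p_before = prob(total_at_least_ten)                 # no observations yet
p_after = prob_given(total_at_least_ten, observed)  # given my observation
```

With these definitions p_before is 1/6 and p_after is 1/2: restricting attention to the histories that fit the observation, and renormalising, changes what the measure says about everything else.
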
So, suppose we look at our probability density on universe histories and ask
how much we can say about it as we add progressively more observations to the
givens.  Among the histories which say that I do set up some experiment, say I
roll a die on some occasion, we can ask for the probability distribution, given
that I perform that experiment, of any particular outcome of the experiment
(understood as the number on the die when it stopped rolling).  We can also ask
for the probability distributions associated with any manner of other physical
processes and we can condition these on my performing the experiment: equally,
we can further condition them on any particular outcome of that experiment.

In so far as the outcome of the experiment (the roll of the die) is independent
of some other physical quantity, the latter's distribution conditioned on the
fact that I conducted the experiment is the same as that further conditioned on
any particular outcome.  In general, we can take any a priori probability
distribution on a collection of states in which I conduct the experiment and
induce, from it, distributions on each collection of states in which I conduct
the experiment and obtain one particular outcome.  Crucially, I can take two
a priori distributions, average them (with, in fact, any positive weightings),
use the average as a priori distribution and induce a distribution from it for
each branch with a particular outcome; in classical probability theory, this
last will just be the (correspondingly weighted, as appropriate) average of the
distributions induced from the original a priori distributions.

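This can be checked on a small example. The sketch below assumes states that are just (weather, die outcome) pairs and two invented a priori distributions; the "appropriate" weighting is taken to be each original weight times the probability its distribution gives to the outcome in question. All names and numbers are hypothetical.

```python
from fractions import Fraction as F

# Two made-up a priori distributions over (weather, die outcome) states.
prior_1 = {("rain", 1): F(1, 4), ("rain", 2): F(1, 4),
           ("dry", 1): F(1, 8), ("dry", 2): F(3, 8)}
prior_2 = {("rain", 1): F(1, 10), ("rain", 2): F(2, 5),
           ("dry", 1): F(3, 10), ("dry", 2): F(1, 5)}

def mix(p, q, wp, wq):
    """Weighted average of two distributions, renormalised."""
    return {s: (wp * p[s] + wq * q[s]) / (wp + wq) for s in p}

def outcome_prob(p, outcome):
    """Probability that the die shows the given outcome."""
    return sum(v for (_, o), v in p.items() if o == outcome)

def condition(p, outcome):
    """Distribution induced on the branch where the die shows the outcome."""
    norm = outcome_prob(p, outcome)
    return {s: v / norm for s, v in p.items() if s[1] == outcome}

def average_of_conditioned(p, q, wp, wq, outcome):
    """Condition each prior first, then average, each weighted by its
    original weight times the probability it gave to the outcome."""
    a = wp * outcome_prob(p, outcome)
    b = wq * outcome_prob(q, outcome)
    cp, cq = condition(p, outcome), condition(q, outcome)
    return {s: (a * cp[s] + b * cq[s]) / (a + b) for s in cp}

W1, W2 = F(2), F(3)  # any positive weightings will do
mixed = mix(prior_1, prior_2, W1, W2)
agree = {o: condition(mixed, o)
            == average_of_conditioned(prior_1, prior_2, W1, W2, o)
         for o in (1, 2)}
```

Both entries of agree come out True: averaging and then conditioning gives exactly the same distributions as conditioning and then (suitably) averaging.
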
The interesting thing about quantum mechanics is that it talks about states
which can be thought of as superpositions of other states, in much the same way
that classical probability theory superposes states: but this superposition
does not evolve as though each of the superposed states evolved independently.
The coefficients used in superposing the states evolve interestingly in their
own right: they are not constant and their variation may depend on one another
and on relationships among the states they scale in the superposition.  This
has the same effect as if states describing different things were forever
decaying into one another.

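To make the coefficients' behaviour concrete, here is a sketch of the simplest case I can think of: two states whose coefficients are coupled to one another at a constant rate. The coupling, its strength and the units (Planck's constant over 2π taken as 1) are assumptions of the illustration, not anything derived above; coefficients and probabilities are hypothetical names.

```python
import math

COUPLING = 1.0  # assumed coupling strength between the two states

def coefficients(t, c0=1 + 0j, c1=0j):
    """Coefficients at time t, solving
        i dc0/dt = COUPLING * c1,   i dc1/dt = COUPLING * c0
    exactly from the starting values c0, c1."""
    cos, sin = math.cos(COUPLING * t), math.sin(COUPLING * t)
    return (cos * c0 - 1j * sin * c1, -1j * sin * c0 + cos * c1)

def probabilities(t):
    """Squared magnitudes of the two coefficients at time t."""
    return tuple(abs(c) ** 2 for c in coefficients(t))

start = probabilities(0.0)              # all weight on the first state
quarter = probabilities(math.pi / 4)    # weight shared equally
half = probabilities(math.pi / 2)       # all weight on the second state
```

Each coefficient's rate of change is set by the other one, so the weight drains from the first state into the second and, later, back again, while the two probabilities always sum to one.
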
The collapse of the wave-function corresponds to splitting up a bundle, above.
We can, usually, express our superposition in terms of any of several
collections of states: the interesting physics of observation is that there are
circumstances where some particular decomposition can be treated as though it
were a probabilistic superposition.  That is, we can take the various summands
in the superposition and see how each of them would evolve if it (suitably
normalised) were all that we had: doing so and superposing the results of
evolving the summands separately, we get the same result as when we evolve the
whole superposition, as for a classical superposition.  When our superposition
can be decomposed in this manner, the corresponding bundle of universe-histories
has split into separate portions, each of which then goes its own way, though
they may come back together later.

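A toy illustration of when a decomposition can be treated probabilistically, under assumptions of my own choosing: the system is a single bit, there is a second "record" bit that may or may not tell the two summands apart, and the evolution is one step that mixes the system bit (a Hadamard step) while leaving the record alone. All names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)
KEYS = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (system bit, record bit)

def basis(key):
    """The normalised state with all its weight on one (system, record) pair."""
    return {k: (1.0 if k == key else 0.0) for k in KEYS}

def mix_system(state):
    """One evolution step mixing the system bit; the record bit is untouched."""
    new = {}
    for rec in (0, 1):
        a0, a1 = state[(0, rec)], state[(1, rec)]
        new[(0, rec)] = S * (a0 + a1)
        new[(1, rec)] = S * (a0 - a1)
    return new

def system_probs(state):
    """Probability of each value of the system bit, whatever the record says."""
    return [sum(abs(state[(sys, rec)]) ** 2 for rec in (0, 1))
            for sys in (0, 1)]

def evolve_whole(state):
    """Evolve the whole superposition, then read off probabilities."""
    return system_probs(mix_system(state))

def evolve_summands(state):
    """Evolve each summand alone (normalised), then combine the resulting
    probabilities weighted by each summand's share of the superposition."""
    totals = [0.0, 0.0]
    for key, amplitude in state.items():
        weight = abs(amplitude) ** 2
        if weight > 0.0:
            alone = system_probs(mix_system(basis(key)))
            totals = [t + weight * p for t, p in zip(totals, alone)]
    return totals

# No record: both summands carry record bit 0, so nothing tells them apart.
unrecorded = {(0, 0): S, (0, 1): 0.0, (1, 0): S, (1, 1): 0.0}
# With a record: the record bit matches the system bit in each summand.
recorded = {(0, 0): S, (0, 1): 0.0, (1, 0): 0.0, (1, 1): S}

results = {
    "unrecorded": (evolve_whole(unrecorded), evolve_summands(unrecorded)),
    "recorded": (evolve_whole(recorded), evolve_summands(recorded)),
}
```

In the unrecorded case the whole superposition ends up entirely on system value 0, while treating the summands separately predicts an even split, so the decomposition cannot be read probabilistically. Once the record distinguishes the summands, the two calculations agree on an even split, which is the sense in which the bundle has divided into separate portions.
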

 Written by Eddy.