]>
Statistical physics is the orthodox formalism for describing thermodynamics, the study of heat and temperature. It's statistical because it normally concerns itself with systems with quantities of matter made up of large numbers of particles (atoms, molecules, ions or whatever). This matter may be a plasma, a gas, a liquid or a solid.
For a general quantum system, given a
collection J of observables of that system and the values ({reals}: Val :J)
taken by those observables, there is a family of states of the system which are
most probable
. For any simultaneous eigenbasis of the observables in J,
these most probable states are superpositions of that eigenbasis differing only
in the phases of their components, with respect to that basis. The amplitudes
of their components can be inferred from
I shall ignore, for now, the question of whether h (or, failing that, R(Val)) must necessarily be unique.
Macroscopic systems, including nearly everything with which we normally interact, are made up of many particles of comparatively few kinds. For example, a box of air contains a mixture of gasses, roughly four fifths Nitrogen, one fifth Oxygen, with a smattering of impurities (notably the greenhouse gasses carbon dioxide and water vapour). Even fairly rare impurities, in so far as they're detectable, are present in quantities great enough to imply (at least) millions of particles of the given impurity. It thus makes sense to describe our macroscopic system as being composed of many sub-systems, each corresponding to one thing (molecule, atom or other irreducible component) present in the system, and to decompose the analysis in terms of the different kinds of thing present. It is thus convenient to consider, first, the case of a system whose parts are all of the same kind; although each part may be in a different state, each is of the same kind as the others; any state one of them could be in is a state any of the others could be in just as readilly. Such a system is described as homogeneous: all its parts are of the same kind.
If our system is made up of many identical instances of a simple sub-system, we can hope to benefit by describing the system in terms of a factorisation into many identical copies of the sub-system. If the sub-system's possible state vectors are spanned by some collection C of state vectors (e.g. when C is a basis of the states of the sub-system), we should be able to obtain a basis of the system's state space whose state vectors are of form join(f) with f a list of members of C and join being some operator which combines these. In particular, this will also enable us to define linear maps from states of the whole system by specifying their values at each join(C:f|m) with m natural and extending linearly from what this implies.
For observables of our sub-system, such as perhaps energy, we expect the observable of the whole system to simply add up the observables' values in the separate sub-systems. Thus, for observable X, we expect X(join(C:f|N)) to be sum(: σ(f(i),X(f(i))) ←i |N).join(f), with σ the inner product in the sub-system's state space; for a state c of the sub-system, σ(c,X(c)) is X's expected value in state c; which is X's eigenvalue if c is an eigenstate of X.
In particular, it is worthy of note that this is just the behaviour we would get if join were a multiplicative combinator, like the tensor product, with observables behaving as Leibniz operators, so that, for any scalar-valued operator, X(join(C:f|N)) = sum(: join(at(i,X,f)) ←i |N) with at(i,X,f) being the list (C:|N) which has X(f(i)) at position i and f(j) at each position j other than j = i. (For tensor-valued operators (U⊗C:X|C) we need to permute the U-factor in the tensor product join(at(i,X,f)) out to the front of the product, so that all terms have the same tensor rank (or, more particularly, get the X-induced extra rank U in the same position among their ranks), which makes the formula messier to write; but the behaviour works out similar to the scalar case, as if we had simply taken co-ordinates in U as distinct observables and then combined them after applying each.) When each entry in f is an eigenvector of X, this will produce the expected result given above, with join(f) being an eigenvector of X whose eigenvalue is the sum of the eigenvalues of f's entries.
Now our sub-systems are all identical, so the order of items in f should have no impact on the physical meaning of join(f); if s is a permutation of (:f|), then join(f&on;s) needs to be join(f) times a phase, otherwise its expected values for assorted observables will be different. Furthermore, that phase must depend only on s, implying that there should be some mapping ({phases}: φ |{permutations}) for which φ(s).join(f&on;s) is independent of (m|s|m), for given (C:f|m). As noted above, we can define linear maps by their actions on join(f) inputs; each (: φ(s).join(f&on;s) ← join(f) :) must then extend linearly to the identity on state vectors.
Now, if we compose φ(s).join(f&on;s) ←join(f) after the same for some other permutation r, we obtain φ(s).φ(r).join(f&on;r&on;s) which must clearly be equal to φ(r&on;s).join(f&on;r&on;s), so we can infer that φ is a group homomorphism from permutations, under composition, to phases, under multiplication.
Now, like join, the bulk action of the tensor product operator, bulk(×, [a, b, …, z]) = a×b×…×z, suffices to construct, out of bases of the spaces whose vectors it combines, a basis of the compound space; in particular, as for join, this enables us to extend linearly from values given for bulk(×, (C:f|N)) to the span of these seed inputs; so we can define a linear map by extension from
Because φ is a group homomorphism, composing any permutation's φ(r).bulk(×, f&on;r) ← bulk(×, f) with the given linear map merely rearranges the terms in the sum, leaving the same linear map. Thus
presents itself as an ideal candidate for join: it's a multiplicative combinator so, as noted above, it suffices that observables act as Leibniz operators on the resulting products. Since, in fact, this permuting product combinator has all the right properties, we can use it as a model of join, whose sole meaning is within our quantum-mechanical model, so we lose no generality in actually using it as join. It remains only to determine what φ should be. As it happens, the fact that it's a group homomorphism turns out to be a tight constraint.
Theorem: there are only two group homomorphisms ({phases}: |{permutations}).
Proof: consider such a homomorphism, φ. For permutations of the empty set or of a set with only one member, the only candidate is the identity, which φ must map to +1. In all other cases, we have at least two things to permute.
For any permutation of period m, for any natural m, φ must map the permutation to a phase which, when raised to power m, yields 1. For any two distinct inputs, i and j, there is a permutation swap(i,j) which maps i to j, j to i and leaves all other inputs unchanged. This plainly has period 2 so must be mapped to ±1. However, all permutations can be expressed as composites of such swaps; so φ must map any permutation to a product of terms drawn from {+1,−1}, hence indeed to either +1 or −1. Raising −1 to an odd power gives −1, so φ has no choice but to map every permutation with odd period to +1.
Now suppose φ(swap(i,j)) = u in {−1,+1} for some distinct inputs i and j. If these are our only inputs, this (since the identity is mapped to +1) fully determines φ. Otherwise, for any other input k, swap(i,k)&on;swap(i,j) maps i to j, j (via i) to k and k to i, leaving all other inputs alone; this is a 3-cycle, so has period 3, which is odd, so φ must map it to +1; so we can infer φ(swap(i,k)) = +1/u, which is just u. We can apply the same argument now to a fourth input, h, to infer that swap(k,h) = u also; hence that φ maps all swaps to the same value, which is either −1 or +1. Since every permutation is a composite of swaps, φ's value on swaps suffices to imply its value everywhere else; and it has only two possible values on swaps – Q.E.D.
The two possibilities for φ are the trivial homomorphism – which
maps each permutation to +1 – and the signature
homomorphism,
sign, which maps swaps to −1. (Technically, that sign is
well-defined depends on a theorem about permutations which states that no
product of an even number of swaps can be equal to any product of an odd number
of swaps.) These, in turn, generate two possibilities for join; the first
builds the completely symmetrised tensor product; while sign builds the
completely antisymmetrised tensor product.
The symmetric option allows us arbitrary lists (C:f:) as inputs to join; the antisymmetric one, however, will map to zero any (C:f:) whose outputs aren't distinct (from this we may infer that it also maps to zero any (C:f:) whose outputs aren't linearly independent). This in effect means that, if φ is sign, no two sub-systems can be in the same state (indeed, no sub-system may be in a state expressible as a linear superposition of the states of the others); the number of times any given member of C appears in any f which join doesn't map to zero is in {0,1}. As it happens, particles whose spin is an odd multiple of half Dirac's constant obey exactly this rule – no two of them can be in the same state; this is called the Pauli exclusion principle. In memory of Enrico Fermi – who, with Paul Dirac, first studied the thermodynamics of such particles – these are known as fermions. Particles whose spin is a whole multiple of Dirac's constant (so an even multiple of half that) use the symmetric form of join, allowing arbitrarily many of them to be in any given state; these are called bosons after Satyendra Nath Bose – who, with Albert Einstein, first studied their thermodynamics.
If we consider the identity on span(C) applied as a Leibniz operator on the
span of products of lists with entries in C, we obtain an observable on these
products for which each join(C:f|m) has eigenvalue m; this, then, counts the
number of sub-systems; call it N. Likewise, for each state c of the sub-system,
there's a formal observable c×σ(c) which measures the probability of
a sub-system being observed to be in state c; when applied as a Leibniz
operator, n(c), to states of the composite system, this sums this probability
over all sub-systems, yielding the expected number of sub-systems observed in
the given state. If C is a basis and c is in C, then any list (C:f:) has, as
its eigenvalue for n(c), the number of members ({c}:f|) has i.e. the number of
sub-systems in state c when the whole system is in state join(f). Of course,
because we have a quantum-mechanical system, we may have it in a state which is
a super-position of eigenstates of these number operators; this is then apt to
not be an eigenstate of them; which is why n(c) only determines an expected
value
(i.e. average) for each, in the usual way.
The outputs of (: join :{lists (C::)}) span the states of our system. Given the above Leibniz rule reading of the action of observables on these: we can, when C is a basis, treat each c in C as being an eigenstate, with eigenvalue 1, of both n(c) and N; and an eigenstate with eigenvalue 0 of all the other n(e) for e in C. Thus, as long as C is an eigenbasis of all the observables in our collection J of observeds, it is indeed consistent to entertain N and the n(c) as members of J. On this same hypothesis, define a mapping F = (: (: σ(c,X(c)) ←c :C) ←X :{observables}), which extracts the eigenvalues for the individual single sub-system states; and, using S for the metric on the composite space, define count = (: (: S(join(f), n(c,join(f))) ←c :C) ←f :{lists (C::)}), for which count(f,c) is just the number of members ({c}:f|) has. For fermions, we only consider those f for which ({0,1}: count(f) |C); for bosons, any count(f,c) can take any natural value. Let Q = {0,1} if our sub-systems are fermionic; but Q = {naturals} if they are bosonic.
We have Z = ({observables}: exp(h(J)) ←h :{distributions on J}) and Y = log&on;trace&on;Z; with trace being a summation over all members of some orthonormal basis of the space of states of the compound system. Expressing any distribution h on J in terms of the ({reals}: H :J) mapping for which h = H@sum, summation weighted by H, we have Z(h) = exp(sum(: X.H(X) ←X :J)). This is exp applied to an operator diagonal with respect to our basis {lists (C:f:): (Q: count(f) |C)}; hence Z(h) is likewise diagonal, with each diagonal entry being (the usual scalar) exp applied to the corresponding diagonal entry of log(Z(h)). Thus the diagonal entry of Z(h) for basis member join(f) is
Summing over all possible lists (C:f:) is the same as taking a sum in which each count(f,c) is allowed to range over the whole of Q. Thus:
Now, for given v, sum(: power(j, v) ←j |Q) is just 1+v for fermions or 1+v+v.v+v.v.v+… = 1/(1−v) for bosons. Either way, the sum's log is ±log(1±v), with ± being + for fermions, − for bosons. Thus we can infer
So now let's look at Y', once more interpreting h via some H for which h = H@sum = (: sum(: f(X).H(X) ←X :J) ←f :), so that
and we're selecting an h which makes this equal Val, i.e. for which, for each X in J,
In particular, we can insert X = n(c) for some c in C; F(n(c),c) = 1 and F(n(c),e) = 0 for all other e in C; so we can infer Val(n(c)) = 1/(exp(−h(F,c))±1) as the expected number of sub-systems in state c. Likewise, each c in C has F(N,c) = 1 so Val(N) = sum(: 1/(exp&on;h(−F)±1) |C). In particular, for each observable X, we obtain
as we might naïvely have expected; each observable's value in our actual state is the sum, over sub-system states, of the observable's expected value for the given state times the number of sub-systems in that state. Now, from h we are to compute R(Val) = Z(h) / trace(Z(h)), which is diagonal in our chosen basis. From our value for Y(h), we can compute
so that the diagonal entry of Z(h)/trace(Z(h)) for basis state join(f) is
and to do much more than this, we need some description of the possible states, and actual observables, of our system. (However, generally one of our observables is energy, E in J, and the corresponding H(E) turns out to be 1/k.T, where T is the temperature and k is Boltzmann's constant, k = 13.805e-24 Joules per Kelvin.)
In a system composed of a mixture of many kinds of sub-system – e.g. a mixture of many distinct kinds of molecule – we can apply the analysis above to each distinct kind of sub-system and then combine the results. Given that the above reduces description of the particles of each type to describing products, averaged (with suitable sign) over all permutations, of states of one particle, it makes sense to describe a heterogeneous system's state as a (tensor) product of terms, each of which is a state of all the particles of one kind. The various observables are then Leibniz operators, as before; when acting on products of eigenstates, they simply scale the product by the sum of the eigenvalues of the various states making up the product.
For each species, A, of sub-system we get a number operator, N(A), whose expected value (and, in eigenstates, eigenvalue) is the number of sub-systems of that type. These formal observables correspond (via Avogadro's number per mole) to the amount of each substance present, which is commonly an actually observed property of the system, so these number operators are naturally significant. We also get a mapping (: C |{species}), for which each C(A) is a basis of the states of a single sub-system of kind A. As before, for each c in any C(A), we get a number operator n(c) and each F(X) becomes a mapping ({reals}: |union(C)), wherein union(C) is implicitly a disjoint union (since distinct species yield distinct sets of states, hence bases; even if all their other parameters are equal, a in C(A) and e in C(B), with A and B distinct, describe sub-systems of distinct kinds). We can partition {species} into {bosonic species} and {fermionic species} and introduce D = union(:C:{fermionic species}) and E = union(:C:{bosonic species}), again implicitly using disjoint union. This yields:
If every exp(h(F,c)) is small, Y(h) is well approximated by sum(: exp(h(F)) |union(C)), but this seems unlikely to be realistic.

Written by Eddy.