The function factorial = (: n! ←n |{naturals}) is defined inductively by
yielding 1! = 1, 2! = 2, 3! = 6, 4! = 24, 5! = 120, 6! = 720, 7! = 5040
and so on; notice that, after a slow start, it grows very rapidly. In
particular, for any scalar x, (: power(n, x) ←n |{naturals}) grows more
rapidly than factorial for n smaller than x (i.e. abs(x) > n) but, since
there is some natural larger than x and all subsequent naturals are larger yet,
factorial ultimately grows faster than (: power(n, x) ←n :). Consequently,
(: power(n, x) / n! ←n :) ultimately stops increasing and starts
decreasing; furthermore, since each successive n is bigger than the previous
one, each decrease is by a successively bigger factor n/x, so ultimately (:
power(n, x) / n! ←n :) tends to zero as n grows large. Thus factorial
grows faster than exponentially
which, in turn, is faster than any
polynomial function.
Factorial shows up in many branches of mathematics, often by way of combinatorics - the theory of combinations. When n distinct objects are considered, the number of orderings of those objects is n!, as may be seen by observing that we may chose any of the n objects first; after which we have n-1 among which to chose as second; then n-2 for third and so on until we are left with three, two and finally one choice. The same argument implies that there are n! permutations of any set of n items. If we want, instead, to chose some subset, of our n objects, with i members in the subset, we find there are n!/i!/(n-i)! such subsets: as before, we have n choices for the first item, n-1 for the second and so on down to n+1-i choices for the i-th member of our subset; but this strategy of chosing will give any particular i-member subset in each of its i! orderings, and a subset is characterised only by the members it has, not the order in which we collected them; so we must divide the number of ordered choices, n.(n-1)...(n+1-i) = n!/(n-i)! by the number of times each subset shows up in different orders, i!, giving n!/i!/(n-i)!.
There's also a function on the complex plane which turns out to coincide with factorial at integer inputs. It's most orthodox to specify a shifted version of this function, known as the Gamma function or Γ (that's the greek letter roughly corresponding to G), defined by:
for which we can show that
to which we can apply integration by parts, interpreting exp(-t) as -exp'(-t) so that the integrand is one term in the derivative of -exp(t).power(x, t), yielding
but, when x > 0, power(x, 0) = 0 makes one end of the change() zero; while the other is dominated by exp(-t) tending to zero as t grows, so is also zero, making the whole change zero; leaving
from which, by induction, we can infer that Γ(n+1) = n!.Γ(1) for each natural n. Now, Γ(1) is just integral(: exp(-t) ←t |{positives}) which is change(: -exp(-t) ←t |{positives}) = 0 - (-1) = 1, whence Γ(1) = 1 and Γ(1+n) = n! for every natural n. It thus makes sense to extend the definition of factorial to allow n! to mean Γ(1+n) even when n is not natural.
While evaluating Γ directly is generally non-trivial (though see below for good approximations) the above enables us to determine its value at positive integer inputs; from Γ(e) it's also possible to use the recurrence relation Γ(x+1) = x.Γ(x) to infer Γ's values at all integer+e inputs, for any non-integer e. In particular, this implies that we never need to compute Γ for any input with real part outside a range of width 1, e.g. from 0 to 1 or from 1 to 2.
To obtain the integer+1/2 values, consider
substitute u = √t, u.u = t, 2.u.du = dt, 2.du = dt/√t, giving
and this integral over negatives is necessarily equal to that over positives, so twice it is simply the integral over all reals;
we can now apply a standard trick for integrating Gaussians, via computing the square root of the square of the integral:
interpret x and y as Cartesian co-ordinates in two dimensions:
now switch to polar co-ordinates in two dimensions, with x = r.Cos(θ), y = r.Sin(θ), r in {positives}, θ ranging from -turn/2 to turn/2; the transformation's Jacobian is r/radian, yielding
but the integral doesn't involve θ so integrating over its range simply multiplies the inner integral by θ's range, turn; pulling out the division by radian from the Jacobian this gives a factor of turn/radian or 2.π. Meanwhile, we can substitute v = r.r, dv = 2.r.dr to simplify the inner integral:
from which we can infer exact values for Γ at all odd multiples of 1/2; (1/2)! = (√π)/2, (3/2)! = 3.(√π)/4 and so on. Interestingly, we can wind backwards as well; (n-1)! = n!/n, so (-1/2)! = √π, (-3/2)! = -2.√π, (-5/2)! = 4.(√π)/3 and so on. (Note that we couldn't do this for integers because (-1)! requires us to divide 0! by 0; indeed, as a function on the complex plane, Γ has a simple pole at each negative integer.) For (n+1/2)! we need to compute
whence
and
the last of which enables a unification of the formulae for the volumes of spheres of various dimensions.
Define a function, chose, by: chose(n,i) is the number of distinct subsets of size i in a set of size n. Thus it's the number of distinct ways of chosing i things when n are available. Since chose(n,i) is plainly zero for n<i or i<0, we effectively have a mapping (:chose|{naturals}) each output of which is (:chose(n)|1+n). Since selecting i items among n is equivalent to chosing the n-i to not select, we can infer chose(n,i) = chose(n,n-i) for all n, i. It should also be clear that chose(n,0) = 1 = chose(n,n) for all n.
Now, suppose we know chose(n,i) for all i, for some n: consider chose(1+n,i) for some i. Chose any one member, M, of the set of 1+n objects; we are thus chosing among n un-named objects and one that we've arbitrarily named M. Any sub-set of size i either omits M, in which case it's a sub-set of the n un-named objects, or includes M, in which case it's the union of {M} and a sub-set of size i-1 drawn from among the un-named objects. There are chose(n,i) ways to do the former and chose(n,i-1) to do the latter. We thus infer chose(1+n,i) = chose(n,i) +chose(n,i-1), at least when 0<i≤n.
Now, observe that chose(n,i) = n!/i!/(n-i)! at least when i = 0 and i = n; in particular, this formula correctly gives chose(n,i) for all i for n = 0. Suppose it gives the right answers for all i for some given n (e.g. 0). Now consider chose(1+n,i) for any i. If i = 0 or i = 1+n we already know the formula holds. Otherwise, 0<i≤n and we obtain
so we find that the formula correctly gives chose(1+n,i) for all i as well; since the formula does work for all i from 0 to n for n = 0, we can now induce that it works for all i from 0 to n for each natural n. Using Γ (or otherwise) we can extend chose to be a function of two complex values instead of two natural numbers.
The function chose shows up in many places. Notably, if we consider power(n,x+y), expanded as a polynomial in x and y, we observe that the term in power(i,x).power(j.y) can only have non-zero co-efficient if i+j = n; it arises from selecting x from i of the x+y factors and y from the remaining j = n-i factors; and there are just chose(n,i) ways to do this, so
For large n, log(n!) is well approximated by the formula
which is attributed to Stirling. There's a closely related approximation given by approximating log&on;factorial via:
for some natural N, (:b|N) and a, g (which also depend on N); approximations of this form are attributed to Lanczos. With N = 6, g = 5, a = 1.000000000190015 and b = [76.18009172947146, -86.50532032941677, 24.01409824083091, -1.231739572450155, 0.1208650973866179e-2, -0.5395239384953e-5], an approximation of this form can be accurate to two or three parts in a million million for modest values of z.
Apparently, there's an easy and elegant way to derive Stirling's form. I should also derive the sinc(x.π).Γ(1+x).Γ(1-x) = 1 result.
David MacKay, in
his book Information theory, Inference and Learning Algorithms
(p2) gives
a rather nice approach to deriving Stirling's formula. The Poisson distribution
for a natural-valued random variate with mean λ has probabilities for its
outcomes proportional to (: power(i, λ)/i! ←i :{naturals}); since
these sum to exp(λ), the constant of proportionalit is exp(-λ).
If we add two such random variates with means λ and μ the result has
probability distribution
which is just the Poisson distribution again, but now with parameter λ+μ. The central limit theorem says that if we sum a large number of identically-distributed random variates with finite variance then the distribution of the sum tends, as the number of terms increases, to be well approximated by the normal distribution, at least where its probabilities are significant. The Poisson distribution with parameter λ has mean and variance λ; in particular it has finite variance. Adding large numbers of Poisson distributed variables yields a Poisson distributed variable with parameter equal to the sum of the parameter-values of the distributions summed. Thus the Poisson distribution, when its parameter is large, is well approximated by the Gaussian distribution, at least where either has non-negligible probability; for example near their mean. If the Poisson distribution's parameter is λ then its mean and variance are λ. The Gaussian with mean and variance λ has probability density (: exp((x-λ).(x-λ)/2/λ) ←x :)/√(2.π.λ) so we infer, for i near λ, that exp(-λ).power(i,λ)/i! is well approximated by the integral of this distribution across the interval between i±½ – an interval of width one, in which the density's second derivative is small, so the integral is well approximated by the interval's width times the density at its middle, i.e. exp((x-λ).(x-λ)/2/λ)/√(2.π.λ). So now consider the case i = λ and we have
whence i! ≅ power(i+½,i).exp(-i).√(2.π), whence log(i!) = (i+½).log(i) -i +log(2.π)/2.
Valid CSS ? Valid HTML ? Written by Eddy.