There's a standard class of problem in science where we wish to maximise (or
minimise; either way, we seek stationary values, at which small perturbations
of the input yield negligible change in output) some quantity while abiding by
assorted constraints. Often enough, those constraints can be expressed as
demanding that some linear function maps the inputs of the problem to some
specified value, if for no other reason than that one can often re-express the
problem using parameters in terms of which the constraints are linear. It turns
out that there is a quite general method for solving problems of this class.
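The standard problem of this kind involves a domain D (smooth enough to have tangents at each point), a function f from D to a space V, whose stationary values we seek, and a constraint function g from D to a space W, which is required to take a prescribed value w in W.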
For f to be stationary subject to the constraint on g, at some x, we require that every trajectory through x, on which g is constant, has tangent at x in the kernel of ∂f(x); or, to put it another way, for any tangent t to D at x, if ∂g(x)·t is zero then ∂f(x)·t is also zero. Now, ∂f(x) is a linear map from D's tangents at x to V's tangents at f(x); likewise, ∂g(x) is a linear map from D's tangents at x to W's tangents at g(x). Given any linear map h from W's tangents at w = g(x) to V's tangents at f(x), we can form h∘∂g(x), which maps to zero everything that ∂g(x) maps to zero. Conversely, given that kernel(∂f(x)) subsumes kernel(∂g(x)), we can use the rank/kernel decomposition of ∂f(x) and ∂g(x) to infer just such a linear h with ∂f(x) = h∘∂g(x).
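The step from the kernel condition to the existence of h can be spelled out as follows. This is only a sketch, assuming the tangent spaces are finite-dimensional (so that h, once defined on the image of ∂g(x), can be extended linearly to all of W's tangents, for example as zero on some complement of that image); the `\text` macro assumes the amsmath package.

```latex
% Claim: the kernel condition yields a linear h factoring \partial f(x).
\[
  \ker \partial g(x) \subseteq \ker \partial f(x)
  \quad\Longrightarrow\quad
  \exists\, h \text{ linear}: \quad \partial f(x) = h \circ \partial g(x).
\]
% Construction: on the image of \partial g(x), define h by
\[
  h\bigl(\partial g(x)\cdot t\bigr) = \partial f(x)\cdot t .
\]
% Well-defined: if \partial g(x)\cdot t = \partial g(x)\cdot t', then
\[
  t - t' \in \ker \partial g(x) \subseteq \ker \partial f(x)
  \quad\Longrightarrow\quad
  \partial f(x)\cdot t = \partial f(x)\cdot t' .
\]
```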
In the case where D, V and W are linear spaces, each serves as its own space of tangents at each of its points; and f − h∘g has ∂f − h∘∂g as its derivative, for fixed linear (V:h|W). It thus suffices to seek unconstrained stationary points of f − h∘g; these shall vary with h, so we then seek which value of h yields a stationary point which g maps to w. Our constrained stationary value problem is thus reduced to a simple stationary value problem followed by a separate constraint problem; our two requirements can thus be addressed separately rather than complicating one another.
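The two-stage procedure can be illustrated with a small sketch. The particular f, g and w below are hypothetical choices made purely for illustration (they do not come from the text): D is the plane, f(x, y) = x² + 2y², g(x, y) = x + y and w = 3. The first stage finds the unconstrained stationary point of f − h·g as a function of h (done by hand here); the second stage solves for the h whose stationary point g maps to w, using simple bisection as one possible way to do that.

```python
def f(x, y):
    """Hypothetical function whose constrained stationary point we seek."""
    return x * x + 2.0 * y * y


def g(x, y):
    """Hypothetical (linear) constraint function; we require g = w."""
    return x + y


def stationary_point(h):
    """Stage 1: unconstrained stationary point of f - h*g for fixed h.

    The gradient of f - h*g is (2x - h, 4y - h); setting it to zero
    gives x = h/2, y = h/4.
    """
    return (h / 2.0, h / 4.0)


def solve_multiplier(w, lo=-100.0, hi=100.0, tol=1e-12):
    """Stage 2: find h such that g(stationary_point(h)) == w.

    Uses bisection, assuming the residual changes sign on [lo, hi]
    and increases with h (true for this example).
    """
    def residual(h):
        return g(*stationary_point(h)) - w

    if not (residual(lo) < 0.0 < residual(hi)):
        raise ValueError("bracket [lo, hi] does not contain the solution")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


w = 3.0
h = solve_multiplier(w)
x, y = stationary_point(h)
print("multiplier:", h, "stationary point:", (x, y), "f:", f(x, y))
```

For this example the answer can be checked by hand: the constraint stage requires 3h/4 = 3, so h = 4 and the constrained stationary point is (2, 1), which the bisection approximates.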
The commonest case has V = {scalars} and seeks to maximise or minimise f.
The constraints are typically expressed as several separate scalar functions,
each with a prescribed value: this amounts to taking co-ordinates in W, which
reduces h in dual(W) to a family of scalars, one per simple constraint. In this
form, h∘g is reduced to a weighted sum of the constraint functions. For
example, in thermodynamics: D is a space of possible states of the system; f is
an entropy (or information content) function; our constraint functions measure
such things as energy (:E|D) and number of particles (:N|D); we are thus obliged
to look for maxima and minima of f − βE − νN …, with each constraint function
scaled by its own multiplier; the prescribed values of E, N and so on then
constrain h = [β, ν, …] (and β emerges as 1/(k·T), the inverse temperature,
where k is Boltzmann's constant). In this form, the weight by which each
constraint function is scaled is described as the multiplier associated with
that constraint: since the method is credited to Lagrange, it is known as
Lagrange's method of multipliers.
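A toy version of the thermodynamic case may make the role of β concrete. Everything specific here is an assumption for illustration: the system has three hypothetical energy levels, a state is a probability distribution p over them, f is the entropy −Σ p·ln(p), and the constraints are normalisation (Σ p = 1, standing in for the second constraint, with multiplier ν) and a prescribed mean energy. Setting the derivative of f − β·E − ν·Σp to zero gives −ln(p_i) − 1 − β·E_i − ν = 0, so p_i is proportional to exp(−β·E_i); normalisation fixes ν, and the prescribed mean energy then fixes β, found below by bisection.

```python
import math

# Hypothetical energy levels, in arbitrary units (an assumption of this sketch).
ENERGIES = [0.0, 1.0, 2.0]


def distribution(beta):
    """Stationary point of entropy - beta*energy - nu*normalisation.

    p_i is proportional to exp(-beta * E_i); dividing by the sum of
    weights imposes the normalisation constraint (which fixes nu).
    """
    weights = [math.exp(-beta * e) for e in ENERGIES]
    total = sum(weights)
    return [wt / total for wt in weights]


def mean_energy(p):
    """The energy constraint function, evaluated on distribution p."""
    return sum(pi * e for pi, e in zip(p, ENERGIES))


def entropy(p):
    """The function f being made stationary."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def solve_beta(target, lo=-50.0, hi=50.0, tol=1e-12):
    """Find beta whose stationary distribution has the target mean energy.

    Bisection; relies on mean energy decreasing as beta increases.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_energy(distribution(mid)) > target:
            lo = mid  # too energetic: need a larger beta
        else:
            hi = mid
    return 0.5 * (lo + hi)


target_energy = 0.6
beta = solve_beta(target_energy)
p = distribution(beta)
print("beta:", beta, "p:", p, "entropy:", entropy(p))
```

Since the target mean energy is below the value 1 obtained at β = 0 (equal probabilities), the β found is positive, corresponding to a positive temperature in the reading β = 1/(k·T).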
Written by Eddy.