There's a standard class of problem in science where we wish to maximise (or
minimise – either way, stationary values are sought, such that small
perturbations about the given value, as input, yield negligible change in
output) some quantity while abiding by assorted constraints. Often enough,
those constraints can be expressed as demanding that some linear function maps
the inputs of the problem to some specified value – if for no other reason
than that one can often re-express the problem using parameters in terms of
which the constraints are linear. It turns out that there is a quite general
method for solving problems of this class.

The standard problem of this kind involves:

- three smooth manifolds: D, of possible inputs; V, in which our optimised quantity varies; and a constraint space W; and we usually have enough choice about how to pose the problem to make V and W linear,
- a smooth mapping (V: f |D) whose stationary values are sought,
- a smooth mapping (W: g |D) and a member w of W such that we are allowed only those x in D for which g(x) = w.

For f to be stationary subject to g, at some x, we require that every trajectory through x, on which g is constant, has tangent at x in the kernel of ∂f(x) – or, to put it another way, for any tangent t to D at x, if ∂g(x)·t is zero then ∂f(x)·t is also zero. The derivative of f at x, ∂f(x), is a linear map from D's tangents at x to V's tangents at f(x); and likewise for g. If we have a linear map h from W's tangents at w = g(x) to V's tangents at f(x), we can form h&on;∂g(x), which maps to zero everything that ∂g(x) maps to zero; and, given that kernel(∂f(x)) subsumes kernel(∂g(x)), we can use the rank/kernel decomposition of ∂f(x) and ∂g(x) to infer just such a linear h with ∂f(x) = h&on;∂g(x).
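This condition is easy to check numerically. The sketch below (plain Python, with toy functions assumed purely for illustration, not taken from the text) uses central differences: at a constrained stationary point of a scalar f subject to a scalar g, the two gradients are parallel, so a single scalar h suffices for ∂f(x) = h&on;∂g(x).

```python
def grad(fn, x, y, eps=1e-6):
    """Central-difference estimate of the gradient of a scalar function of two variables."""
    return ((fn(x + eps, y) - fn(x - eps, y)) / (2 * eps),
            (fn(x, y + eps) - fn(x, y - eps)) / (2 * eps))

f = lambda x, y: x * y            # the quantity made stationary (assumed toy choice)
g = lambda x, y: x * x + y * y    # the constraint function; g = 2 is imposed

# (1, 1) is a constrained stationary point of f on the circle g = 2.
gf = grad(f, 1.0, 1.0)   # roughly (1, 1)
gg = grad(g, 1.0, 1.0)   # roughly (2, 2)

# Here V and W are both the scalars, so h is a single number:
h = gf[0] / gg[0]
# and the relation ∂f = h&on;∂g holds in the other component too:
residual = gf[1] - h * gg[1]
```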

In the case where V and W are linear spaces, each serves as its own space of tangents at each of its points; and f−h&on;g has ∂f−h&on;∂g as its derivative, for fixed linear (V:h|W). It thus suffices to seek unconstrained stationary points of f−h&on;g; these shall vary with h, so we then seek which value of h yields a stationary point which g maps to w. Our constrained stationary value problem is thus reduced to a simple stationary value problem followed by a separate constraint problem; our two requirements can thus be addressed separately rather than complicating one another.
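The two-stage reduction can be sketched numerically in plain Python, on an assumed toy problem (not from the text): maximise f(x, y) = −(x² + y²) subject to g(x, y) = x + 2y = 5. First find the unconstrained stationary point of f − h&on;g for a given h; then adjust h until that point satisfies the constraint.

```python
def stationary_point(h, steps=2000, rate=0.1):
    """Unconstrained stationary point of L = f - h*g, found by gradient ascent;
    it varies with the multiplier h."""
    x = y = 0.0
    for _ in range(steps):
        x += rate * (-2 * x - h)      # dL/dx for f = -(x^2 + y^2), g = x + 2y
        y += rate * (-2 * y - 2 * h)  # dL/dy
    return x, y

def solve_for_h(w=5.0, lo=-10.0, hi=10.0):
    """Bisect on h until the stationary point satisfies g = w
    (g at the stationary point decreases as h increases, for this toy f and g)."""
    for _ in range(60):
        mid = (lo + hi) / 2
        x, y = stationary_point(mid)
        if x + 2 * y > w:
            lo = mid
        else:
            hi = mid
    return mid

h = solve_for_h()            # close to -2
x, y = stationary_point(h)   # close to (1, 2), where g = 5 as prescribed
```

The two requirements are indeed addressed separately: the inner loop knows nothing of the constraint, and the outer loop knows nothing of f beyond where its stationary points sit.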

The commonest case has V = {scalars} and seeks to maximise or minimise
f. The constraints are typically expressed as several separate scalar
functions, each with a prescribed value: this amounts to taking co-ordinates in
W, which reduces h in dual(W) to a family of scalars, one per simple constraint
(i.e. co-ordinate in W); each such scalar is a component of h. In this form,
h&on;g is reduced to a weighted sum of the constraint functions. For
example, in thermodynamics: D is a
space of possible states of the system; f is an entropy (or information content)
function; our constraint functions measure such things as energy (:E|D) and
number of particles (:N|D); we are thus obliged to look for maxima and minima of
f −βE −νN …, with each constraint function scaled by
its own multiplier; the prescribed values of E, N and so on then constrain h =
[β, ν, …] (and β emerges as 1/k/T, the inverse
temperature). In this form, the weight by which each constraint function is
scaled is described as the multiplier associated with that constraint: since
the method is credited to Lagrange, it is known as Lagrange's method of
multipliers.
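The thermodynamic case can be made concrete with a small numeric sketch (the three energy levels and prescribed mean energy below are assumed toy values, not from the text): maximising the entropy −sum(p·ln p) subject to sum(p) = 1 and a prescribed mean energy U makes p proportional to exp(−βE); the prescribed U then pins down the multiplier β, here by bisection.

```python
import math

E = [0.0, 1.0, 2.0]   # energy levels of a toy three-state system (assumed)
U = 0.8               # prescribed mean energy (assumed)

def boltzmann(beta):
    """Stationary distribution of entropy - beta*(mean energy): Boltzmann weights.
    Normalising by z plays the role of the multiplier for the sum(p) = 1 constraint."""
    weights = [math.exp(-beta * e) for e in E]
    z = sum(weights)
    return [w / z for w in weights]

def mean_energy(beta):
    return sum(p * e for p, e in zip(boltzmann(beta), E))

# Bisect on beta until the constraint <E> = U is met;
# mean energy falls monotonically as beta rises.
lo, hi = -50.0, 50.0
for _ in range(200):
    beta = (lo + hi) / 2
    if mean_energy(beta) > U:
        lo = beta
    else:
        hi = beta

p = boltzmann(beta)   # the entropy-maximising distribution at the found beta
```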