Observables in quantum mechanics are generally treated on the premise that
the values they take (their eigenvalues) are quantities of such a kind as may
be scaled and added: that is, tensor, vector or scalar quantities. However,
General Relativity requires us to describe the universe as a smooth manifold,
in which context position is *not* a vector quantity: at best, one can
*represent* positions (in modest-sized chunks of space-time) by vectors
(using a chart), but then the notions of addition and scaling this induces
depend on your choice of representation.

The need for an observable's eigenvalues to be addable and scalable arises
from the observable being a linear map from S = {system states} to V⊗S,
where V is the space of nominally possible values for the observable, among
which the eigenvalues are the feasible (or, indeed, observable)
possibilities. [The observable, (V⊗S: Q |S), then has to satisfy a
structural relationship with a given symmetric metric, an antilinear mapping
(dual(S): g |S) for which, for any s, t in S, t·g(s) and s·g(t)
are mutually conjugate; the constraint says that, for any s, t in S,
Q(s)·g(t) and Q(t)·g(s) must also be mutually conjugate; Q is
then described as hermitian.] When the observable is position on our
smooth manifold, M, V gets replaced by M, which is not a linear space, and
the ⊗ tensor space combinator is not available to us: we cannot ask which
mappings (M: |dual(S)) are linear. None the less, the position observable is
clearly meaningful, so how may we describe it?

The natural way to approach this is to look at the conventional description and seek the extent to which it is free of the requirement that the observable be a vector quantity. To this end, note that the orthodox treatment in terms of diagonalisation of an observable is alternatively described as decomposition of the observable into a sum, each term of which is the (tensor, if necessary) product of an eigenvalue of the observable with a projection operator (projector) that selects the eigenspace for that eigenvalue.

The sum of just the projectors, without multiplication by their eigenvalues, delivers the identity linear operator, and these projectors commute with one another (the product of any distinct pair is zero). For any set of values in the space within which the observable's values lie, there is a projector equal to the sum of the projectors for those eigenvalues that lie in the given set.

The probability, when the system being described is in some given state, of finding the observable to have a value in some set turns out to be the result of contracting this projector for the set with the state's bra (on the left; its image, in dual(S), under g) and ket (on the right; the member of S representing the state). One may obtain a hermitian operator (actually a projector), with trace 1, from this bra and ket: their (tensor) product the other way round. The probability just cited is then the trace of the product of this hermitian operator with the projector for the given set of values for the observable.
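
To make this concrete, here is a small numerical sketch (Python with numpy; the 2×2 observable and the state are invented for illustration) of the decomposition into eigenvalue-weighted projectors and of a probability obtained as a trace:

```python
import numpy as np

# Invented hermitian observable; its eigenvalues are 1 and 3.
Q = np.array([[2.0, 1.0], [1.0, 2.0]])
vals, vecs = np.linalg.eigh(Q)

# One projector per eigenvalue, selecting that eigenvalue's eigenspace.
projectors = [np.outer(vecs[:, i], vecs[:, i].conj()) for i in range(2)]

# The projectors sum to the identity and annihilate one another.
assert np.allclose(projectors[0] + projectors[1], np.eye(2))
assert np.allclose(projectors[0] @ projectors[1], np.zeros((2, 2)))

# The observable is the sum of eigenvalue-times-projector terms.
assert np.allclose(vals[0] * projectors[0] + vals[1] * projectors[1], Q)

# For a normalised state u, with U the ket-tensor-bra product,
# the probability of observing a value in E = {3} is trace(U·P_E).
u = np.array([1.0, 0.0])
U = np.outer(u, u.conj())                # unit-trace hermitian projector
prob = np.trace(U @ projectors[1]).real  # projector for eigenvalue 3
assert abs(prob - 0.5) < 1e-12
```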

If we do not know the state of the system but, instead, know a real
probability measure over the possible states (loosely, a probability density
for what state the system is in)
then we can integrate this ket tensor bra product using that measure: the
result will also be a hermitian operator with trace 1 (but not, as far as I
can see, necessarily a projector). Consequently, it is more natural to
describe the state of the system in terms of a hermitian with trace 1 rather
than in terms of definite state vectors: the probability of finding the system
(for which we had a probability measure over possible states) to have its
value for an observable in some set is still the trace of the product of the
projector for the observable to be in that set times the hermitian operator
with trace 1 associated with the system's state.
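
As a sketch of this (Python with numpy; the two states and their classical weights are invented), integrating the ket-tensor-bra products against a discrete probability measure yields a unit-trace hermitian that need no longer be a projector:

```python
import numpy as np

# Two invented normalised states and invented classical weights.
u1 = np.array([1.0, 0.0])
u2 = np.array([1.0, 1.0]) / np.sqrt(2.0)
weights = [0.7, 0.3]                     # classical probabilities, summing to 1

# Integrate the ket-tensor-bra products against the measure.
U = sum(w * np.outer(u, u.conj()) for w, u in zip(weights, (u1, u2)))

assert np.allclose(U, U.conj().T)            # hermitian
assert abs(np.trace(U).real - 1.0) < 1e-12   # trace 1
assert not np.allclose(U @ U, U)             # not idempotent: genuinely mixed

# Probabilities are computed exactly as for a definite state vector:
P = np.diag([1.0, 0.0])                  # projector for some set of values
prob = np.trace(U @ P).real              # comes to 0.7×1 + 0.3×0.5 = 0.85
assert abs(prob - 0.85) < 1e-12
```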

The decomposition of the identity into a sum of commuting projectors commuting with our observable can, alternatively, be regarded as a (generalised) probability measure on the space within which the observable's values lie. The values taken by the measure lie in a commuting algebra of hermitian projectors. If the values taken by the observable do lie in some vector space (which will, naturally, be real given a hermitian observable) then the expected value of this probability measure comes to the observable itself. However, all other aspects of its use are liberated from dependence on the vectorial nature of the values taken by the observable.

Consequently, we can describe the position observable on a smooth manifold
as a probability measure on that manifold, taking values in the space of
hermitian projectors on the Hilbert space via which we describe our quantum
mechanical states. You can, roughly, think of this as a probability density
for the position, albeit the probabilities delivered are not real
numbers between 0 and 1. Such numbers may be obtained, however, by
contracting the projector delivered with the unit-trace hermitian operator
describing the state of the system under study and taking the trace of the
product. For any measurable subset of the smooth manifold, the measure yields
a projector; the whole manifold is measurable and its projector is the
identity on our space S of states. Any subset of the manifold in which the
particle definitely isn't yields the zero projector. Any measurable
(e.g. smooth and bounded) mapping, f, from the manifold to some fixed linear
space U may be integrated using the measure to produce a linear map
(U⊗S: |S) which may be contracted with the hermitian operator
describing the system state to yield an expected value, in U, of f.
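
The following sketch (Python with numpy) discretises the manifold to four sample points and takes the position eigenbasis as basis of H, so the measure sends a measurable subset to the diagonal projector selecting that subset's points; the points, the map f and the state are all invented for illustration:

```python
import numpy as np

points = np.array([0.0, 1.0, 2.0, 3.0])  # stand-in "manifold" samples
n = len(points)

def P(subset):
    """Projector delivered by the measure for a subset of point-indices."""
    proj = np.zeros((n, n))
    for i in subset:
        proj[i, i] = 1.0
    return proj

assert np.allclose(P(range(n)), np.eye(n))   # whole manifold -> identity
assert np.allclose(P([]), np.zeros((n, n)))  # impossible region -> zero

# A state, and its unit-trace hermitian U.
s = np.full(n, 0.5)
U = np.outer(s, s.conj())

# Integrate f(m) = m (a map from the manifold to a fixed linear space)
# against the measure, then contract with U to get f's expected value.
expected = sum(points[i] * np.trace(U @ P([i])).real for i in range(n))
assert abs(expected - 1.5) < 1e-12
```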

We have S = {system states} represented as the unit sphere in a Hilbert space H with hermitian metric encoded by antilinear iso (dual(H)| g |H); we can do linear algebra in H and thus, to a certain degree, in S. We have a smooth manifold M; positions are points of M and not amenable to linear algebra.

An observable Q taking values in some fixed linear space V is encoded as a linear map (V⊗H: Q |H) for which Q(u)·g·v and Q(v)·g·u are mutually conjugate (when V is {scalars} this is equivalent to g(Q(u),v) = g(u,Q(v)) but, more generally, Q's outputs aren't in H for g to accept as inputs) for each u, v in H. Eigenvalues of Q are the v in V for which there is some non-zero h in H with Q(h) = v×h; in such a case, h is an eigenvector of Q with eigenvalue v. It is possible to construct a basis b of H, whose members are all eigenvectors, that unit-diagonalises g. If the dual basis of dual(H) is p, with each p(i)·b(j) = 1 if i = j, else 0, then sum(: b(i)×p(i) ←i :) is the identity on H; if conjugation of scalars is * then each *∘p(i) is an antilinear ({scalars}: |H) and sum(: p(i)×(*∘p(i)) ←i :) = g, so that each p(i) is in fact g(b(i)). Let (V: e :) give the eigenvalues of the b(i), so Q(b(i)) = e(i)×b(i) for each i; then Q = sum(: e(i)×b(i)×p(i) ←i :).

For each distinct output v of e we have ({v}: e |) as the set of basis-indices of basis members with v as eigenvalue; for other v in V, this set is empty. Define, for each v in V, h(v) = sum(: b(i)×p(i) ←i :({v}:e|)); this is the identity on the sub-space of H on which Q has v as eigenvalue; it is, equally, the projection mapping from H to this sub-space. This defines h as a function ({linear (H:|H)}: |V); when v is not an output of e, h(v) is zero. Each h(v) is idempotent; composing it with itself yields itself. Any two distinct outputs of h have zero composite. Thus the outputs of h all commute with one another; indeed, all sums of outputs of h commute with one another. By grouping the i in (:b|) according to equality of e(i), we can re-write the identity sum(: b(i)×p(i) ←i :) = sum(: h(v) ←v |e) = sum(h), i.e. h's (non-zero) outputs constitute a partition of the identity on H. We can likewise re-state Q = sum(: e×b×p :) as Q = sum(: v×h(v) ←v |e), whence it is easy to show that, modulo the tensor permutation operators needed to give meaning to h(v)∘Q, the composite of each h(v) with Q is, either way round, v×h(v); and, in particular, the same both ways round, hence each output of h commutes with Q. So the (non-zero) outputs of h provide a partition of the identity into commuting projectors that commute with our observable.
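
A numerical check of this construction (Python with numpy; the 3×3 observable, with a deliberately repeated eigenvalue, is invented): grouping the b(i)×p(i) by shared eigenvalue yields projectors that partition the identity and commute with Q, with composite v×h(v) either way round:

```python
import numpy as np

Q = np.diag([2.0, 2.0, 5.0])             # eigenvalue 2 is degenerate
vals, vecs = np.linalg.eigh(Q)

h = {}
for i, v in enumerate(vals):
    key = round(float(v), 9)             # group indices i by equal e(i)
    term = np.outer(vecs[:, i], vecs[:, i].conj())
    h[key] = h.get(key, np.zeros((3, 3))) + term

assert len(h) == 2                       # two distinct eigenvalues
assert np.allclose(sum(h.values()), np.eye(3))   # partition of the identity
for v, proj in h.items():
    assert np.allclose(proj @ proj, proj)        # idempotent
    assert np.allclose(proj @ Q, v * proj)       # composite with Q ...
    assert np.allclose(Q @ proj, v * proj)       # ... same both ways round
```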

Each member of S is represented as a linear combination of our basis, u = sum(: s(i).b(i) ←i :) with sum(: s(i).*(s(i)) ←i :) = g(u,u) = 1. The expected value of Q in this state is then sum(: s(i).*(s(i)).e(i) ←i :) = Q(u)·g(u) and the probability that an observation of Q will yield a value in some set E ⊂ V is sum(: s(i).*(s(i)) ←i :(E:e|)). If we define U = u×g(u) in H⊗dual(H) = {linear (H:|H)} then its trace is just g(u,u) = 1; the trace of Q∘U is Q(u)·g(u), the expected value of Q; and the trace of U·sum(: h(v) ←v |E) is the probability of Q's value being in E. As g(u,u) = 1, U∘U = u×g(u)·u×g(u) = u×1.g(u) = U, so U is idempotent. As g·U = g(u)×*∘g(u) is conjugate-symmetric, U is hermitian (equivalently: each coefficient of a b(i)×p(i) in U is s(i).*(s(i)), hence real). U is thus the unit-trace hermitian projector mentioned above.

If we don't know the actual state in S, only a probability measure μ on S for which state the system is in (this is a classical probability, rather than a quantum one), we'll get μ(: Q(u)·g(u) ←u :S) as the resulting expected value of Q, which is the trace of Q's composite after U = μ(: u×g(u) ←u :S); and we can use this probabilistically blurred U exactly as we used the specific U above to obtain probabilities of Q taking values in particular sub-sets of V, while the sense in which μ is a probability measure is expressed by trace(U) = 1 exactly as before. (It is not clear that it is meaningful, in a quantum context, to include such a classical probability distribution in our discussion; but, if it is, this is what we get; and it fits perfectly sensibly with the foregoing.) The only difference is that, in this case, there is no strong reason to expect U to be idempotent; but it is a unit-trace hermitian (H:|H) which captures all the useful information about our state.

So we express the state of our system not as a member u of H's unit sphere for g but as a unit-trace hermitian (H:U|H) = u×g(u); many members of S may be expressed by the same U, but only U is actually relevant to determining the values of observables (i.e. the s(i) are not observable, but the s(i).*(s(i)) are, at least in principle).

Each subset, E, of V yields sum(: h(v) ←v :E) as the projector that
identifies the sub-space of H spanned by eigenvectors of Q having eigenvalues
in E; in effect, h serves as the density for a probability measure
(albeit with idempotent (H:|H) values as outputs, instead of scalar ones)
whose integral over E is the sum just given. Let P be this probability
measure; the above sum is then P(: 1 ←v |E), a.k.a. P({1}:|E), and Q =
sum(: v×h(v) ←v |V) is simply P(V), the integral of the identity (:
v ←v |V). Composing with the unit-trace hermitian U that represents a
state of the system, trace(U·P({1}:|E)) is the probability of observing
Q in E when the prior state was represented by this U; and
trace(U·P(V)) is the expected value of Q when in this state.

So now let us consider position on the smooth manifold, M. Unlike Q, its values do not fall in a vector space. However, we can still represent the states of our system by members of H and, from such a representation, obtain a unit-trace hermitian (H:U|H) that encodes everything interesting. The only change we need is that, now, we replace (:h|V) and the measure P it induces on V by a measure P on M. For each measurable subset E of M, P({1}:|E) is an idempotent (H:|H) projecting onto the subspace of H consisting of states for which the particle is in E; for two such sub-sets, P({1}:|E)·P({1}:|F) = P({1}:|E∩F), the corresponding projector for the intersection of E and F. For any function (V:Q|M) from M to a linear space V, we can use this measure to obtain P(Q) linear (V⊗H:|H) which we can contract with the unit-trace hermitian U encoding a state to obtain the expected value of Q in that state; in particular, as before, trace(P({1}:|E)·U) gives the probability of observing, when previously in a state encoded by U, the particle's position to be in E.
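
In the same discretised spirit (Python with numpy; the five sample points and the subsets E, F are invented), the intersection property and idempotence can be checked directly:

```python
import numpy as np

n = 5   # sample points standing in for the manifold

def P(subset):
    """Projector the measure delivers for a subset of point-indices."""
    proj = np.zeros((n, n))
    for i in subset:
        proj[i, i] = 1.0
    return proj

E, F = {0, 1, 2}, {2, 3}
assert np.allclose(P(E) @ P(F), P(E & F))    # P(E)·P(F) = P(E∩F)
assert np.allclose(P(E) @ P(E), P(E))        # each P(E) is idempotent
assert np.allclose(P(range(n)), np.eye(n))   # whole manifold -> identity
```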

A chart is a function from a neighbourhood in M to a vector space; we can
extend it to a mapping from the whole of M to that vector space, e.g. by
mapping all of M outside the neighbourhood to the vector space's origin; as
long as the result is reasonably well-behaved (it need not be continuous), the
result shall be integrable, yielding a vector representation of position in
M, albeit one that is only even remotely meaningful when the system is in a
state with negligible probability of the particle lying outside the
neighbourhood covered by the chart. This is the position that orthodoxy has
always effectively used.
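
A sketch of this chart-based representation (Python with numpy; the chart values and the state are invented): the chart is extended by zero outside its neighbourhood, and its integral against the position measure gives a vector-valued position, trusted here only because the state puts no probability outside that neighbourhood:

```python
import numpy as np

# Chart defined on the first two sample points, zero elsewhere.
chart = np.array([10.0, 11.0, 0.0, 0.0])

s = np.array([0.8, 0.6, 0.0, 0.0])        # state supported inside the chart
probs = np.abs(s) ** 2                    # trace(U·P({i})) for each point i

position = float(np.dot(probs, chart))    # integral of the chart; 10.36 here
assert abs(position - 10.36) < 1e-9
```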

Next we must, naturally, consider the momentum observable. Here, again, we
have problems on a smooth manifold: although momentum *is* a vector
quantity, the quantity in question is a tangent vector to our manifold, so has
meaning only at each point; there is no intrinsic way to compare, much less
combine (e.g. average), its values at different positions. We can, of course,
use a chart to reduce positions to vectors and, thus, momenta likewise; this is
inevitably what orthodoxy does, so we need to match orthodoxy up to this point
and then see what we can have, in the manifold's terms, that looks enough like
it to be meaningful.

Momentum is, in quantum mechanics, identified
with a wave co-vector that corresponds
to a gradient of phase, hence also identified
with the differential operator (specifically,
among all the possible differential operators on M, the one that annihilates
the space-time metric), D. There is no intrinsic M-ness in U, to which to
apply D, but the typical observation is now of form
trace(P(something)·U), where the something is at least roughly of
form sum(: x(i)×b(i)×p(i) ←i :) for some mutually dual pair
of bases (H:b:) and (dual(H):p:), with each x(i) vectorial in some sense,
e.g. a mapping from M to some fixed linear space, which may contain M-ness on
which D can act; and this may couple, via our bases, with U's decomposition
via these bases. Archetypically, we use a basis of position eigenvectors, so
that i ranges over M, imposing an M-ness on U and giving it the form U =
integral(: s(m).b(m)×p(m).*(s(m)) ←m |M); then the momentum
operator maps this to (an imaginary scalar multiple of) the trace of
D(u)×g(u) = integral(: D(s(m))×b(m)×p(m).*(s(m)) ←m
|M), so we need P(something) to act on U as something that looks like
this.
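
As a closing sketch (Python with numpy; the grid size, spacing and units with ℏ = 1 are all assumptions), discretising D as a periodic central difference over a grid of position eigenstates gives a hermitian momentum operator whose action on a plane wave matches the gradient-of-phase picture:

```python
import numpy as np

n, dx = 8, 0.5
D = np.zeros((n, n), dtype=complex)
for i in range(n):
    D[i, (i + 1) % n] = 1.0 / (2 * dx)    # central difference with
    D[i, (i - 1) % n] = -1.0 / (2 * dx)   # periodic boundary

p = -1j * D                               # momentum ~ -i·D

assert np.allclose(p, p.conj().T)         # hermitian, as an observable must be

# A plane wave exp(i·k·x) is an exact eigenvector of this discrete p,
# with eigenvalue sin(k·dx)/dx, approximating k when k·dx is small.
k = 2 * np.pi / (n * dx)
x = dx * np.arange(n)
wave = np.exp(1j * k * x) / np.sqrt(n)
momentum = (wave.conj() @ (p @ wave)).real
assert abs(momentum - np.sin(k * dx) / dx) < 1e-12
```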