For any tensor product in which two of the terms pertain to dual spaces, W⊗V⊗X⊗dual(V)⊗Z, there is a natural (i.e. choice-independent) linear map to W⊗X⊗Z, induced by the mutual action of the dual terms. A general member of the initial space is a sum of terms each of which is of form w×v×x×u×z with w, v, x, u and z in W, V, X, dual(V) and Z respectively. The image of such a term under the mutual action is simply (u·v).w×x×z, in which (u·v) is a scalar; the image of a sum of such terms is obtained by summing the images of the individual terms. Since the action on each term respects linearity, the resulting linear extension to the whole space is simply a linear map. This linear map is called trace.
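To see the trace map concretely, here is a minimal sketch in Python (my own illustration, not part of the text's formalism): picking bases for finite-dimensional W, V, X, Z turns a simple tensor w×v×x×u×z into a product of components, and the mutual action of the dual pair becomes a sum over the index V shares with dual(V). All names and values here are hypothetical.

```python
from itertools import product

# Component model: w, x, z are vectors in W, X, Z; v is in V; u is in
# dual(V), so u·v is a scalar.
w, v, x, u, z = [1.0, 2.0], [3.0, -1.0, 2.0], [0.5, 4.0], [2.0, 0.0, 1.0], [1.0, -2.0]

dot = sum(ub * vb for ub, vb in zip(u, v))   # the mutual action u·v

# trace(w×v×x×u×z): contract the V/dual(V) pair by summing their shared index.
traced = {(a, c, e): sum(w[a] * v[b] * x[c] * u[b] * z[e] for b in range(len(v)))
          for a, c, e in product(range(len(w)), range(len(x)), range(len(z)))}

# The result is (u·v).w×x×z, a member of W⊗X⊗Z.
expected = {(a, c, e): dot * w[a] * x[c] * z[e]
            for a, c, e in product(range(len(w)), range(len(x)), range(len(z)))}
assert all(abs(traced[K] - expected[K]) < 1e-12 for K in traced)
```

The assertion confirms, entry by entry, that contracting the dual pair of a simple tensor yields (u·v).w×x×z.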
When none of the vector spaces W, X, Z are dual to one another or involve V, or when they are all the field and our product is just that of V with its dual (which is {linear (V:|V)}), one may fairly use the name trace without ambiguity: given that it is acting on a tensor product in which just one dual pair of factors arises, it is clear which pair it is to contract. However, when those special conditions are not met, it is necessary to have some clear notation for specifying which dual pair of factors in the tensor product is to be contracted.
It is entirely sufficient to write out a factorisation of the tensor product, such as the W,V,X,dual(V),Z one above, and mark the dual pair in some way. To this end, we could name the above linear map trace[W,V*,X,dual(V)*,Z]. It can readily be seen that two trace operations acting on separate terms in the tensor product can be combined, and commute: if Z were dual(X), we could have trace[W,V*,X!,dual(V)*,Z!]. We could legitimately employ this notation with arbitrarily many (distinct) pairs of matched markers (but, in fact, I'll be constructing a quite different denotation shortly, once I've introduced everything it can describe).
For any mapping (T:V|I) with I a well-ordered index set and T some collection of vector spaces, one can construct a vector space, the tensor product of V, bulk(⊗, V), which may be thought of as V(first)⊗…⊗V(last). For any permutation σ of I, we have likewise the vector space bulk(⊗, V∘σ). Now, a typical member of bulk(⊗, V) is a sum of terms each of form bulk(×, v) for some (V(i): v(i) ←i |I), for which bulk(×, v∘σ) is in bulk(⊗, V∘σ). This gives us a natural isomorphism (bulk(⊗, V∘σ): bulk(×, v∘σ) ←bulk(×, v) :bulk(⊗, V)), extended linearly from its given example action. For the moment, call this shuffle(σ), thereby defining a mapping shuffle from permutations to polymorphic linear operators on tensor products.
Now, any ordered finite set is naturally isomorphic (as an ordered finite set) to some natural number and any permutation on a natural number is wholly specified by an ordering of its members. (Thus: [0,1,2,3] is the identity on 4 and [1,2,3,0] is one of the obvious cyclic permutations of it. Any list (:g|4) = [g(0),g(1),g(2),g(3)] can be composed after this to give g∘[1,2,3,0] = [g(1),g(2),g(3),g(0)], e.g. [a,b,c,d]∘[1,2,3,0] = [b,c,d,a]. Thus a permutation, (:p|n), acts on other lists of the same length as g∘p ←(:g|n).) Consequently, we could (but, as for trace, I'll actually discard this in favour of something more general, below) write a permutation's action on, say, U⊗V⊗W in the form:
permute[U2,V0,W1] = (V⊗W⊗U: v×w×u ← u×v×w :U⊗V⊗W)
In general, a tensor space may be dealt with in terms of various factorisations (e.g. a consecutive pair of factors U⊗dual(V) might sometimes be treated as a single {linear (U:|V)} factor), so it is necessary to have shuffle's input list encode both the factorisation being used and the re-ordering to apply; hence [U2,V0,W1] instead of simply [2,0,1].
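The shuffle maps above can also be sketched on components (finite dimensions, chosen bases; the function name and values are my own, hypothetical): a permutation σ sends the input entry at index tuple K to the output entry at K∘σ, exactly mirroring bulk(×, v∘σ) ← bulk(×, v).

```python
from itertools import product

def shuffle(sigma, shape, T):
    # bulk(×, v) ↦ bulk(×, v∘σ): the entry at index tuple K moves to K∘σ.
    return {tuple(K[s] for s in sigma): T[K]
            for K in product(*(range(d) for d in shape))}

# A simple product u×v×w with U, V, W of dimensions 2, 3, 2:
u, v, w = [1.0, 2.0], [3.0, 5.0, 7.0], [0.5, -1.0]
T = {(a, b, c): u[a] * v[b] * w[c]
     for a, b, c in product(range(2), range(3), range(2))}

# σ = [2,0,1] carries U⊗V⊗W to W⊗U⊗V, sending u×v×w to w×u×v.
S = shuffle([2, 0, 1], (2, 3, 2), T)
assert all(abs(S[(c, a, b)] - w[c] * u[a] * v[b]) < 1e-12
           for c, a, b in S)
```

The assertion checks that the shuffled simple tensor really is w×u×v, i.e. the original list of vectors composed with σ.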
It remains to note that trace and permute may be combined: we can, if we wish, trace out some components of a tensor product while permuting the remainder. Extending the example used for tracing, we could now define trace-permute[W1,V*,X2,dual(V)*,Z0] = (Z⊗W⊗X: (u·v).z×w×x ← w×v×x×u×z :W⊗V⊗X⊗dual(V)⊗Z). This trace-permute operator, albeit differently written, will show up plentifully in the course of my analysis of tensor bundles: it corresponds with shuffling the order of indices (permutation) and summing over repeated indices (tracing) in orthodox notations for physics.
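A component-level check of this trace-permute (the same hypothetical component model as before: vectors as lists of components): contract the dual pair by summing its shared index, and deliver the surviving factors in the order Z, W, X.

```python
from itertools import product

w, v, x, u, z = [1.0, 2.0], [3.0, -1.0, 2.0], [0.5, 4.0], [2.0, 0.0, 1.0], [1.0, -2.0]

# trace-permute[W1,V*,X2,dual(V)*,Z0] on the simple tensor w×v×x×u×z:
out = {(e, a, c): sum(w[a] * v[b] * x[c] * u[b] * z[e] for b in range(len(v)))
       for e, a, c in product(range(len(z)), range(len(w)), range(len(x)))}

# It agrees with (u·v).z×w×x, as claimed.
dot = sum(ub * vb for ub, vb in zip(u, v))
assert all(abs(out[(e, a, c)] - dot * z[e] * w[a] * x[c]) < 1e-12
           for e, a, c in out)
```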
The use of things like W1 and V* as entries in lists is uncomfortably informal – they arise from the need to declare the tensor-factorisation to be applied, i.e. both the rank on which trace-permute is acting and the particular interpretation of it to which to apply the tracing and permutation. Each of our spaces W, X, Z might itself be a tensor product of spaces, some of which might involve V and its dual; the space we've been referring to as W⊗V⊗X⊗dual(V)⊗Z could be expressible as a match to this pattern in several ways: each of which will give meaning to trace-permute([1,*,2,*,0]), and each a different meaning. To get round this informality/ambiguity, I now introduce a general mapping τ which takes two lists, of equal length, and yields an unambiguous linear map.
The first argument to τ describes the permutation and tracing to be performed; the second is a list of linear spaces describing the tensor factorisation to be used on inputs. The first argument's entries must be symbols (which can be anything that doesn't have meaning as a natural number), each appearing in a pair, and natural numbers, each appearing exactly once; every natural smaller than any natural that appears must also appear, so the naturals used are exactly the members of some natural n. Where the first argument uses a symbol, the second argument's two entries in the positions that symbol occupies in the first must be linear spaces dual to one another. Thus:
e in s and {i,j} = ({e}:s|) implies V(i) and V(j) are mutually dual
Thus our earlier example, (Z⊗W⊗X: (u·v).z×w×x ← w×v×x×u×z :W⊗V⊗X⊗dual(V)⊗Z), is τ([1,*,2,*,0],[W,V,X,dual(V),Z]). Note that, for any suitable list s, τ(s) is a mapping from suitable lists of linear spaces to linear maps from the tensor product of the listed spaces to a suitably derived tensor space. Formally:
τ's first argument must come from a suitable collection of lists: τ accepts a list s precisely if, for some natural n and symbol set I, (|s:) is the union of n and I, with ({m}:s|) a singleton for each m in n and ({i}:s|) a doubleton for each i in I – that is, the numbers appear once each, the symbols twice each. Note that each ({m}:s|) is a singleton iff the restriction (n:s:) is one-to-one, which gives us r = reverse(n:s:) with s∘r = n.
More or less anything will do as a symbol set, so long as none of its members are natural numbers: I've mostly used symbols (such as the * above) but one can equally use letters (so long as the letter isn't in use as a name in some pertinent context, of course).
τ(s), in such a case, will accept as input any list ({linear spaces}:V|(:s|)) for which, whenever distinct i, j in (:s|) have s(i) = s(j), the corresponding V(i) and V(j) are dual to one another. The list V∘r consists of the V(i) for which s(i) is in n, so not in I, shuffled into the order indicated by the positions of n's members in s.
τ(s, V) is then a linear map (bulk(⊗, V∘r): |bulk(⊗,V)): its action on bulk(×,v) yields the product of: for each symbol e in I, with {i,j} = ({e}:s|), the scalar v(i)·v(j); and the simple tensor bulk(×, v∘r).
This equips us with a suitably powerful sledgehammer to describe all possible trace-permutation operators on linear spaces, using only generic notational forms. It remains that it is somewhat ugly and frequently more verbose than we might wish for; in particular, it often obliges one to reiterate things that are likely obvious from context.
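As a concrete check on this machinery, here is a sketch of τ on component arrays (assuming finite-dimensional spaces with chosen bases; the dict-of-index-tuples representation and the name `tau` are mine, not the text's formalism): numbered entries give each surviving factor's output position, and a repeated symbol marks a dual pair whose shared index is summed over.

```python
from itertools import product

def tau(s, shape, T):
    """Sketch of τ(s, V): s mixes output positions (naturals, once each)
    with symbols (twice each, marking a dual pair to contract)."""
    numbers = sorted(e for e in s if isinstance(e, int))
    symbols = set(e for e in s if not isinstance(e, int))
    slot = {n: s.index(n) for n in numbers}               # output position -> input position
    pairs = [tuple(i for i, e in enumerate(s) if e == y) for y in symbols]
    out_shape = tuple(shape[slot[n]] for n in numbers)
    out = {J: 0.0 for J in product(*(range(d) for d in out_shape))}
    for K in product(*(range(d) for d in shape)):
        if all(K[i] == K[j] for i, j in pairs):           # tracing keeps diagonal terms
            out[tuple(K[slot[n]] for n in numbers)] += T[K]
    return out

# Reproduce the running example, τ([1,*,2,*,0], [W,V,X,dual(V),Z]):
w, v, x, u, z = [1.0, 2.0], [3.0, -1.0, 2.0], [0.5, 4.0], [2.0, 0.0, 1.0], [1.0, -2.0]
shape = (2, 3, 2, 3, 2)
T = {K: w[K[0]] * v[K[1]] * x[K[2]] * u[K[3]] * z[K[4]]
     for K in product(*(range(d) for d in shape))}
out = tau([1, '*', 2, '*', 0], shape, T)
dot = sum(ub * vb for ub, vb in zip(u, v))
assert all(abs(out[(e, a, c)] - dot * z[e] * w[a] * x[c]) < 1e-12
           for e, a, c in out)
```

The final assertion confirms the output is (u·v).z×w×x, matching the example that introduced trace-permute.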
Because τ(p,V) knows that it acts on bulk(⊗,V), we can unambiguously extend its action to bulk(⊗,V)⊗Anything by mapping any sum of terms, each of form w×a with w in bulk(⊗,V), to the corresponding sum of terms of form τ(p,V,w)×a. Of course, when w×a is met as such, this is no gain: but when we are dealing with some member of bulk(⊗,V)⊗Anything for which we haven't introduced such a factorisation (or, generally, a decomposition as a sum of products), it helps to be able to apply τ(p,V) to the thing itself rather than making the decomposition explicit in order to express the action of τ(p,V). Example: the Riemann tensor is in bulk(⊗, [G,G,G,T]) and antisymmetric under the associated τ[1,0,2,3]; it can be written as a sum of terms of form (g^h)×f×t in which each g^h is antisymmetric under τ([1,0],[G,G]); since the permutation only changes the order of the leftmost factors, it seems superfluous to waste space mentioning the unaffected factors; so I want to be able to write τ([1,0],[G,G],R) = −R. The present tacit extension allows that.
In many situations the relevant factorisation, i.e. τ's second argument, is implicitly evident from context. In such cases, I'll write τ[1,*,0,*,2], eliding a pair of (…) curved brackets and the implicit factorisation of the space, as a short form for the result of giving τ the indicated permutation-with-symbols list and the implicit list of spaces factorising whatever the result is to be applied to.
One can recover a close semblance of Einstein's relativistic notation, in the case where all tensor spaces are given to be factorised in terms of some given dual pair of spaces G and T (in relativistic notation, G shall be the gradients, a.k.a. covectors or covariant vectors, while T shall be the tangents, a.k.a. (contravariant) vectors). In such a context, the result of applying τ(σ, ({G,T}:F:)) to a tensor-valued expression is expressed by writing out σ's entries, after the expression, as subscripts and superscripts (with intervening space but no punctuation): if the corresponding entry in F is G, the entry in σ is written as a subscript; otherwise, the entry in F is T and the entry in σ is written as a superscript. It is usual, when using this form, to use letters for the symbols in σ. The subscripts and superscripts are known, collectively, as indices. For example,
Riemann_{0 1 2}^{3} is just a long-winded way of writing Riemann, applying the identity permutation and no tracing, but making clear that its rank is [G,G,G,T].
Note that, unlike relativistic orthodoxy, no basis is tacitly implicated, with which to take co-ordinates, and the natural (i.e. non-symbol) indices are in no sense related to the dimension of the manifold (they run from zero to one short of the rank of the result tensor). I also make no provision for lowering or raising indices – used by orthodoxy as a short-hand for contracting with the metric or its inverse – as I firmly believe that the metric is an important tensor which should not be hidden from view behind syntactic sugar. I likewise don't provide for comma or semicolon to separate indices due to differential operators; the operator itself should appear explicitly and the relevant index can adorn it. Thus applying the differential operator D to a tensor F^{0 1} (i.e. of rank T⊗T) and tracing F's second factor with the factor D adds can be written D_{i}F^{0 i} without need of punctuators.
It is, however, natural to carry over orthodoxy's notation for symmetrisation and antisymmetrisation (defined below) among indices; this gives special meaning to parentheses (for symmetrisation) and brackets [for antisymmetrisation] in the subscripts or superscripts of a tensor expression. For these purposes, subscripts and superscripts are considered separately – it is only possible to (anti-)symmetrise over tensor factors corresponding to the same space, either G (subscripts) or T (superscripts); but it is entirely possible to (anti-)symmetrise over some factors of one type at the same time as (anti-)symmetrising over some factors of the other type. Thus F_{(0 1)}^{[2 3]} tells us that F is of rank G⊗G⊗T⊗T and symmetrises over its G factors while antisymmetrising over its T factors; while F_{(0 2)}^{[3 1]} would say the same about F's rank and perform the same symmetrisation and antisymmetrisation on F, but would further permute its tensor rank to G⊗T⊗G⊗T, cycling the last three factors. Equally, one may apply (anti-)symmetrisation to two ranges of subscripts or superscripts, provided they do not overlap; for example, Riemann_{[0 1] [2}^{a}·g_{3] a} contracts Riemann's last tensor rank with g's second and antisymmetrises the result both over its first two ranks and over its last two ranks. Enclosing indices in this way when introducing a tensor can serve to alert the reader to the fact that the tensor is unchanged by the given (anti-)symmetrisation.
It also makes sense to provide for use of the index notation in conjunction with the division operators / (whose right operand is inverted) and \ (whose left operand is inverted). In such expressions, the inverted tensor has factors of T and G swapped from its uninverted original, so subscripts and superscripts are interchanged; although the metric g_{(0 1)} is of rank G⊗G, it contributes superscripts (i.e. factors of T) when inverted, as in Riemann_{[0 1] a}^{[3}/g^{2] a}, where g's inverse's second tensor factor is contracted with Riemann's third factor, leaving g's inverse's first tensor factor to be anti-symmetrised with the final factor of Riemann and, in the process, permuted into the position of its previously-contracted factor. (As it happens, both antisymmetrisations here are fatuous, thanks to properties of the Riemann tensor.)
Where a tensor's rank includes several factors of some common base rank, permutations of the tensor's rank which only interchange these factors all yield values of the same rank as the original, hence compatible with it for addition or subtraction.
In the simplest case, a tensor of rank V⊗V, when acted on by τ[1,0], yields a (generally different) tensor of the same rank; if we add this to the original (or take their average) we obtain a tensor of the same rank that is invariant under τ[1,0]. Alternatively, if we subtract the permuted result from the original (and optionally halve, by analogy with taking their average), we obtain a tensor on which τ[1,0] acts as negation. For such a simple second-rank tensor, the original tensor can be recovered from these two results, so that every V⊗V tensor can be written as a sum of two terms, one invariant under τ[1,0], the other negated by it.
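For instance, in components (a sketch with a hypothetical matrix of my choosing): a V⊗V tensor held as a square matrix has τ[1,0] acting as transposition, and the two parts are the familiar symmetric and antisymmetric halves.

```python
# A V⊗V tensor as a 3×3 component matrix (hypothetical values);
# τ[1,0] acts on components as transposition.
M = [[1.0, 2.0, 0.0],
     [4.0, 3.0, -1.0],
     [5.0, 6.0, 2.0]]
n = len(M)
swapped = [list(row) for row in zip(*M)]          # τ[1,0] applied to M
sym  = [[(M[i][j] + swapped[i][j]) / 2 for j in range(n)] for i in range(n)]
anti = [[(M[i][j] - swapped[i][j]) / 2 for j in range(n)] for i in range(n)]

# sym is invariant under τ[1,0], anti is negated by it, and M is their sum.
assert all(sym[i][j] == sym[j][i] for i in range(n) for j in range(n))
assert all(anti[i][j] == -anti[j][i] for i in range(n) for j in range(n))
assert all(sym[i][j] + anti[i][j] == M[i][j] for i in range(n) for j in range(n))
```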
Although the general case may interleave a repeated factor, V, with other factors to make a tensor rank, it is always possible to permute the tensor rank to put all the repeated factors at the left; this yields a tensor whose rank is bulk(⊗, ({V}:|n))⊗Something, for some n, and the Something is uninvolved in the permutations of the various factors of V, so can be ignored for the present discussion; the results of that discussion can, using the inverse of the original permutation, be converted back to the original rank. It thus suffices to discuss tensor ranks of form bulk(⊗, ({V}:|n)), always bearing in mind that this may be combined with other tensor factors uninvolved in the discussion. Likewise, although a tensor rank's factorisation may include more than n copies of V, we are always at liberty to treat some of these as uninvolved and attend only to some sub-set of the copies. In the following, I thus restrict attention to the case where all the tensor factors are the same and so may all be implicated in the permutation.
When we have n copies of a space V in our tensor factorisation, there are n! permutations that interchange only these copies of V (where 0! = 1 and (1+i)! = i!.(1+i) for each natural i; so 1! = 1, 2! = 2, 3! = 6, 4! = 24 and so on; n! grows faster than exponentially with n). Given a tensor of the relevant rank, we can apply each of these permutations to it; in general, each produces a different value; we can combine these by scaling and summing in any way we like. Aside from any over-all scaling (applied equally to all terms in the sum), we may reasonably incorporate a factor in each term derived from the permutation used to produce it. Since the permutations of a set form a group (under composition), it is natural to derive such a factor in some way that respects this group structure, embedding it in the multiplicative structure of scaling.
As it happens, there are exactly two (when n > 1; else only one) group homomorphisms from permutations to a multiplicative group in which a.a = 1 implies a in {±1}; one maps every permutation to 1, the other maps half of them (those expressible as a composite of an odd number of two-item swaps) to −1 and the other half to 1. The latter is called sign, the signature homomorphism. (In the case where n ≤ 1, i.e. n in {0, 1}, we can still define sign; however, the only available permutation is the identity, which necessarily has signature 1, so sign is identical to the map-all-to-one homomorphism.)
There are, consequently, exactly two natural ways to combine all permuted variants of a tensor, give or take over-all scalings; either apply each available permutation and sum the results; or apply each permutation, multiply by the signature of the permutation and sum the results. The former, regardless of input, will yield a tensor invariant under all of the permutations used. The output of the latter, in contrast, when permuted, will be negated or invariant according as the permutation used has signature −1 or +1; because sign is a homomorphism, applying a permutation has the effect of scaling by that permutation's signature. Thus, if we apply the signature-scaled sum over permutations to its own output, as re-input, each permutation gets one factor of its sign (because that's the scaling we used) times a second (because that's the effect the permutation has on the tensor on which it's acting) and hence just contributes the re-input value towards the final sum. The without-sign sum over permuted values likewise garners its re-input value once per permutation, when given its own output as re-input. Thus taking the average over permutations, instead of simply summing, will cause each permuting combination to be idempotent – that is, to act as the identity on its own outputs.
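These two combinations can be checked numerically. The following sketch (the same hypothetical component-array model; function names are mine) builds both, then verifies that averaging over permutations makes each idempotent and that a single swap negates the signed combination's output.

```python
from itertools import permutations, product
from math import factorial

def permute_factors(p, T):
    # shuffle for identical factors: the input entry at K goes to output entry K∘p.
    return {tuple(K[i] for i in p): val for K, val in T.items()}

def sign(p):
    # signature: parity of the number of inversions in the list.
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def combine(T, n, dim, signed):
    # average of all n! permuted variants, optionally weighted by signature.
    out = {K: 0.0 for K in product(range(dim), repeat=n)}
    for p in permutations(range(n)):
        P = permute_factors(p, T)
        for K in out:
            out[K] += (sign(p) if signed else 1) * P[K]
    return {K: val / factorial(n) for K, val in out.items()}

# Three copies of a 3-dimensional V; input a simple tensor u×v×w:
u, v, w = [1.0, 2.0, 3.0], [1.0, 3.0, 5.0], [2.0, 5.0, 9.0]
T = {(a, b, c): u[a] * v[b] * w[c] for a, b, c in product(range(3), repeat=3)}
sym_T = combine(T, 3, 3, signed=False)
anti_T = combine(T, 3, 3, signed=True)

# Each combination is idempotent on its own output...
assert all(abs(combine(sym_T, 3, 3, False)[K] - sym_T[K]) < 1e-12 for K in sym_T)
assert all(abs(combine(anti_T, 3, 3, True)[K] - anti_T[K]) < 1e-12 for K in anti_T)
# ...and a single two-item swap negates the signed combination's output.
swapped = permute_factors((1, 0, 2), anti_T)
assert all(abs(swapped[K] + anti_T[K]) < 1e-12 for K in anti_T)
```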
Since outputs of the without-sign combination are unchanged by, or symmetric under, interchange of tensor factors, we term this combination symmetrisation. Swapping a pair of tensor factors, in contrast, always changes the sign of an output of the with-sign combination; changing the sign produces an opposite value, so this is termed anti-symmetrisation. We thus define (using linear extension to induce action on a tensor space given action on simple products in that space) functions
sym({U}:|n) = sum(: τ(p, ({U}:|n)) ←p :{iso (n:|n)})/n!
antiSym({U}:|n) = sum(: sign(p).τ(p, ({U}:|n)) ←p :{iso (n:|n)})/n!
the only difference in which is the sign(p) term in the latter. Now – for possibly distinct naturals n, m and some given linear space U – examine
antiSym({U}:|n)∘antiSym({U}:|m) = sum(: sum(: sign(s).sign(t).τ(s, ({U}:|n))∘τ(t, ({U}:|m)) ←t :{iso (m:|m)}) ←s :{iso (n:|n)})/n!/m!
We can extend s and t (in practice, only one of them) to have equal (:|), say N = max(n,m) = n∪m: if N = n+h, extend s to s∪(N: n+i ←n+i :N) a.k.a. s extended with the identity; similarly for t. This doesn't change sign(s) or sign(t), and (because sign is a homomorphism) makes
sign(s).sign(t).τ(s, ({U}:|N))∘τ(t, ({U}:|N)) = sign(s∘t).τ(s∘t, ({U}:|N))
At least one of n, m is N – say n=N, so s ranges over all of {iso (N:|N)}. Any p in {iso (N:|N)}, including our (possibly extended) t, is invertible, so ({iso (N:|N)}| s∘p ←s |{iso (N:|N)}) is one-to-one, making sum(: formula(s∘p) ←s :{iso (N:|N)}) the same thing (provided addition is Abelian) as sum(: formula(s) ←s :{iso (N:|N)}) for any formula. Consequently, if n=N, our summed quantity is independent of t, so that summation over {iso (m:|m)} just gives m! copies of each value, exactly cancelling the /m! and leaving us with sum(: sign(s).τ(s, ({U}:|N)) ←s :{iso (N:|N)})/N!, which is just antiSym({U}:|N). We get exactly the same sum, using the name t in place of s, when (instead of n=N) we have m=N, making s redundant. Hence
antiSym({U}:|n)∘antiSym({U}:|m) = antiSym({U}:|N) with N = max(n, m).
For any (U:u|n) we have antiSym(({U}:|n), bulk(×, (U:u|n))) which (especially when n is small) is also written as u(0)^…^u(n−1), so antiSym(({U}:|2),a×b) = a^b = (a×b −b×a)/2. Similarly with n=3, antiSym(({U}:|3),a×b×c) = a^b^c. The space of wholly antisymmetric n-tensors over U is (| antiSym({U}:|n) :bulk(⊗,({U}:|n))), which I'll also write U∧U∧…∧U with n appearances of U in it. We've just seen that each antiSym({U}:|m) with m≤n acts as the identity on this space: each member, x, is antiSym(({U}:|n),w) for some w in bulk(⊗, ({U}:|n)) so antiSym(({U}:|m),x) = (antiSym({U}:|m)∘antiSym({U}:|n))(w) = antiSym({U}:|n)(w) = x.
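As a last concrete check (components again; the vectors are hypothetical): a^b = (a×b − b×a)/2 is negated by τ[1,0], so antisymmetrising it again changes nothing, and its diagonal entries vanish (whence a^a = 0).

```python
from itertools import product

a, b = [1.0, 2.0, 3.0], [0.0, -1.0, 4.0]

# a^b = antiSym(({U}:|2), a×b) = (a×b − b×a)/2, componentwise.
wedge = {(i, j): (a[i] * b[j] - b[i] * a[j]) / 2
         for i, j in product(range(3), range(3))}

# τ[1,0] negates it, so antiSym (the average of the identity and the
# negated swap) returns it unchanged: a^b is its own antisymmetrisation.
again = {(i, j): (wedge[(i, j)] - wedge[(j, i)]) / 2 for i, j in wedge}
assert all(abs(again[K] - wedge[K]) < 1e-12 for K in wedge)
assert all(abs(wedge[(i, i)]) < 1e-12 for i in range(3))
```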
Written by Eddy.