Unitary Transformations

Since they show up as handy models of various Lie Groups, I should discuss what are known as unitary transformations. These presume a complex linear space, V, and an anti-linear map, g, from V to its dual – so g's outputs are linear maps from V to (complex) scalars. I discuss the relevant conjugate-transposition of antilinear maps, T, and inner product operations elsewhere; T(g,u,v) and g(v,u) are mutually conjugate for any u, v in V and g is described as symmetric precisely if g = T(g), i.e. g(u,v) and g(v,u) are mutually conjugate for any u, v in V. On linear maps, T is defined simply as transposition (without the conjugation), with the result that T(f∘h) = T(h)∘T(f) for any linear or antilinear h and (:f|h). I shall not, for the present at least, assume that g is positive definite; I shall mainly focus on the case where it is symmetric and invertible.

A linear map (V: h |V) is described as Unitary for g, or g-unitary, if g(h(u),h(v)) = g(u,v) for all u, v in V. Construing g as describing a metric specifying a geometry, this is simply an isometry – a transformation which doesn't change the geometry. Now, g(h(u),h(v)) = h(v)·g(h(u)) = T(h, g(h(u)), v) = (T(h)∘g∘h)(u, v), so we can re-write the specification as: h is g-unitary iff

T(h)∘g∘h = g

When h and g are invertible, this can be re-written as reverse(g)∘T(h)∘g = reverse(h), i.e. h's inverse is its g-conjugate. However, the specification above, while implying this for the invertible case, is more generally applicable. When g is symmetric T(h)∘g = T(h)∘T(g) = T(g∘h), so h is g-unitary precisely if T(g∘h)∘h = g; in the invertible case, this can be re-written as T(g∘h) = g∘reverse(h).
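As a rough numerical illustration of the defining condition, here is a minimal sketch in Python (standard library only). It assumes g is represented by a square matrix G, with g(u,v) being the sum of *u(a).G(a,b).v(b), so that the condition reads dagger(H)·G·H = G for the matrix H of h. The helper names (matmul, dagger, close, is_g_unitary) and the sample matrices are illustrative choices, not anything the discussion above depends on.

```python
import math


def matmul(a, b):
    """Multiply two square matrices held as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry, within tol."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def is_g_unitary(h, g, tol=1e-12):
    """Check dagger(H) G H = G, the assumed matrix form of g-unitarity."""
    return close(matmul(dagger(h), matmul(g, h)), g, tol)


standard = [[1, 0], [0, 1]]       # positive-definite form
indefinite = [[1, 0], [0, -1]]    # symmetric and invertible, but not definite
rotation = [[math.cos(0.3), -math.sin(0.3)],
            [math.sin(0.3), math.cos(0.3)]]
boost = [[math.cosh(0.7), math.sinh(0.7)],
         [math.sinh(0.7), math.cosh(0.7)]]

print(is_g_unitary(rotation, standard))   # an ordinary rotation, tested against the standard form
print(is_g_unitary(boost, indefinite))    # a hyperbolic rotation, tested against the indefinite form
print(is_g_unitary(boost, standard))      # the same hyperbolic rotation, tested against the standard form
```

The indefinite example is there only to show that the condition makes sense without positive-definiteness, as the text allows.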

We could equally use the same equation for antilinear h; however, it then suffices that there be even one antilinear g-unitary isomorphism (V:c|V), upon which each antilinear g-unitary h is the composite of c with some linear g-unitary (V:|V), namely the result of composing h with c's inverse (which is g-unitary, as I'll show shortly); c thus induces an isomorphism between the g-unitary linears and antilinears. Since a composite of two antilinears is linear, rather than antilinear, and I want to compose unitary mappings, the linear g-unitary mappings present themselves as the more convenient of the pair of isomorphic notions.

Group

Suppose (V:h|V) and (V:f|V) to be g-unitary, so T(h)∘g∘h = g = T(f)∘g∘f, and consider e = f∘h giving

T(e)∘g∘e
= T(f∘h)∘g∘f∘h
= T(h)∘T(f)∘g∘f∘h
= T(h)∘g∘h
= g

so that any composite of g-unitary mappings is g-unitary. Note that the same chain of equations would apply if either or both of f and h were antilinear, justifying my earlier claim that a composite of antilinear unitaries is a linear unitary; indeed, any composite of unitaries will be unitary, and either linear or antilinear according as an even or odd number of the unitaries composed were antilinear.

Suppose (V:f|V) is g-unitary, so T(f)∘g∘f = g. When f is a mapping, f∘reverse(f) is an identity; so g∘reverse(f) = T(f)∘g. Applying T to both sides, T(reverse(f))∘T(g) = T(g)∘f, whence T(reverse(f))∘T(g)∘reverse(f) = T(g). Thus reverse(f) is T(g)-unitary, provided only that it is a mapping; when g is symmetric, the inverse of an invertible g-unitary is g-unitary. Furthermore, if g is monic (i.e. non-singular, so reverse(g) is a mapping) then every g-unitary mapping is non-singular, hence invertible. [Proof: if f maps u and v to the same value then f(u−v) is zero; T(f)∘g respects addition so maps zero to zero; if f is g-unitary, this says that T(f)∘g∘f = g maps u−v to zero; but g(zero) is zero and g is given to be monic, implying u−v = zero, whence u = v; QED.] Thus, for invertible (dual(V):g|V), all g-unitary maps are invertible; if g is also symmetric, all g-unitary maps thus have g-unitary inverses.

It's trivial that the identity is unitary; we have inverses and closure under composition; so we can duly infer that, for each symmetric invertible g, the g-unitary maps form a group. Hereafter, I shall presume that g is symmetric and invertible.
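The group properties can be spot-checked numerically. The following Python sketch (standard library only) assumes the standard positive-definite form in two dimensions, for which the g-conjugate of a matrix is its conjugate transpose; the sample maps f and h and all helper names are illustrative assumptions.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two square matrices held as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry, within tol."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


identity = [[1, 0], [0, 1]]


def is_unitary(h, tol=1e-12):
    """Unitarity for the standard form: dagger(H) H is the identity."""
    return close(matmul(dagger(h), h), identity, tol)


# Two sample unitaries: a real rotation and a diagonal matrix of phases.
f = [[math.cos(0.3), -math.sin(0.3)],
     [math.sin(0.3), math.cos(0.3)]]
h = [[cmath.exp(0.4j), 0],
     [0, cmath.exp(-1.1j)]]

composite = matmul(f, h)   # closure: the composite should again be unitary
inverse_f = dagger(f)      # inverse: the g-conjugate of f

print(is_unitary(composite), is_unitary(inverse_f))
print(close(matmul(f, inverse_f), identity))
```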

In the special case where V is {lists ({complex}:|n)} for some natural n (this is the standard n-dimensional complex vector space) and g is the standard positive-definite form defined by

g(u,v) = sum(: v(m).*u(m) ← m |n)

the group of g-unitary transformations of V is denoted U(n). In particular, the group of complex phases, {complex k: k.*k = 1}, interpreted as scalings, serves as a canonical representation of U(1).

Eigenvalues and eigenvectors

Suppose h(u) = k.u and h(w) = m.w; if h is g-unitary, whence g(u) is T(g∘h, h(u)),

g(u,w)
= T(g∘h, h(u), w)
= *g(h(w),h(u))
= *g(m.w,k.u)
= *((*m).k.g(w,u))
= m.(*k).*g(w,u)

(using the fact that g is antilinear in its first input and linear in its second) which, for symmetric g, is just m.(*k).g(u,w); whence either g(u,w) is zero or m.(*k) is 1 (equivalently, on conjugating, k.(*m) is 1). In particular, with w = u, any non-null (i.e. g(u,u) is non-zero) eigenvector of h has a pure phase as its eigenvalue, k.(*k) = 1; and is g-orthogonal (i.e. g(u,w) = 0) to any eigenvector of h with a different eigenvalue.

Thus, at least when g is positive-definite (or negative-definite; either way, g(u,u) can only be 0 if u is 0) and symmetric, a g-unitary map has pure phases as its eigenvalues, and eigenvectors with different eigenvalues are g-orthogonal. Consequently, if a unitary mapping is diagonalisable, its diagonal entries will all be phases. As a result, its determinant must also be a phase.
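A small numerical check of these claims, as a sketch in Python (standard library only), for the standard form in two dimensions. The sample matrix h is an arbitrary unitary chosen for illustration, eigen2 is a hypothetical helper that solves the characteristic quadratic directly, and it assumes the top-right entry of the matrix is non-zero.

```python
import cmath
import math


def eigen2(m):
    """Eigenvalues and eigenvectors of a 2x2 matrix whose top-right entry is non-zero."""
    (a, b), (c, d) = m
    trace, det = a + d, a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    values = [(trace + root) / 2, (trace - root) / 2]
    # The first row of (m - value).v = 0 reads (a - value).v0 + b.v1 = 0,
    # which v = [b, value - a] solves.
    vectors = [[b, value - a] for value in values]
    return values, vectors


def g(u, v):
    """Standard form: antilinear in u, linear in v."""
    return sum(y * complex(x).conjugate() for x, y in zip(u, v))


# A sample unitary: a rotation distorted by some phases (values are arbitrary).
phase = cmath.exp(0.2j)
h = [[phase * cmath.exp(0.5j) * math.cos(0.7), -phase * cmath.exp(0.9j) * math.sin(0.7)],
     [phase * cmath.exp(-0.9j) * math.sin(0.7), phase * cmath.exp(-0.5j) * math.cos(0.7)]]

values, vectors = eigen2(h)
determinant = values[0] * values[1]

print([abs(value) for value in values])   # magnitudes of the eigenvalues
print(abs(g(vectors[0], vectors[1])))     # size of the inner product of the eigenvectors
print(abs(determinant))                   # magnitude of the determinant
```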

Now, for any g-unitary f and any phase k, we have

T(g∘(k.f))∘(k.f) = *k.T(*k.g∘f)∘f = k.*k.T(g∘f)∘f = g

since k.*k = 1; so k.f is also g-unitary. When V has dimension n, the determinant of k.f is power(n, k) times that of f; so, when f's determinant is a phase, taking k to be any n-th root of its conjugate gives k.f determinant 1. We can thus express any such g-unitary f as the product of a phase and a g-unitary whose determinant is 1. Since the determinant of a product is the product of determinants of the factors, and 1 is self-inverse, the g-unitary maps with determinant 1 form a sub-group of the g-unitary maps; so we are able to express this group in terms of the group of phases and the group of its members with determinant 1. This last group is known as the special unitary group for g.

When V and g are those of U(n), this special unitary group is denoted SU(n). Notice that U(1) includes, for each positive natural n, n members which power(n) maps to 1: multiplying any member of U(n) with determinant 1 by any of these phases yields another member of U(n) with determinant 1. Thus the mapping (U(n): k.f ← [k,f] :Rene([U(1),SU(n)])) covers U(n) n times. It implicitly gives us an equivalence relation on SU(n) which treats two members as equivalent if each is the result of multiplying the other by a phase (one of our n-th roots of 1); each equivalence class then has exactly n members.
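For n = 2 the splitting into a phase and a member of SU(2) can be sketched as follows, in Python (standard library only). The helper split_phase is hypothetical; it relies on the fact that scaling a 2×2 matrix by k scales its determinant by k.k, so a suitable phase is either square root of the determinant. The sample f is an arbitrary member of U(2).

```python
import cmath
import math


def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def scale(k, m):
    """Multiply every entry of a matrix by the scalar k."""
    return [[k * x for x in row] for row in m]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry, within tol."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def split_phase(f):
    """Return (k, s) with f = k.s, where k is a phase and det(s) = 1."""
    k = cmath.sqrt(det2(f))   # det(k.s) = k.k.det(s), so k squares to det(f)
    return k, scale(1 / k, f)


# An arbitrary member of U(2): an overall phase times a rotation.
f = scale(cmath.exp(0.2j), [[math.cos(0.7), -math.sin(0.7)],
                            [math.sin(0.7), math.cos(0.7)]])

k, s = split_phase(f)
other_k, other_s = -k, scale(-1, s)   # the second of the two ways to split f

print(abs(k), det2(s), det2(other_s))
```

That both (k, s) and (−k, −s) reproduce f is the n = 2 instance of the n-fold covering described above.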

Example in 2-dimensions: U(2)

Consider the 2-dimensional vector space V = {list ({complex}:|2)}, with the obvious metric on it, g = (: (: u.(*x) +v.(*y) ←[u,v] :) ←[x,y] :).

Base Chart

A typical linear (V:|V) will be h = (: [a.x+b.y, c.x+d.y] ←[x,y] :), which is g-unitary iff, for all complex u, v, x, y:

u.(*x) +v.(*y)
= (a.u+b.v).*(a.x+b.y) +(c.u+d.v).*(c.x+d.y)
= (a.(*a) +c.(*c)).u.(*x) +(a.(*b) +c.(*d)).u.(*y) +(b.(*a) +d.(*c)).v.(*x) +(b.(*b) +d.(*d)).v.(*y)

whence we require

a.(*a)+c.(*c) = 1 = b.(*b)+d.(*d) and
a.(*b)+c.(*d) = 0 = b.(*a) +d.(*c).

Given that 0 = *0, the last pair of equalities contributes only one constraint: c.(*d) = −a.(*b). Taking the squared magnitude of each side gives c.d.*(c.d) = a.b.*(a.b), i.e. c.*c.d.*d = a.*a.b.*b; combined with the first pair of equalities, this implies

d.*d = d.*d.(a.*a+c.*c) = a.*a.(d.*d+b.*b) = a.*a,

whence c.*c = b.*b. From c.(*d) = −a.(*b) we can also infer that the phases of −c.b and a.d are equal; their magnitudes are c.*c = b.*b and a.*a = d.*d, respectively, whose sum is 1; so the determinant, a.d −b.c is just the phase of a.d and of −c.b.

Thus h's determinant is a phase (as expected), as are the ratio of its diagonal elements and the ratio of its off-diagonal elements; the magnitudes of a diagonal element and an off-diagonal element have squares which sum to 1, so can be taken to be the Sin and Cos of some angle.

We can thus express our g-unitary h in terms of an angle t and phases φ, j, k, giving a = φ.j.Cos(t), d = φ.Cos(t)/j, b = −φ.Sin(t)/k, c = φ.k.Sin(t) so that

h = (: φ.[x.Cos(t).j −y.Sin(t)/k, x.Sin(t).k +y.Cos(t)/j] ←[x, y] :)

is our general g-unitary (V:|V). The determinant of this is just φ.φ. The basic form is instantly recognizable as that of a rotation in two (real) dimensions, distorted by some phases. Scaling φ by −1 is equivalent to scaling both j and k by −1 or, equally, to adding a half turn to t; so the general form just given has some redundancy in it. Indeed, there's more: negating t is equivalent to negating k; combining this with the above gives yet more equivalences. Replacing φ and j with their inverses (i.e. conjugates) and negating t yields the inverse of h.
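The general form and the equivalences just listed can be checked numerically. This is a sketch in Python (standard library only); the function name unitary2 and the sample parameter values are illustrative assumptions, with the matrix rows taken directly from the formula for h above.

```python
import cmath
import math


def unitary2(phi, j, k, t):
    """Matrix of h = (: phi.[x.Cos(t).j - y.Sin(t)/k, x.Sin(t).k + y.Cos(t)/j] <- [x, y] :)."""
    return [[phi * j * math.cos(t), -phi * math.sin(t) / k],
            [phi * k * math.sin(t), phi * math.cos(t) / j]]


def matmul(a, b):
    """Multiply two square matrices held as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry, within tol."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


identity = [[1, 0], [0, 1]]
phi, j, k, t = cmath.exp(0.2j), cmath.exp(0.5j), cmath.exp(-0.9j), 0.7
h = unitary2(phi, j, k, t)
determinant = h[0][0] * h[1][1] - h[0][1] * h[1][0]

# Replacing phi and j by their conjugates and negating t should give the inverse.
inverse = unitary2(phi.conjugate(), j.conjugate(), k, -t)

print(close(matmul(dagger(h), h), identity))
print(abs(determinant - phi * phi))
print(close(matmul(inverse, h), identity))
```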

This gives us enough charts to cover U(2); fixing φ as 1 gives us charts of SU(2). Allow t to run over the full range of angles; but note that only half of that range is of interest, since negating h produces an equivalent member of SU(2). For the sake of definiteness, have t range from minus quarter turn to plus three quarter turns with attention on the first half, which is centred on t = zero. Take i as the canonical square root of −1, whose phase is turn/4. Allow each of j and k to run only over half the possible phases, say the ones with positive real part, plus i (but not −i). We are then able to get every member of SU(2) without duplication, aside from once via equivalence between the two halves of t's range; and we have the identity at the centre of the resulting global chart.

Lie Algebra

The Lie Algebra associated with a Lie Group is the tangent space to the Lie Group at its identity (the fibre, there, of its tangent bundle), equipped with an antisymmetric multiplication induced from the commutators of members of the Lie Group near the identity.

In the immediate neighbourhood of the identity, j = 1 = k and t = zero, we can identify small perturbations in t, j and k as changing h by small multiples of

p = (: [−y, x] ← [x, y] :)
q = (: [i.x, −i.y] ← [x, y] :)
r = (: [i.y, i.x] ← [x, y] :)

Notice that: t = turn/4 (making j irrelevant), k = 1 yields p = h; t = 0 (making k irrelevant), j = i yields q = h; and t = turn/4, k = i yields r = h; so all of these tangents at the identity are incidentally g-unitary in their own right. Further, p·q = r, q·r = p, r·p = q and p·p = q·q = r·r = −1. [Formally, this means we're dealing with the Quaternions.] These imply q·p = q·q·r = −r; similarly, p·r = −q and r·q = −p. When we come to take powers of members of span({p,q,r}), all cross-terms (multiplying two of the generators) show up in matched pairs with order swapped so cancel.
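The multiplication table is easy to confirm mechanically. Here is a sketch in Python (standard library only), assuming each of p, q, r is represented by the 2×2 matrix that acts on the column [x, y], and that · is matrix multiplication; the helper names are illustrative.

```python
def matmul(a, b):
    """Multiply two square matrices held as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def scale(k, m):
    """Multiply every entry of a matrix by the scalar k."""
    return [[k * x for x in row] for row in m]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry, within tol."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


identity = [[1, 0], [0, 1]]
p = [[0, -1], [1, 0]]      # [x, y] -> [-y, x]
q = [[1j, 0], [0, -1j]]    # [x, y] -> [i.x, -i.y]
r = [[0, 1j], [1j, 0]]     # [x, y] -> [i.y, i.x]

# Cyclic products, then the same products with the order swapped.
print(close(matmul(p, q), r), close(matmul(q, r), p), close(matmul(r, p), q))
print(close(matmul(q, p), scale(-1, r)),
      close(matmul(r, q), scale(-1, p)),
      close(matmul(p, r), scale(-1, q)))
```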

Next (for reasons arising from the general relationship between a Lie Group and its tangent bundle at its identity) pause to look at what the transcendental function exp turns each of our tangents into. Now, exp (at least for the purposes of the said general relationship) is defined by

exp = (: sum(: repeat(n, x) / factorial(n) ←n :{naturals}) ← x :)
i.e. exp(x) = 1 +x +x.x/2! +x.x.x/3! +x.x.x.x/4! +…

Thus, for x in {p, q, r} and z scalar, we have

exp(z.x) = 1 +z.x −z.z/2! −z.z.z.x/3! +… = cos(z) +x.sin(z)

just like for x = i. In particular, each of p, q, r thus gives us (by using it in place of i) an embedding of U(1), the complex plane's unit circle, in our group of unitary mappings. Now,

for j = 1 = k, we have h = Cos(t) +p.Sin(t) = exp(p.t/radian)
for t = zero, j = exp(i.e), we have h = exp(q.e); if we compose p after this, we get p·exp(q.e) as what h would be for t = turn/4, k = exp(i.e)
for scalar z, exp(z.r) = cos(z) +r.sin(z) corresponds to h with j = 1, t = z.radian and k = i.
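The closed form exp(z.x) = cos(z) +x.sin(z) can be compared against a truncated power series. The following Python sketch (standard library only) does so for each of p, q, r; the function names exp_series and closed_form, the truncation at 40 terms and the sample value of z are illustrative assumptions.

```python
import math


def matmul(a, b):
    """Multiply two square matrices held as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(k, m):
    """Multiply every entry of a matrix by the scalar k."""
    return [[k * x for x in row] for row in m]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry, within tol."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


identity = [[1, 0], [0, 1]]
p = [[0, -1], [1, 0]]
q = [[1j, 0], [0, -1j]]
r = [[0, 1j], [1j, 0]]


def exp_series(m, terms=40):
    """Sum the first `terms` terms of 1 + m + m.m/2! + m.m.m/3! + ..."""
    total = identity
    term = identity
    for count in range(1, terms):
        term = scale(1 / count, matmul(term, m))   # now m^count / count!
        total = add(total, term)
    return total


def closed_form(x, z):
    """cos(z) + x.sin(z), the claimed value of exp(z.x) when x.x = -1."""
    return add(scale(math.cos(z), identity), scale(math.sin(z), x))


z = 0.8
print([close(exp_series(scale(z, x)), closed_form(x, z)) for x in (p, q, r)])
```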

Next, observe

(p.e +q.z +r.d)·(p.e +q.z +r.d)
= −(e.e +z.z +d.d) +e.z.(p·q +q·p) +e.d.(p·r +r·p) +z.d.(q·r +r·q)
= −(e.e +z.z +d.d)

since p·q = r = −q·p and similar for the other lost terms. All even powers of p.e +q.z +r.d are then suitable powers of this scalar, and all odd powers are just p.e +q.z +r.d times suitable even powers. Thus

exp(p.e +q.z +r.d)
= 1 +p.e +q.z +r.d −(e.e +z.z +d.d)/2! −(p.e +q.z +r.d).(e.e +z.z +d.d)/3! +(e.e +z.z +d.d).(e.e +z.z +d.d)/4! +…
= 1 −(e.e +z.z +d.d)/2! +(e.e +z.z +d.d).(e.e +z.z +d.d)/4! −… +(p.e +q.z +r.d).(1 −(e.e +z.z +d.d)/3! +(e.e +z.z +d.d).(e.e +z.z +d.d)/5! −…)
= cos(√(e.e +z.z +d.d)) +(p.e +q.z +r.d).sinc(√(e.e +z.z +d.d))

where sinc is defined to be (: sin(x)/x ←x :{scalars}), smoothed at 0 to yield sinc(0) = 1, the value sinc(x) tends to as x tends to 0.
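The cos and sinc formula for the exponential of a general tangent can likewise be compared with a truncated power series. This Python sketch (standard library only) uses hypothetical helpers tangent, exp_series and exp_su2, with arbitrary sample coefficients.

```python
import math


def matmul(a, b):
    """Multiply two square matrices held as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(k, m):
    """Multiply every entry of a matrix by the scalar k."""
    return [[k * x for x in row] for row in m]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry, within tol."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


identity = [[1, 0], [0, 1]]
p = [[0, -1], [1, 0]]
q = [[1j, 0], [0, -1j]]
r = [[0, 1j], [1j, 0]]


def tangent(e, z, d):
    """The combination p.e + q.z + r.d."""
    return add(scale(e, p), add(scale(z, q), scale(d, r)))


def exp_series(m, terms=40):
    """Sum the first `terms` terms of the power series for exp(m)."""
    total = identity
    term = identity
    for count in range(1, terms):
        term = scale(1 / count, matmul(term, m))   # now m^count / count!
        total = add(total, term)
    return total


def sinc(x):
    """sin(x)/x, taking its limiting value 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x


def exp_su2(e, z, d):
    """cos(rho) + (p.e + q.z + r.d).sinc(rho), with rho the root of e.e + z.z + d.d."""
    rho = math.sqrt(e * e + z * z + d * d)
    return add(scale(math.cos(rho), identity), scale(sinc(rho), tangent(e, z, d)))


e, z, d = 0.3, -1.2, 0.5
print(close(exp_series(tangent(e, z, d)), exp_su2(e, z, d)))
```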
exp(p.e)·exp(q.z)
= (cos(e)+p.sin(e))·(cos(z)+q.sin(z))
= cos(e).cos(z) +p.sin(e).cos(z) +q.cos(e).sin(z) +r.sin(e).sin(z)

exp(q.z)·exp(p.e)
= (cos(z)+q.sin(z))·(cos(e)+p.sin(e))
= cos(e).cos(z) +p.sin(e).cos(z) +q.cos(e).sin(z) −r.sin(e).sin(z)
= exp(p.e)·exp(q.z) −2.r.sin(e).sin(z)

This derivation didn't depend on anything but the cyclicly permutable truths given above, so we can infer

exp(q.z)·exp(p.e) +2.sin(e).sin(z).r = exp(p.e)·exp(q.z)
exp(r.z)·exp(q.e) +2.sin(e).sin(z).p = exp(q.e)·exp(r.z)
exp(p.z)·exp(r.e) +2.sin(e).sin(z).q = exp(r.e)·exp(p.z)
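These three identities can be verified numerically in one loop. The sketch below, in Python (standard library only), uses the closed form cos +x.sin for each exponential; the helper expo and the sample values of e and z are illustrative assumptions.

```python
import math


def matmul(a, b):
    """Multiply two square matrices held as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(k, m):
    """Multiply every entry of a matrix by the scalar k."""
    return [[k * x for x in row] for row in m]


def close(a, b, tol=1e-12):
    """True if two matrices agree entry by entry, within tol."""
    return all(abs(x - y) <= tol
               for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


identity = [[1, 0], [0, 1]]
p = [[0, -1], [1, 0]]
q = [[1j, 0], [0, -1j]]
r = [[0, 1j], [1j, 0]]


def expo(x, angle):
    """exp(angle.x) = cos(angle) + x.sin(angle), valid when x.x = -1."""
    return add(scale(math.cos(angle), identity), scale(math.sin(angle), x))


e, z = 0.6, -1.3
checks = []
# Each triple (a, b, c) has a.b = c, matching the three identities in turn.
for a, b, c in ((p, q, r), (q, r, p), (r, p, q)):
    left = add(matmul(expo(b, z), expo(a, e)),
               scale(2 * math.sin(e) * math.sin(z), c))
    right = matmul(expo(a, e), expo(b, z))
    checks.append(close(left, right))

print(checks)
```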

Continuing with the derivation which yielded these, consider

r·exp(−q.z)·exp(−p.e)
= r·(cos(e).cos(z) −p.sin(e).cos(z) −q.cos(e).sin(z) −r.sin(e).sin(z))
= r.cos(e).cos(z) −q.sin(e).cos(z) +p.cos(e).sin(z) +sin(e).sin(z)

whence

1 −exp(q.z)·exp(p.e)·exp(−q.z)·exp(−p.e) = 2.sin(e).sin(z).(r.cos(e).cos(z) −q.sin(e).cos(z) +p.cos(e).sin(z) +sin(e).sin(z))

so that, for tiny values of e and z (each indistinguishable from its sin, and having cos indistinguishable from 1), the commutator exp(q.z)·exp(p.e)·exp(−q.z)·exp(−p.e) differs from the identity, to leading order, by −2.e.z.r, which is e.z.(q·p −p·q). Indeed, if we describe e.p +z.q +d.r as a unit precisely when √(e.e +z.z +d.d) is 1 and have two unit tangents S and T at the identity, whence S·S = −1 = T·T, and scalars s and t,

exp(s.S)·exp(t.T)·exp(−s.S)·exp(−t.T)
= (cos(s) +S.sin(s))·(cos(t) +T.sin(t))·(cos(s) −S.sin(s))·(cos(t) −T.sin(t))
= cos(s).cos(s).cos(t).cos(t) +cos(s).cos(s).sin(t).sin(t) +sin(s).sin(s).cos(t).cos(t) +(S·T −T·S).sin(s).sin(t).cos(s).cos(t) +(T·S·T +S).cos(s).sin(s).sin(t).sin(t) −(T +S·T·S).cos(t).sin(t).sin(s).sin(s) +S·T·S·T.sin(s).sin(s).sin(t).sin(t)

which, for tiny values of s and t, differs from the identity simply by s.S·t.T −t.T·s.S plus terms with at least one more factor of s or t than this one.
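The leading-order behaviour of this commutator can be seen numerically. In the following Python sketch (standard library only), the unit tangents S and T, the small values of s and t, and the helper names are all illustrative assumptions; the residual left after subtracting the identity and the claimed leading term should be of higher order than s.t.

```python
import math


def matmul(a, b):
    """Multiply two square matrices held as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(k, m):
    """Multiply every entry of a matrix by the scalar k."""
    return [[k * x for x in row] for row in m]


def largest(m):
    """Largest magnitude among the entries of a matrix."""
    return max(abs(x) for row in m for x in row)


identity = [[1, 0], [0, 1]]
p = [[0, -1], [1, 0]]
q = [[1j, 0], [0, -1j]]
r = [[0, 1j], [1j, 0]]


def tangent(e, z, d):
    """The combination p.e + q.z + r.d."""
    return add(scale(e, p), add(scale(z, q), scale(d, r)))


def expo(unit, angle):
    """exp(angle.unit) = cos(angle) + unit.sin(angle), valid when unit.unit = -1."""
    return add(scale(math.cos(angle), identity), scale(math.sin(angle), unit))


S = tangent(1 / 3, 2 / 3, 2 / 3)    # unit: squares of the coefficients sum to 1
T = tangent(2 / 3, -2 / 3, 1 / 3)   # another unit tangent
s, t = 1e-3, 2e-3

commutator = matmul(matmul(expo(S, s), expo(T, t)),
                    matmul(expo(S, -s), expo(T, -t)))
leading = scale(s * t, add(matmul(S, T), scale(-1, matmul(T, S))))
residual = add(commutator, scale(-1, add(identity, leading)))

print(largest(leading), largest(residual))
```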

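Further notes on the tangent space at the identity, su(2) = span({p, q, r}): we can define an inner product by G(x,y) = −trace(x·y)/2, which is diagonalized by the list T = [p, q, r] (here re-using the name T for this list, not for transposition), yielding su(2) = (: sum(: −trace(T(i)·x).T(i)/2 ←i |3) ←x :su(2)) (with i here an index, not the square root of −1), i.e. each x in su(2) is the sum of its components along p, q and r; and (allegedly) exp only needs its inputs' co-ordinates to (each) have range 2.π to satisfy (SU(2)|exp:su(2)).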
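The claim that [p, q, r] diagonalizes this inner product, and that it recovers the co-ordinates of a tangent, can be checked as follows. This is a sketch in Python (standard library only); the names inner, gram and coords, and the sample coefficients, are illustrative assumptions.

```python
def matmul(a, b):
    """Multiply two square matrices held as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def scale(k, m):
    """Multiply every entry of a matrix by the scalar k."""
    return [[k * x for x in row] for row in m]


def inner(x, y):
    """The inner product G(x, y) = -trace(x.y)/2."""
    product = matmul(x, y)
    return -(product[0][0] + product[1][1]) / 2


p = [[0, -1], [1, 0]]
q = [[1j, 0], [0, -1j]]
r = [[0, 1j], [1j, 0]]
basis = [p, q, r]

# Gram matrix of the basis: expected to be the 3x3 identity.
gram = [[inner(a, b) for b in basis] for a in basis]

# Recover the co-ordinates of a sample tangent from its inner products.
e, z, d = 0.3, -1.2, 0.5
x = add(scale(e, p), add(scale(z, q), scale(d, r)))
coords = [inner(b, x) for b in basis]

print(gram)
print(coords)
```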

Written by Eddy.