## Affine connection for axial symmetry

Suppose you have an axially symmetric vector field. Can we define an affine connection which keeps the vectors “parallel”, under rotation about the axis? For example, we wish the vectors illustrated below to get parallel-transported around the circle:

Take Minkowski spacetime in cylindrical coordinates $(t, r, \phi, z)$, with metric $ds^2 = -dt^2 + dr^2 + r^2\,d\phi^2 + dz^2$, and consider a vector field whose components are independent of the angle $\phi$:

The covariant derivative in the tangential direction has components:

We want this to vanish, but first a quick recap (Lee §4, 5). Recall a connection is defined by $\nabla_{\partial_i}\partial_j = \Gamma^k_{ij}\,\partial_k$, in terms of our coordinate vector frame $(\partial_i)$. This extends to a covariant derivative of arbitrary vectors and tensors, also denoted “$\nabla$”. The derivative of the vector field above assumed the Levi–Civita connection, which is inherited from the metric: it is the unique symmetric, metric-compatible connection. In that case the $\Gamma^k_{ij}$ are called Christoffel symbols, but in general they are called connection coefficients.

The offending Christoffel symbols, which prevent our vector field from being parallel-transported, are (with the first lower index labelling the direction of differentiation) $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{\phi r} = 1/r$. But we are free to simply define a new connection for which these vanish! Given a frame, any set of smooth functions $\Gamma^k_{ij}$ yields a valid connection (Lee, Lemma 4.10). It is natural to hold on to the other Christoffel symbols, to accord whatever respect remains for the metric. In fact only one is nonzero, $\Gamma^\phi_{r\phi} = 1/r$. To set this to zero would essentially deny the increase in circumference with the radius. Incidentally, even when keeping $\Gamma^\phi_{r\phi}$, the new connection is flat, meaning its associated Riemann curvature tensor vanishes.
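The flatness claim can be checked numerically. The following is a minimal sketch, not a definitive implementation: it works only in the $(r, \phi)$ plane (the $t$ and $z$ directions have no nonzero coefficients), and it assumes the convention $\nabla_{\partial_i}\partial_j = \Gamma^k_{ij}\partial_k$ together with the cylindrical values $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{\phi r} = \Gamma^\phi_{r\phi} = 1/r$. All function names are illustrative.

```python
# Coordinates are (r, phi), indexed 0 and 1. G[k][i][j] stores Gamma^k_{ij},
# with the ASSUMED convention nabla_{d_i} d_j = Gamma^k_{ij} d_k.

def gamma_new(r, phi):
    """Modified connection: only Gamma^phi_{r phi} = 1/r is kept."""
    G = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(2)]
    G[1][0][1] = 1.0 / r
    return G

def gamma_lc(r, phi):
    """Levi-Civita connection of the flat metric dr^2 + r^2 dphi^2."""
    G = gamma_new(r, phi)
    G[0][1][1] = -r        # Gamma^r_{phi phi}
    G[1][1][0] = 1.0 / r   # Gamma^phi_{phi r}
    return G

def riemann(gamma, r, phi, h=1e-5):
    """R[i][j][k][m]: the d_m component of R(d_i, d_j) d_k (central differences)."""
    x = (r, phi)
    G = gamma(r, phi)
    dG = []  # dG[i][m][j][k] approximates d_i Gamma^m_{jk}
    for i in range(2):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        Gp, Gm = gamma(*xp), gamma(*xm)
        dG.append([[[(Gp[m][j][k] - Gm[m][j][k]) / (2 * h) for k in range(2)]
                    for j in range(2)] for m in range(2)])
    R = [[[[0.0, 0.0] for _ in range(2)] for _ in range(2)] for _ in range(2)]
    for i in range(2):
        for j in range(2):
            for k in range(2):
                for m in range(2):
                    quad = sum(G[l][j][k] * G[m][i][l] - G[l][i][k] * G[m][j][l]
                               for l in range(2))
                    R[i][j][k][m] = dG[i][m][j][k] - dG[j][m][i][k] + quad
    return R

def torsion(gamma, r, phi):
    """T[k][i][j] = Gamma^k_{ij} - Gamma^k_{ji} in a coordinate frame."""
    G = gamma(r, phi)
    return [[[G[k][i][j] - G[k][j][i] for j in range(2)] for i in range(2)]
            for k in range(2)]

def max_abs(nested):
    """Largest absolute value in an arbitrarily nested list of floats."""
    if isinstance(nested, list):
        return max(max_abs(item) for item in nested)
    return abs(nested)

print("max |Riemann|, new connection:", max_abs(riemann(gamma_new, 2.0, 0.3)))
print("max |Riemann|, Levi-Civita:   ", max_abs(riemann(gamma_lc, 2.0, 0.3)))
print("max |torsion|, new connection:", max_abs(torsion(gamma_new, 2.0, 0.3)))
```

Both curvatures come out as zero up to finite-difference error. The torsion of the new connection does not vanish, which is consistent with the Levi–Civita connection being the unique symmetric, metric-compatible one.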

The new connection may be expressed as the Levi–Civita one with a bilinear correction:

where and are arbitrary vectors, to be substituted into the 2nd and 3rd slots respectively of the (1,2)-tensor in parentheses. This is much simpler than it looks, as the terms just pick out and -components, and return basis vectors. Equivalently, the correction may be written , where the angle brackets mean contraction of a 1-form and vector. Notice from here and the two “offending” Christoffel symbols mentioned earlier, that only (the component of) the derivative in the –direction is affected.
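Written out in components, the correction takes the following form. This is a sketch under assumed notation, none of which is confirmed by the original: $X$ is the differentiation direction, $Y$ the vector being differentiated, $\tilde\nabla$ the new connection, and the coefficients follow the convention $\nabla_{\partial_i}\partial_j = \Gamma^k_{ij}\partial_k$ with $\Gamma^r_{\phi\phi} = -r$ and $\Gamma^\phi_{\phi r} = 1/r$ set to zero.

```latex
\begin{aligned}
\tilde\nabla_X Y - \nabla_X Y
  &= X^i Y^j \bigl(\tilde\Gamma^k_{ij} - \Gamma^k_{ij}\bigr)\,\partial_k \\
  &= X^\phi \Bigl( r\,Y^\phi\,\partial_r - \tfrac{1}{r}\,Y^r\,\partial_\phi \Bigr) \\
  &= X^\phi \bigl( Y^{\hat\phi}\, e_{\hat r} - Y^{\hat r}\, e_{\hat\phi} \bigr),
\qquad e_{\hat r} = \partial_r,\quad e_{\hat\phi} = \tfrac{1}{r}\,\partial_\phi .
\end{aligned}
```

In the orthonormal frame of the last line, $Y^{\hat r} = Y^r$ and $Y^{\hat\phi} = r\,Y^\phi$. The bracket is antisymmetric in the two frame directions, so it generates a rotation in the plane they span, scaled by the $\phi$-component of the differentiation direction.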

These expressions obscure some beautiful symmetry. Let’s raise one index and lower another, in the correction term:

Here is a wedge product, is just the 1-form with components , and “” is a contraction. The correction’s components are simply . This is a vector, even though some lowered indices appear in the expression. The correction is just a rotation in the –plane! From inspection of the diagram, this is unsurprising.

This is analogous to Fermi–Walker transport. Given a worldline, this corrects the (Levi–Civita connection) time-derivative by a rotation in the plane spanned by the 4-velocity and the 4-acceleration vector. Under Fermi–Walker transport, orthonormal frames stay orthonormal over time, and their orientation agrees with gyroscopes. For both our connection and the Fermi–Walker derivative, there is a preferred differentiation direction, along which a rotation is added to the Levi–Civita derivative.
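For comparison, the Fermi–Walker derivative is commonly written as below for signature $-+++$. The symbols are assumptions rather than the original notation: $u$ is the 4-velocity of the worldline, $a = \nabla_u u$ its 4-acceleration, $\tau$ the proper time, and angle brackets denote the metric inner product.

```latex
\frac{D_F X}{d\tau}
  = \nabla_u X \;-\; \langle a, X\rangle\, u \;+\; \langle u, X\rangle\, a
```

As a quick check, setting $X = u$ gives $a - 0 + \langle u, u\rangle\, a = a - a = 0$, so the 4-velocity is Fermi–Walker transported along its own worldline. The two added terms are antisymmetric under swapping $u$ and $a$, which is what makes the correction a rotation in the plane they span.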

I previously wrote about a connection for a spherically symmetric vector field. This has been a good learning experience about connections other than Levi–Civita. Many of us completed general relativity courses in which the curvature quantities were merely formulae with no intuitive understanding. However, questions from mathematicians like “Which connection are you using?” prompted me to learn more. (At least I have never been asked which differential structure I am using, nor which point-set topology, which is fortunate for all involved. 🙂 ) There are various physically-motivated connections defined in research paper §2. I intend to apply this to the rotating disc, and to an observer field in Schwarzschild spacetime. Also, I accidentally came across Rothman+ 2001 about parallel transport in Schwarzschild spacetime, with numerous followup papers by various authors. All of this struck me again with a sense of fascination about curvature: how rich and deep it is.


## Local Lorentz boost in coordinate-independent notation

The Lorentz boost between two reference frames can be expressed as a (1,1)-tensor, interpreted as an operator on vectors. Here we re-express this well-known fact using a general, index-free, coordinate-independent, 4-vector notation, which is valid locally in curved spacetime.

Recall the prototypical Lorentz boost on Minkowski spacetime:

This is a boost in the -direction by speed or Lorentz factor . It maps an arbitrary vector to . Numerous authors generalise to arbitrary boost directions, such as Møller  §18; MTW  §2.9; or Tsamparlis  §1.7. This typically involves separate transformations of time and 3-dimensional space: and . The arrows signify 3-dimensional vectors, is the position in 3-space, and is the relative 3-velocity. The space part uses beautiful, coordinate-independent vector language. However the time part requires privileged coordinates adapted to the observers. We will derive a 4-vector analogue.

Consider two 4-velocity vectors and (located at the same point, if in curved spacetime). They are related by the Lorentz boost:

where , the unit vector points in the boost direction, and is the relative velocity. This is the 4-vector analogue of the familiar coordinate boost . Combined with the space boost given shortly, this forms a local Lorentz transformation. While the plus sign makes the above appear an inverse boost, this is only because vectors (as whole entities) transform inversely to coordinates. Rearranging:

This is the relative velocity of the observer as determined in ’s frame, as explained previously. It is equivalent to the introduced in the 3-dimensional spatial transformation, except now treated as a 4-vector. It is orthogonal to , with length . Conversely, the relative velocity of as determined in ’s frame is . Now, the vector analogue of the usual boosted spatial coordinate is . After multiplying by :

Hence the relative velocity vectors are boosted into one another, aside from a minus sign (Jantzen+ 1992 §4). This generalises the Newtonian result . So we have the boost’s action on the orthogonal vectors and , plus it is the identity on the 2-dimensional spatial plane orthogonal to both, hence:

using and . It is a good exercise to check the contractions with , , or any orthogonal to both. In index-free notation,

The “flat” symbol just means: lower an index. Equivalently, in terms of the initial observer and boost velocity alone:

in which case the relative speed may be obtained from . This is equivalent to MTW’s Exercise 2.7 which uses Cartesian coordinates, after adjusting various minus signs because I use vectors not vector components. In terms of the 4-velocities alone, we have the curiously symmetric expression:

Formulae are useful machines, allowing you to blithely turn the handle to crank out a result. This contrasts with my usual emphasis on conceptual understanding, and drawing a picture (at least mentally). However, Lorentz boosts have many counter-intuitive or seemingly paradoxical effects. It is easier to make a mistake if you reason from first principles alone. Of course, the algebra does originate from careful thinking about foundations, and having multiple approaches is a check of consistency.
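Turning the handle can also be done numerically. The sketch below assumes a standard form of the boost in terms of two 4-velocities $u$ and $u'$, namely $B(X) = X + \frac{\langle u + u', X\rangle}{1+\gamma}(u + u') - 2\langle u, X\rangle u'$ with $\gamma = -\langle u, u'\rangle$ and signature $-+++$. This may differ in presentation from the expression intended above, and the names `boost`, `v`, `v_prime` and `w` are illustrative. It uses Cartesian components in Minkowski spacetime purely for convenience.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal Minkowski metric, signature -+++

def dot(a, b):
    """Metric inner product of two 4-vectors given by Cartesian components."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))

def combine(*terms):
    """Linear combination of 4-vectors: combine((c1, v1), (c2, v2), ...)."""
    return tuple(sum(c * vec[i] for c, vec in terms) for i in range(4))

def boost(u, u_prime, x):
    """Apply to x the boost taking 4-velocity u to u_prime (ASSUMED formula)."""
    gamma = -dot(u, u_prime)
    s = combine((1.0, u), (1.0, u_prime))
    return combine((1.0, x),
                   (dot(s, x) / (1.0 + gamma), s),
                   (-2.0 * dot(u, x), u_prime))

def close(a, b, tol=1e-12):
    """Component-wise comparison of two 4-vectors."""
    return all(abs(x - y) < tol for x, y in zip(a, b))

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta**2)
u = (1.0, 0.0, 0.0, 0.0)
n = (0.0, 0.6, 0.8, 0.0)                          # unit boost direction, orthogonal to u
u_prime = combine((gamma, u), (gamma * beta, n))  # u' = gamma (u + beta n)

v = combine((1.0 / gamma, u_prime), (-1.0, u))        # velocity of u' relative to u
v_prime = combine((1.0 / gamma, u), (-1.0, u_prime))  # velocity of u relative to u'
w = (0.0, 0.8, -0.6, 0.0)                             # orthogonal to both u and u'

print("u is boosted to u':", close(boost(u, u_prime, u), u_prime))
print("v is boosted to -v':", close(boost(u, u_prime, v), combine((-1.0, v_prime))))
print("w is left unchanged:", close(boost(u, u_prime, w), w))
print("<v, u> =", dot(v, u), " |v| =", math.sqrt(dot(v, v)))
```

The checks mirror the properties described above: the boost maps one 4-velocity to the other, swaps the relative velocities up to a sign, and acts as the identity on the spatial plane orthogonal to both.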

Boosts are paramount for comparing physical quantities between frames. Some textbooks present the general Lorentz boost in Minkowski spacetime with Cartesian coordinates. Our abstract vector formulation allows direct application to local boosts in arbitrary spacetime, such as Kerr or FLRW, in any coordinate system. I don’t remember seeing the formulae here in the literature, but someone has surely derived them somewhere. The Jantzen+ paper was an inspiration, and the same authors define various further quantities (projections, in fact) in Bini+ 1995.


Last time we discussed the “spatial gradient” or “3-gradient”, and here we follow up with two examples. Recall from before that a scalar field has gradient , and the part of this which is orthogonal to an observer 4-velocity is, as a vector:

This direction has the greatest increase of , for any vector in ’s 3-space (that is, orthogonal to ), per length of the vector.

As an example, suppose the 4-gradient vector is a null, future-pointing vector. It can be decomposed , where , and is a unit spatial vector orthogonal to . Physically, this gradient may be interpreted as a null wave or photon, which the observer determines to have energy (or related quantity, such as frequency) , and to move in the spatial direction . The 3-gradient vector is , hence the direction of relative velocity also has the steepest increase of , within the observer’s 3-space.

Suppose now is a unit, timelike, future-pointing vector, so that we may interpret it as the 4-velocity of a second observer. Then , where is the Lorentz factor between the pair. But we also have the “relative velocity” decomposition , where is the relative velocity of as determined in ’s frame, as I discussed previously. Combining these, . Hence within the observer’s 3-space, again increases most sharply in the direction of the relative velocity.
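Both examples can be reproduced with a few lines of arithmetic. The sketch below assumes signature $-+++$, takes the observer $u$ at rest in Cartesian Minkowski components, and projects a vector $X$ orthogonally to $u$ via $X + \langle u, X\rangle u$. The symbols `E`, `n` and `v`, and the sample numbers, are illustrative choices rather than values from the text.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal Minkowski metric, signature -+++

def dot(a, b):
    """Metric inner product of two 4-vectors."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))

def spatial_part(x, u):
    """Part of x orthogonal to the unit timelike vector u: x + <u, x> u."""
    c = dot(u, x)
    return tuple(xi + c * ui for xi, ui in zip(x, u))

def close(a, b, tol=1e-12):
    """Component-wise comparison of two 4-vectors."""
    return all(abs(p - q) < tol for p, q in zip(a, b))

u = (1.0, 0.0, 0.0, 0.0)   # observer 4-velocity
n = (0.0, 0.6, 0.8, 0.0)   # unit spatial direction, orthogonal to u

# Example 1: null, future-pointing gradient vector k = E (u + n).
E = 2.0
k = tuple(E * (a + b) for a, b in zip(u, n))
grad3_null = spatial_part(k, u)            # expected to equal E n

# Example 2: gradient vector equal to a second 4-velocity u' = gamma (u + v).
beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta**2)
v = tuple(beta * c for c in n)             # relative velocity, orthogonal to u
u_prime = tuple(gamma * (a + b) for a, b in zip(u, v))
grad3_timelike = spatial_part(u_prime, u)  # expected to equal gamma v

print("null case, 3-gradient:", grad3_null)
print("timelike case, 3-gradient:", grad3_timelike)
```

In both cases the projected vector is a positive multiple of the relative velocity direction, in agreement with the conclusion above.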

The figure shows the single tangent space — think of this as the linearisation of what is happening locally over the manifold itself. The hyperplanes are numbered by , where only the differences between them are relevant, as an overall constant was not specified. Observe crosses four of them, spanning an interval , so is the negative of ’s proper time; see a previous post for more background. In both our examples, the scalar decreases towards the future (or can vanish in the null case), even though the gradient vectors are future-pointing. That is, the gradient vectors actually point “down” the slope! This quirk is due to our −+++ metric signature, and would apply to spacelike gradients if +−−− were used instead. This really hurt my brain, until I drew the diagram. 🙁

To construct it, consider the action of on the axes. The horizontal axis is the relative velocity direction, with unit vector . One can show . Also , but I find it easier to think of: . These give the number of hyperplanes crossed by the unit axes vectors, then you can literally “connect the dots” since the 1-form is linear. In the figure , so . (As for the 3-gradient, it vanishes in the direction, hence must cross no contours of . It would be drawn as vertical lines, with corresponding vector pointing to the right.)

Most of our discussion applies to arbitrary 1-forms, not just gradients, which are termed exact 1-forms. I derived the work here independently, but the literature contains some similar material. It turns out Jantzen, Carini & Bini 1992 §2 explicitly define the “spatial gradient”, as they most appropriately call it. A few textbooks discuss scalar waves, for which the 3-gradient vector is the wave 3-vector, which is orthogonal to the wavefronts within a given frame, as discussed shortly.

## Spatial gradient of a scalar

Suppose you have a scalar field , and at a given point in spacetime: a 4-velocity vector interpreted as an “observer”. In which direction does increase most steeply, when restricted to the observer’s local 3-dimensional space?

Last time I reviewed the gradient 1-form or covector, and its associated gradient vector obtained by raising the index as usual. The gradient vector has been described as the direction of greatest increase in the scalar per unit length (Schutz 2009 §3.3). However, this is only guaranteed when the metric is positive definite, meaning a Riemannian manifold, rather than a Lorentzian manifold as used to model spacetime.

The observer’s 4-velocity splits vectors and 1-forms into purely “time” parts parallel to , and purely “space” parts orthogonal to it. (Intuitively, it may help to think of a basis adapted to the observer, meaning , and the vectors are orthogonal to , where . Then a purely spatial vector is spanned by the . Since vectors and covectors are linear, we need only specify their values on a basis set.)

Consider the tangent space at the specified point. Imagine working within the observer’s local 3-space, by which I mean the 3-dimensional subspace consisting of vectors orthogonal to . Label the gradient as restricted to this subspace by . On the subspace the metric has Riemannian signature, hence the corresponding vector is the direction of steepest increase. We can mimic this mathematically by staying in 4 dimensions, but setting the “time” part to zero:

This is a 4-dimensional object, but I reuse the notation “” to imply it vanishes in the observer’s time direction. This “3-gradient” is the projection of orthogonal to . The angle brackets signify contraction of the 1-form and vector, and the “flat” symbol denotes the 1-form obtained from by “lowering the index” using the metric. The vector 3-gradient is:

This follows from “raising the index” using the inverse metric as usual. Note that on the subspace, the inverse metric coincides with the inverse 3-metric which has components , for . Equivalently, one can apply the spatial projector to either or , with the same result. This projector agrees with the inverse metric on the 3-space, and is zero on purely timelike covectors. Either way, the essential part of the process is to remove the “time” component of the gradient. I will give examples in the following post.
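The steepest-ascent property can be tested by brute force. The following is a minimal sketch, assuming signature $-+++$ and the construction described above: raise the index of the 1-form, then add its contraction with $u$ times $u$ to remove the time part. The observer, the 1-form components `df` and the sample count are arbitrary illustrative choices. The sketch samples random unit vectors in the observer's 3-space and compares their contraction with the 1-form against that of the normalised 3-gradient.

```python
import math
import random

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal Minkowski metric, signature -+++

def dot(a, b):
    """Metric inner product of two 4-vectors."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))

def contract(form, vec):
    """Contraction of a 1-form (lower components) with a vector."""
    return sum(f * x for f, x in zip(form, vec))

def sharp(form):
    """Raise the index; for this diagonal metric the inverse has the same entries."""
    return tuple(g * f for g, f in zip(ETA, form))

def three_gradient(form, u):
    """Vector 3-gradient: raised 1-form with its part along u removed."""
    time_part = contract(form, u)
    return tuple(g + time_part * ui for g, ui in zip(sharp(form), u))

beta = 0.5
gamma = 1.0 / math.sqrt(1.0 - beta**2)
u = (gamma, gamma * beta, 0.0, 0.0)           # a moving observer
frame = [(gamma * beta, gamma, 0.0, 0.0),     # orthonormal basis of u's 3-space
         (0.0, 0.0, 1.0, 0.0),
         (0.0, 0.0, 0.0, 1.0)]
df = (0.3, 1.0, -2.0, 0.5)                    # arbitrary 1-form components

g = three_gradient(df, u)
g_norm = math.sqrt(dot(g, g))
g_hat = tuple(x / g_norm for x in g)
steepest = contract(df, g_hat)                # slope along the unit 3-gradient

rng = random.Random(0)
best_sampled = -math.inf
for _ in range(2000):
    c = [rng.gauss(0.0, 1.0) for _ in range(3)]
    length = math.sqrt(sum(x * x for x in c))
    e = tuple(sum(ci / length * f[i] for ci, f in zip(c, frame)) for i in range(4))
    best_sampled = max(best_sampled, contract(df, e))

print("<g, u> (should vanish):", dot(g, u))
print("slope along 3-gradient:", steepest, " best sampled slope:", best_sampled)
```

No sampled spatial direction beats the 3-gradient, and the slope along it equals its length, as expected on a subspace where the metric is positive definite.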

Suppose a scalar field is defined on some region of spacetime. Its gradient expresses the change in (that is, its derivative) in each direction. In a coordinate system, it has components:

This gradient is a 1-form or covector. [Recall a 1-form is just a (0,1)-tensor. Schutz 2009 also uses the term dual vector, though I find this can lead to clumsy wording, such as the hypothetical phrase: “the vector [which is] dual to a dual vector”. Traditionally the term covariant vector has been used, meaning its components transform “covariantly” with a change of basis. 1-forms are a rigorous version of differentials, superseding the older idea of infinitesimals but using similar notation (Schutz 1980 §2.19; Spivak vol. 1 §4).] Above, “$d$” is called the exterior derivative, and $\nabla$ is the covariant derivative, but when acting on a scalar these coincide. Recall a 1-form accepts a vector and returns a number. In this case, the vector is the direction of differentiation, and the output is the derivative of the scalar in that direction (where the vector’s magnitude matters also).

The 1-form may be visualised as a set of hypersurfaces or level sets , on the manifold (MTW  §2.5–2.7, Box 4.4; Schutz 2009 §3.3). Ideally these could be spaced at intervals . Given some vector , the contraction:

is visualised as the number of hypersurfaces the vector pierces, or “bongs of [a] bell” in MTW’s colourful terminology. Technically however, vectors and 1-forms exist in the (co-)tangent spaces, not extended along the manifold. At any given point, is the linear approximation to , ignoring the constant term (MTW §2.6). Hence is more accurately visualised as hyperplanes within the tangent space there. The diagram below shows both artistic choices. Note in two dimensions, hypersurfaces and hyperplanes are just curves and straight lines, respectively.

The gradient vector is the dual to , with components obtained by raising the index in the usual way: . This may be elegantly written , where the “sharp” symbol is part of the “musical isomorphism” notation. While the gradient is usually first encountered as a vector, it is most naturally a 1-form, as this does not require a metric (MTW §9.4). As Schutz 2009 §3.3 explains:

…we in general cannot call a gradient a vector. We would like to identify the vector gradient as that vector pointing ‘up’ the slope, i.e. in such a way that it crosses the greatest number of contours per unit length. The key phrase is ‘per unit length’. If there is a metric, a measure of distance in the space, then a vector can be associated with a gradient. But the metric must intervene here in order to produce a vector. Geometrically, on its own, the gradient is a one-form.

But if one does not know how to compare the lengths of vectors that point in different directions, one cannot define a direction of steepest ascent…

The last line is from Schutz 1980 §2.19, where the discussion is similar. These textbooks give a superb introductory account of 1-forms; however, the steepness comments are only valid for a Riemannian metric, with positive-definite signature. Consider Minkowski spacetime with coordinates . By linearity, we need only consider unit vectors. The 1-form has components , with just . These contract to give unity. If we restrict to vectors spanned by , and , Schutz’ steepness comments apply. However is also a unit spacelike vector, where , but combines with to give , hence crosses more intervals than the gradient vector does. Similarly for , the contraction with returns , but with the unit timelike vector yields . Hence for a timelike 1-form, its gradient vector crosses the fewest contours (taking the absolute value) per unit length, compared to other timelike vectors only. For a null 1-form, its gradient vector lies along the hyperplanes, so crosses zero of them (MTW Figure 2.7)!
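A numerical version of this counterexample is sketched below. It assumes Cartesian coordinates $(t, x, y, z)$, signature $-+++$, the 1-forms $dx$ and $dt$, and boosted unit vectors parametrised by a speed $\beta$ with $\gamma = (1-\beta^2)^{-1/2}$. These specific choices, and the names `tilted_space` and `tilted_time`, are illustrative and may differ from the symbols originally used.

```python
import math

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal Minkowski metric, signature -+++

def dot(a, b):
    """Metric inner product of two 4-vectors."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))

def contract(form, vec):
    """Contraction of a 1-form (lower components) with a vector."""
    return sum(f * x for f, x in zip(form, vec))

def sharp(form):
    """Raise the index; for this diagonal metric the inverse has the same entries."""
    return tuple(g * f for g, f in zip(ETA, form))

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta**2)

# Spacelike 1-form dx: its gradient vector is the unit vector d/dx.
dx = (0.0, 1.0, 0.0, 0.0)
tilted_space = (gamma * beta, gamma, 0.0, 0.0)    # another unit spacelike vector
print("dx on its gradient vector:", contract(dx, sharp(dx)))
print("dx on tilted unit vector: ", contract(dx, tilted_space),
      " norm^2 =", dot(tilted_space, tilted_space))

# Timelike 1-form dt: its gradient vector is minus d/dt.
dt = (1.0, 0.0, 0.0, 0.0)
tilted_time = (gamma, gamma * beta, 0.0, 0.0)     # another unit timelike vector
print("|dt on its gradient vector|:", abs(contract(dt, sharp(dt))))
print("|dt on tilted unit vector|: ", abs(contract(dt, tilted_time)),
      " norm^2 =", dot(tilted_time, tilted_time))

# Null 1-form: its gradient vector lies in its own hyperplanes.
null_form = (-1.0, 1.0, 0.0, 0.0)
print("null 1-form on its gradient vector:", contract(null_form, sharp(null_form)))
```

The tilted unit vectors cross more contours (in absolute value) than the gradient vectors do, by a factor of $\gamma$, while the null gradient vector crosses none.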

Instead, we are left with saying the gradient vector is orthogonal to all vectors on which the 1-form vanishes: whenever . The angle brackets mean contraction using the metric, with indices appropriately raised or lowered. Another property is that the gradient vector’s squared norm equals the 1-form’s squared norm, which also matches the number of contours crossed:

The above statements are basically tautologies, but they help clarify what metric duality means. Incidentally, not all 1-forms arise as the “” of a scalar, but only those termed exact (Wald  §B1). Most of this post applies also to arbitrary 1-forms , for which the hyperplanes are spanned by vectors satisfying . For many creative illustrations see MTW, including their “honeycomb” and “egg crate” analogies for 2-forms and 3-forms, and their Figure 4.5 for the 2-form . Finally, I previously reviewed contractions like , which give the rate of change of the scalar by proper time along a worldline.