Relative velocity in general relativity

Difficulty level:   ★ ★ ★

Suppose we have two 4-velocity vectors \mathbf u and \mathbf v at the same point in curved spacetime. (This avoids complications such as parallel transport. Intuitively, think of the two objects as not necessarily overlapping, but close enough that we can neglect curvature etc.)

Consider firstly inertial frames in Minkowski spacetime. Using coordinates (t,x,y,z) corresponding to some observer \mathbf v, the components of a different observer \mathbf u satisfy:

    \[u^\mu = \frac{dx^\mu}{d\tau} = \frac{dt}{d\tau}\frac{dx^\mu}{dt} = \gamma(1,\beta_x,\beta_y,\beta_z).\]

Here \tau is \mathbf u’s proper time, \gamma := -\mathbf u\cdot\mathbf v := -g_{\mu\nu}u^\mu v^\nu is the Lorentz factor as I have discussed previously, and the \beta_i are the relative speeds in the coordinate directions. This calculation is inspired by Tsamparlis 2010 §6.2.

[The expression dt/d\tau = \gamma bothered me, because time-dilation is mutual, so one might argue a case for \gamma^{-1} instead. But the key point is, the derivative occurs along the direction of \mathbf u. Another way to check the expression is to write dt/d\tau = dt(\mathbf u) = -\mathbf v^\flat(\mathbf u) = \gamma. This is a contraction of the 1-form dt with the vector \mathbf u, as I explained previously. The “flat” symbol just means dt is the 1-form dual to -\mathbf v. Conversely, along \mathbf v we have d\tau/dt = -\mathbf u^\flat(\mathbf v) = \gamma, which is not a contradiction!]

With a view to generalisation, we re-express the earlier displayed formula using vectors in place of coordinate components: \mathbf u = \gamma(\mathbf v+\mathbf u_\textrm{rel}). This is also more elegant. The reader may find better notation than \mathbf u_\textrm{rel}, but this is the relative velocity of \mathbf u from \mathbf v’s frame. Rearranging,

    \[\boxed{\mathbf u_\textrm{rel} = \gamma^{-1}\mathbf u - \mathbf v.}\]

This vector lies in the local 3-space of \mathbf v, since \mathbf v\cdot\mathbf u_\textrm{rel} = 0, so in particular \mathbf u_\textrm{rel} is spatial. It has length \beta, which is the overall relative speed, and satisfies \gamma = (1-\beta^2)^{-1/2}. If you want, there is also a decomposition \mathbf u_\textrm{rel} = \beta\hat{\mathbf n}, where \hat{\mathbf n} is a unit vector. Conversely, the relative velocity of \mathbf v with respect to \mathbf u is \mathbf v_\textrm{rel} = \gamma^{-1}\mathbf v - \mathbf u. This also has length \beta, but lies in \mathbf u’s 3-space. However, unlike the Newtonian case, \mathbf u_\textrm{rel} \ne -\mathbf v_\textrm{rel}, unless \mathbf u = \mathbf v. See Tsamparlis (§6.4) for discussion.
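As a sanity check on these properties, here is a minimal Python sketch in Minkowski spacetime. The two boosts (0.5 along x for \mathbf v, 0.6 along y for \mathbf u) and the helper names are arbitrary illustrative choices, not anything taken from Tsamparlis.

```python
import math

# Minkowski metric, signature -+++, in inertial coordinates (t, x, y, z).
ETA = (-1.0, 1.0, 1.0, 1.0)

def dot(a, b):
    """Metric inner product of two 4-vectors given by components."""
    return sum(g * x * y for g, x, y in zip(ETA, a, b))

def four_velocity(bx, by, bz):
    """4-velocity with coordinate 3-velocity (bx, by, bz), in units c = 1."""
    gamma = 1.0 / math.sqrt(1.0 - (bx * bx + by * by + bz * bz))
    return (gamma, gamma * bx, gamma * by, gamma * bz)

def relative_velocity(u, v):
    """Relative velocity of u as measured by v: u_rel = u / gamma - v."""
    gamma = -dot(u, v)
    return tuple(ui / gamma - vi for ui, vi in zip(u, v)), gamma

# Illustrative values (assumptions): v is boosted along x, u along y.
v = four_velocity(0.5, 0.0, 0.0)
u = four_velocity(0.0, 0.6, 0.0)

u_rel, gamma = relative_velocity(u, v)
v_rel, _ = relative_velocity(v, u)
beta = math.sqrt(1.0 - 1.0 / gamma**2)

print("gamma          =", gamma)
print("v . u_rel      =", dot(v, u_rel))   # vanishes up to rounding
print("|u_rel| - beta =", math.sqrt(dot(u_rel, u_rel)) - beta)
print("|v_rel| - beta =", math.sqrt(dot(v_rel, v_rel)) - beta)
print("u_rel + v_rel  =", tuple(a + b for a, b in zip(u_rel, v_rel)))
```

The last line prints a nonzero vector, which illustrates that \mathbf u_\textrm{rel} and -\mathbf v_\textrm{rel} differ even though both have length \beta.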

All the vector formulae above transfer unchanged to curved spacetime, for 4-velocities at the same event, including worldlines with acceleration. This can be justified using local inertial coordinates. While the formulae do appear in the literature, with one example being Bini 2014  §6, the topic of observer measurements in general is not widely promoted. I recall two separate conversations with senior relativists who were unfamiliar with use of the Lorentz factor in a curved spacetime context.

For comparison, one quantity which should not be naively ported across from special relativity is acceleration. In curved spacetime, the 4-acceleration \nabla_{\mathbf u}\mathbf u requires the covariant derivative, which depends on curvature (and possibly other choices). The reader curious about relative acceleration could try Jantzen, Carini & Bini: their 1995 paper, and an unfinished book last updated in 2013.

Derivative as contraction of a 1-form and vector

Difficulty level:   ★ ★ ★

Suppose you seek the derivative of a quantity along a curve, such as the rate of change of a scalar by proper time along a worldline: d\Phi/d\tau, or perhaps the rate of change of pressure by proper distance along a given spatial direction: dp/ds. These derivatives are conveniently expressed as a contraction between a 1-form (the gradient of the scalar) and a tangent vector to the curve. For the first example,

    \[\frac{d\Phi}{d\tau} = \frac{\partial\Phi}{\partial x^\mu} \frac{dx^\mu}{d\tau} = (d\Phi)_\mu u^\mu = d\Phi(\mathbf u),\]

where (x^\mu) is some coordinate system, d\Phi is the 1-form with components (d\Phi)_\mu = \Phi_{,\mu} = \partial\Phi/\partial x^\mu, and u^\mu = dx^\mu/d\tau is the 4-velocity. d\Phi(\mathbf u) is the contraction of the vector and 1-form, yielding a scalar. Schutz 2009  §3.3 gives this derivation.

A spacelike path can be parametrised by proper distance. Then d\Phi/ds = d\Phi(\boldsymbol\xi), where \xi^\mu := dx^\mu/ds is the unit tangent vector. An example of a paper which uses this is Gibbons 1972, for the change in stress along a rigid cable: see the line after his Equation 4.

For a null path there is no natural parameter, at least not without additional context. But for any chosen parameter \lambda, we have d\Phi/d\lambda = d\Phi(\boldsymbol\xi) as before, where \xi^\mu := dx^\mu/d\lambda is the tangent vector. Of course this applies to the other cases as well. Note all these calculations occur within a single tangent space.
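To see the contraction formula at work numerically, the following Python sketch compares d\Phi(\mathbf u) with a finite-difference derivative along a worldline. The scalar field, the inertial worldline in 1+1 Minkowski spacetime, and the step size are all hypothetical choices made only for illustration.

```python
import math

def phi(t, x):
    """A hypothetical scalar field, chosen only for illustration."""
    return math.sin(x) * math.exp(-0.1 * t)

def grad_phi(t, x):
    """Components (dPhi)_mu = (dPhi/dt, dPhi/dx), computed by hand."""
    return (-0.1 * phi(t, x), math.cos(x) * math.exp(-0.1 * t))

beta = 0.6
gamma = 1.0 / math.sqrt(1.0 - beta**2)
u = (gamma, gamma * beta)              # u^mu = dx^mu/dtau

def worldline(tau):
    """Inertial worldline through the origin, parametrised by proper time."""
    return (gamma * tau, gamma * beta * tau)

tau0, h = 2.0, 1e-6
t0, x0 = worldline(tau0)

# Contraction of the 1-form dPhi with the tangent vector u.
contraction = sum(w * c for w, c in zip(grad_phi(t0, x0), u))

# Direct derivative of Phi along the curve, by central differences.
finite_diff = (phi(*worldline(tau0 + h)) - phi(*worldline(tau0 - h))) / (2 * h)

print(contraction, finite_diff)
```

The two printed numbers should agree up to finite-difference error. The same check works for a spacelike or null curve by swapping in a different parameter and tangent vector.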

Kinematic decomposition: expansion + shear + vorticity

Difficulty level:   ★ ★ ★
Worldlines which collectively exhibit expansion, shear, and vorticity.

Suppose you know the motion of some particles / fluid / observers over time, as in the diagram. At each point the gradient of the motion can be decomposed into: expansion, shear, and vorticity. This is known as the kinematic decomposition, and is a very important tool in relativity.

Write \mathbf u for the 4-velocity field, then lower its index and take the covariant derivative: \nabla\mathbf u^\flat (that’s a “flat” symbol not the letter b), which is u_{a;b} or \nabla_b u_a in coordinate notation. This (0,2)-tensor is the gradient of the motion. Now apply the spatial projection tensor h_{ab} := g_{ab}+u_a u_b to get the purely spatial part (\mathbf B say) of the velocity gradient, meaning the part orthogonal to \mathbf u:

    \[B_{ab} := h^c_{\hphantom c a} h^d_{\hphantom d b} u_{c;d} = u_{a;b} + \dot u_a u_b.\]

Here \dot u_a is the (dual) 4-acceleration \nabla_{\mathbf u}\mathbf u^\flat, or u_{a;b}u^b in coordinate notation. The latter identity displayed above follows from substituting \mathbf u into the second slot of \nabla\mathbf u^\flat: evaluate u_{c;d}u^d, which you should recognise as \dot u_c. The remaining terms vanish because u^c u_{c;d} = 0, which follows from differentiating the normalisation u^c u_c = -1. Now the symmetric part of \mathbf B is the expansion tensor \theta_{ab} = \frac{1}{2}(B_{ab}+B_{ba}) =: B_{(ab)}, and the antisymmetric part is the vorticity tensor \omega_{ab} = \frac{1}{2}(B_{ab}-B_{ba}) =: B_{[ab]}. (Note some use the opposite sign convention for \omega_{ab}.) The expansion tensor itself splits into “trace” and “trace-free” parts: \theta_{ab} = \frac{1}{3}\theta h_{ab} + \sigma_{ab}. Here \theta = g^{ab}\theta_{ab} = u^a_{\hphantom{a};a} is the expansion scalar; it is the trace of the expansion tensor, and the divergence of the 4-velocity field. \sigma_{ab} is the shear tensor. There are alternative formulae but this approach, which follows Ellis 1971, seems most efficient for computer algebra. In summary, the kinematic decomposition is:

    \[u_{a;b} = \frac{1}{3}\theta h_{ab}+\sigma_{ab}+\omega_{ab}-\dot u_a u_b.\]

So what is the physical meaning of the quantities? Expansion means the particles move apart over time, or contract in the case of negative expansion. More precisely it is the proportional expansion per unit time. (In this article all quantities are understood as measured in the fluid’s frame, in particular “time” means the proper time along the worldline(s).) The expansion scalar gives the proportional change in volume over time: \theta = V^{-1}\,dV/d\tau. A familiar example is the Lemaître-Hubble parameter H = \theta/3, but in an arbitrary context expansion is both position and direction-dependent. Shear (by itself) involves expansion in some directions but contraction in others. Again, this is a proportional change over time. Shear by itself does not change the volume. The eigenvectors of the shear tensor are the principal axes of shear, and since \sigma_{ab} is real and symmetric one can find an orthogonal basis of eigenvectors. Some potentially misleading language is that the expansion tensor also includes the shear; one can emphasise the isotropic (component of) expansion to distinguish \frac{1}{3}\theta h_{ab} specifically. Finally, vorticity is microscopic rotation, known as curl in 3 dimensions. This can be distinct from macroscopic rotation, as another website nicely visualises. Vorticity by itself is rigid, so does not change lengths or volume.

Define also the shear and vorticity scalars \sigma^2 = \frac{1}{2}\sigma_{ab}\sigma^{ab} and \omega^2 = \frac{1}{2}\omega_{ab}\omega^{ab}. These are positive-definite measures: \sigma \ge 0 with \sigma = 0 if and only if \sigma_{ab} = 0, and similarly for \omega. There is also a vorticity vector \omega^a which is the axis of local rotation. Much more could be said. There are formulae giving the rates of change of relative distance and direction from a given vantage point, see Ehlers  for instance. There are elegant formulae using the exterior derivative, see e.g. Jantzen, Carini & Bini  2013 draft, §2.2.3. Note we have only described the kinematics and said nothing of its causes, in particular the Einstein field equations are not assumed.
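The algebra of the decomposition can be sketched in a few lines of Python. This assumes we are working in a local orthonormal frame comoving with \mathbf u, so that h_{ab} reduces to the 3×3 identity and \mathbf B has only spatial components. The sample matrix is made up for illustration, and the function names are hypothetical.

```python
def decompose(B):
    """Split a 3x3 spatial velocity gradient B[i][j] into
    (theta, sigma, omega), assuming h_ij = delta_ij."""
    n = range(3)
    theta_ij = [[0.5 * (B[i][j] + B[j][i]) for j in n] for i in n]  # B_(ab)
    omega = [[0.5 * (B[i][j] - B[j][i]) for j in n] for i in n]     # B_[ab]
    theta = sum(theta_ij[i][i] for i in n)                          # trace
    sigma = [[theta_ij[i][j] - (theta / 3.0 if i == j else 0.0)
              for j in n] for i in n]                               # trace-free
    return theta, sigma, omega

def scalar(T):
    """Square root of (1/2) T_ab T^ab, for the shear or vorticity tensor."""
    return (0.5 * sum(T[i][j] ** 2 for i in range(3) for j in range(3))) ** 0.5

# Made-up velocity gradient, for illustration only.
B = [[0.3, 0.2, 0.0],
     [-0.1, 0.1, 0.0],
     [0.0, 0.0, -0.1]]

theta, sigma, omega = decompose(B)
print("expansion scalar theta =", theta)
print("shear scalar sigma     =", scalar(sigma))
print("vorticity scalar omega =", scalar(omega))

# Reassemble B = (theta/3) h + sigma + omega as a consistency check.
rebuilt = [[(theta / 3.0 if i == j else 0.0) + sigma[i][j] + omega[i][j]
            for j in range(3)] for i in range(3)]
print("max reconstruction error =",
      max(abs(rebuilt[i][j] - B[i][j]) for i in range(3) for j in range(3)))
```

In a general coordinate basis one would instead contract with h_{ab} and g^{ab} explicitly. The orthonormal-frame shortcut here only illustrates the symmetric, antisymmetric and trace splitting.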

As for literature, the best textbook presentations I am aware of are Ellis, Maartens & MacCallum 2012  §4.6; and Poisson 2004  §2.3. Two classic papers are Ehlers 1961  §2.1 and Ellis 1971  §2. Translators of Ehlers (1993 ) described it as an “outstanding review paper” and that “[d]espite its age, it remains one of the best reviews available in this area.” Ellis was republished in 2009 , along with an editorial note  which reviews applications, and states “[f]ew papers in relativistic cosmology have been as influential and as frequently cited”, despite being “primarily a synthesis… of earlier results”. Newtonian fluid dynamics has a similar decomposition of the velocity gradient \partial_i v_j, see perhaps Ellis or Poisson §2.2. Wainwright & Ellis, eds., 1997  is one source which gives further applications. I have assumed timelike worldlines, but Poisson §2.4, 2.6 treats the null case, for which only some of the kinematic quantities remain unambiguous. Everything I have said here assumes the fluid / particles frame of reference, but Larena & coauthors 2011  §2.1 investigate other frames; see also Jantzen+ §2.4.8, 3.3.5.

Cosmic Cable poster

Difficulty level:   ★ ★ ★

Here is my recently completed “Cosmic Cable” poster. You can also download a PDF version. Its first appearance is at the GR22 conference in Valencia, Spain, this week.

This work investigates the mechanics of a rigid 1-dimensional object in an arbitrary static, spherically symmetric spacetime. Others have applied such ropes / cables / strings to thermodynamics of black holes (for example scooping up Hawking radiation in a box), or to harvest energy in an expanding universe. It is an interesting exercise in topics which don’t receive a lot of attention, such as extended rigid objects in relativity.

I review the case of static cables, which show a fascinating “redshift” of force effect. I then generalise to a simple case of moving cables, solve for the kinematics, tension, and also the power that can be generated (loosely, this is from a loss of gravitational potential). In case this stuff sounds simple, it is not at all obvious, indeed many papers fail already at the kinematics step. The frame dependence of the quantities is a conceptual challenge.

If you’d like more details I have a proceedings (forthcoming) from the Marcel Grossmann conference of 2018. I will expand this into a longer paper. I would also like to write a pedagogical paper explaining Gibbons (1972)  — his 1-page paper is beautifully concise, yet is hard to understand, has an error (as others have pointed out), and lots of typos.

Difference between special and general relativity?

Difficulty level:   ★ ★ ★

What is the distinction between special relativity (SR) and general relativity (GR)?

It is sometimes said SR can only handle inertial frames, but enough commentators call this a misconception that I must go along with them. A pedagogical paper  on the arXiv today is one example. Also Carroll (2004, §1.2)  writes,

The notion of acceleration in special relativity has a bad reputation, for no good reason. Of course we were careful, in setting up inertial coordinates, to make sure that particles at rest in such coordinates are unaccelerated. However, once we’ve set up such coordinates, we are free to consider any sort of trajectories for physical particles, whether accelerated or not.

This seems a good definition to me: SR is the use of Minkowski coordinates in Minkowski spacetime. You can describe acceleration, but only from within an inertial frame. For example the classic SR textbook Taylor & Wheeler (1992, §2.4)  states, “special relativity is limited to free-float frames”. But from within such frames, they do analyse accelerating particles, see e.g. §3.2. Similarly Misner, Thorne & Wheeler (1973)  even title their section §6.1, “Accelerated observers can be analyzed using special relativity”.

Another definition could be: SR is what you learn in an SR course. In high school I learned about the Lorentz factor, Lorentz transformations, length-contraction, time-dilation, and composition of boosts in the same spatial direction. Undergraduate SR courses presumably have more content, but the term “SR” would still exclude more advanced material, such as Christoffel symbols perhaps, under this definition.

However some textbooks disagree. Misner, Thorne & Wheeler have a solid presentation of 1-forms (§2), and include Fermi-Walker transported tetrads (§6), both in an “SR” context. Gourgoulhon (2013)  takes it to a whole other level, including self-described “rather advanced topics”. He allows not only arbitrary coordinates but non-coordinate bases (§15.4.3), after all the textbook is titled, “Special relativity in general frames”. Gourgoulhon discusses the stress-energy tensor (§19), relativistic hydrodynamics (§21), and even gravitation via historical scalar field theories on flat spacetime (§22). (Of course the stress-energy tensor doesn’t couple to spacetime curvature in this context, so the Einstein field equations are not satisfied.) Personally I would call all this “Minkowski spacetime” rather than “special relativity”! Then again, it could be a publisher’s decision for marketing purposes.

Finally, another definition of SR would be historical, limited to the scope of early papers including Einstein (1905)  and by Minkowski.

In conclusion, I am happy with the definition of SR as Minkowski spacetime using only global inertial frames. Minkowski coordinates would certainly be included, with the metric -dt^2+dx^2+dy^2+dz^2, and even simple alternatives such as spherical coordinates -dt^2+dr^2+r^2(d\theta^2+\sin^2\theta\,d\phi^2), so long as covariant derivatives are not required for a given context. Another time I will discuss the application of SR results in global inertial frames to local orthonormal frames in GR.

Forgotten mechanics from the 1800s

Difficulty level:   ★ ★ ★

Are there historical areas of physics we have forgotten about? I have been reading a little of Klein & Sommerfeld’s The theory of the top (volume one, 1897). The spinning top might seem just a cute problem. However their work forms a detailed 4-volume set, which took over a decade to complete, and Felix Klein was a leading mathematician. As the translators of a recent English edition (Nagem & Sandri, 2008 ) point out, the book contains one of the earliest occurrences of spinors, applied to the instantaneous position of the top (see #31 of their Translator’s Notes). Also I can’t help but share a quote from Herschel (1851 ), who found a spinning top the best demonstration of the precession of Earth’s rotational axis. This child’s toy:

…becomes an elegant philosophical instrument, and exhibits, in the most beautiful manner, the whole phenomenon.

Nagem & Sandri comment, in #57 of their Translator’s Notes:

It was a great surprise for the translators to find that so many prominent nineteenth-century mathematicians devoted their attention to the statics of rigid bodies, and developed it to such an extent. The subject is now neglected entirely in physics and mathematics, and covered only superficially in engineering curricula.

Two sources they mention are Möbius’ Lehrbuch der statik (“Statics textbook”, 1837 ) which analyses forces on rigid bodies, and Ball’s The theory of screws (1876 ).

You could emphasise that modern physics (relativity and quantum) is revolutionary, and of course that is true. But I prefer to emphasise continuity with earlier physics. In the present we make constant usage of Lagrangian and Hamiltonian mechanics, even though these are approximately 200 years old. The enduring relevance of Newtonian mechanics should need no introduction. Also I was a little amused to see Archimedes’ principle mentioned in a modern quantum + relativistic context: Unruh & Wald (1982)  discuss lowering a box, which contains thermal radiation, on a rope towards a black hole horizon:

The energy delivered to the black hole is minimized when the box is dropped from its “equilibrium point,” i.e., when the tension in the rope is zero. By the Archimedes principle… this occurs when the energy of the box equals the energy of the displaced acceleration radiation.

This buoyancy effect is due to Hawking radiation. This finally resolved a paradox by Bekenstein (1972) , who proposed using a black hole to seemingly convert energy into work with 100% efficiency, which would violate the 2nd law of thermodynamics. For a historical overview, see Israel (1987, §7.10).

So, good work Archimedes! Two millennia and going strong. But which research have we forgotten today?

Hypersurfaces of constant proper time

Difficulty level:   ★ ★ ★

Suppose a given spacetime has a region filled with worldlines (a timelike congruence), and a foliation defined by hypersurfaces of constant proper time along these worldlines. As a boundary condition, all times can be set to zero on a given initial hypersurface. The question is, will the proper time hypersurfaces remain spacelike? I investigate this for two straightforward examples: static observers in Schwarzschild spacetime, and the rotating disc in Minkowski spacetime.

George Ellis mentions the possibility of the hypersurfaces becoming timelike, in a 2014 paper  on his “evolving block universe” interpretation. The context is cosmology, and the worldlines are (in principle) some coarse-grained flow of matter:

The flow lines are not necessarily orthogonal to the surfaces of constant time. This does not matter: no physical phenomena are directly determined by simultaneity in the usual sense. More than that, the surfaces determined in this way are not even necessarily space-like in an inhomogeneous spacetime. In that case, the implied initial value problem will locally be time-like, and the way it works will need to be rethought.

Perhaps the possibility of proper time hypersurfaces becoming timelike has not been investigated in detail. Presumably Ellis’ superb earlier publication Relativistic Cosmology (2012), coauthored with Maartens and MacCallum, would not discuss it either.

Recall a congruence is proper time synchronisable if and only if it is geodesic and vorticity-free, by Frobenius’ theorem. If so, the gradient of a proper time coordinate T say, is given by the dual vector to the 4-velocity:

    \[dT := -\mathbf u^\flat\]

where the minus sign compensates for the metric signature choice -+++. Now consider the congruence of static observers in Schwarzschild spacetime. These have 4-velocity parallel to the Killing vector field which is timelike at infinity, hence are defined for all r>2M. The dual-velocity is

    \[-\mathbf u^\flat = \sqrt{1-\frac{2M}{r}}\,dt\]

in terms of the Schwarzschild t-coordinate. This is clearly not integrable, as expected because the static observers are accelerating. But it still suggests the proper time coordinate T := t\sqrt{1-2M/r}. Then the gradient covector is

    \[dT = \sqrt{1-\frac{2M}{r}}\,dt + \frac{Mt}{r^2}\Big(1-\frac{2M}{r}\Big)^{-1/2}dr\]

In general this is not orthogonal to the worldlines, and one interpretation is a non-standard simultaneity convention, as discussed in my forthcoming paper  “Time, black holes, and infinity”. However it is still proper time, because dT(\mathbf u) = 1. Using the inverse metric,

    \[dT\cdot dT = -1+\frac{M^2t^2}{r^4}\]

so dT is timelike for |t|/M < (r/M)^2, and since dT is a normal to the hypersurfaces they are spacelike in the same region. The figure below shows three examples on a Penrose diagram. The hypersurfaces are spacelike for sufficiently large r, but become null at |t|/M = (r/M)^2 which is the dotted red line in the diagram below.

Hypersurfaces of constant proper time for static observers in Schwarzschild spacetime

This makes sense intuitively. Near the horizon, the static observers are heavily gravitationally time-dilated, so a proper time of e.g. T = 1 occurs well into the “future”. This is seen from the curves bending upwards in the diagram for t>0, and bending downwards for t<0, near the horizon. The claim of being in the “future” has some dependence on one’s choice of simultaneity convention; however, once the red line is crossed it is an unambiguous statement because the T = \textrm{const} events are timelike separated. Incidentally T=0 at t=0, but this is just an initial condition, and in general one could define T := t\sqrt{1-2M/r} + h(r,\theta,\phi) for any function h, which is also proper time along the worldlines.
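The Schwarzschild calculation above can be checked numerically with a short Python sketch. It uses the components of dT and the inverse metric g^{tt} = -(1-2M/r)^{-1}, g^{rr} = 1-2M/r. The sample points, M = 1, and the tolerance used to label a value as null are arbitrary assumptions.

```python
import math

def static_observer_check(t, r, M=1.0):
    """Return (dT(u), dT.dT, closed form) for T = t*sqrt(1 - 2M/r), r > 2M."""
    f = 1.0 - 2.0 * M / r
    dT_t = math.sqrt(f)                      # coefficient of dt
    dT_r = M * t / (r * r * math.sqrt(f))    # coefficient of dr
    u_t = 1.0 / math.sqrt(f)                 # static observer: u = (u^t, 0, 0, 0)
    rate = dT_t * u_t                        # dT(u), expected to equal 1
    norm = -dT_t**2 / f + f * dT_r**2        # uses g^tt = -1/f, g^rr = f
    closed = -1.0 + (M * t)**2 / r**4        # the closed form quoted in the text
    return rate, norm, closed

# With M = 1 and r = 3 the gradient should turn null at |t| = r^2/M = 9.
for t, r in [(0.0, 3.0), (5.0, 3.0), (9.0, 3.0), (20.0, 3.0)]:
    rate, norm, closed = static_observer_check(t, r)
    kind = ("timelike" if norm < -1e-12
            else "null" if abs(norm) <= 1e-12 else "spacelike")
    print(f"t={t:5.1f} r={r}: dT(u)={rate:.3f}, dT.dT={norm:+.4f} "
          f"(closed form {closed:+.4f}), gradient is {kind}")
```

Remember the labels refer to the gradient dT: where the gradient is timelike the hypersurface is spacelike, and vice versa.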

Now consider a rigidly rotating disc in 2+1-dimensional Minkowski spacetime. Using polar coordinates (t,r,\phi), the “4”-velocity of each particle on the disc is

    \[u^\mu = \Big(\frac{1}{\sqrt{1-r^2\Omega^2}},0,\frac{\Omega}{\sqrt{1-r^2\Omega^2}}\Big)\]

where \Omega := d\phi/dt \ge 0 parametrises rotation speed. The previous procedure of deriving a time coordinate wasn’t fully general. Here we expect a proper time coordinate to depend on r and t but not \phi. The proper time runs more slowly (compared to t) towards the edge of the disc; note the disc is bounded by r < 1/\Omega for timelike motion. Also it is well known there is a “time-lag” when trying to define simultaneity around a circle r=\textrm{const}. However one can use non-standard simultaneity (i.e. constant “time” hypersurfaces not orthogonal to the worldlines) to avoid this problem: see Relativity in Rotating Frames (2004), particularly the chapter by Rizzi & Serafini.

Based on u^t \equiv dt/d\tau above, define

    \[\bar t := t\sqrt{1-r^2\Omega^2}\]

This deliberately avoids any angular dependence. The gradient is

    \[d\bar t = \sqrt{1-r^2\Omega^2}\,dt -\frac{\Omega^2 tr}{\sqrt{1-r^2\Omega^2}}\,dr\]

One can check d\bar t(\mathbf u) = 1, so this is a proper time coordinate. From the above expression one can quantify an implied non-standard simultaneity convention, but I will avoid this here. The hypersurfaces turn null at |t| = (1-r^2\Omega^2)/r\Omega^2.

The spacetime diagram below represents hypersurfaces of constant proper time \bar t, and is independent of rotation rate due to scaling of the coordinates. The dotted red line is where the hypersurfaces are null; to the left of it they are spacelike.

Hypersurfaces of constant proper time for particles on a rotating disc in Minkowski spacetime

As \Omega r\rightarrow 1 the hypersurfaces quickly turn null. For small \Omega r, they turn null at \Omega|t|\approx 1/r\Omega. Thus for small rotation / acceleration, the desynchronisation is slow but cumulative.
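For the rotating disc, here is a rough Python sketch that checks d\bar t(\mathbf u) = 1, confirms the gradient is null at the stated time, and compares that time with the small-\Omega r approximation. The radii and the choice \Omega = 1 are illustrative assumptions, and the function names are hypothetical.

```python
import math

def disc_check(t, r, Omega):
    """Return (dtbar(u), dtbar.dtbar) for tbar = t*sqrt(1 - r^2 Omega^2)."""
    s = math.sqrt(1.0 - (r * Omega)**2)
    u = (1.0 / s, 0.0, Omega / s)                # (u^t, u^r, u^phi)
    dtbar = (s, -Omega**2 * t * r / s, 0.0)      # coefficients of (dt, dr, dphi)
    rate = sum(w * c for w, c in zip(dtbar, u))
    # Inverse metric in polar coordinates: g^tt = -1, g^rr = 1, g^phiphi = 1/r^2.
    norm = -dtbar[0]**2 + dtbar[1]**2 + dtbar[2]**2 / r**2
    return rate, norm

def t_null(r, Omega):
    """Coordinate time |t| at which the hypersurfaces turn null."""
    return (1.0 - (r * Omega)**2) / (r * Omega**2)

Omega = 1.0
for r in (0.01, 0.1, 0.5, 0.9):
    exact = t_null(r, Omega)
    approx = 1.0 / (r * Omega**2)                # small Omega*r limit
    rate, norm = disc_check(exact, r, Omega)
    print(f"r={r}: dtbar(u)={rate:.3f}, norm at t_null={norm:+.1e}, "
          f"t_null={exact:.4g}, approximation={approx:.4g}")
```

The approximation tracks the exact value closely for small \Omega r and overestimates it badly near the rim, consistent with the hypersurfaces turning null quickly there.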

The disc particles are accelerated, so for variety let’s choose an example with vorticity but no acceleration. Take Schwarzschild spacetime, with circular orbits on the coordinate equator \theta = \pi/2. These are valid anywhere outside the photon sphere at r = 3M, not merely outside the ISCO at r = 6M. The 4-velocity is:

    \[u^\mu = \Bigg( \frac{1}{\sqrt{1-\frac{3M}{r}}},0,0,\frac{\sqrt{\frac{M}{r^3}}}{\sqrt{1-\frac{3M}{r}}} \Bigg)\]

in Schwarzschild coordinates, which suggests a new coordinate \tilde t := t\sqrt{1-3M/r}. Its gradient is null at

    \[\frac{|t|}{M} = \frac{2}{3}\Big(\frac{r}{M}\Big)^2 \frac{1-\frac{3M}{r}}{1-\frac{2M}{r}}\]

which occurs instantly in the limit r\rightarrow 3M, and slowly for r\gg 3M.
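The circular-orbit formula can be verified in the same spirit. The Python sketch below checks that the 4-velocity is normalised, that d\tilde t(\mathbf u) = 1, and that the gradient of \tilde t is null at the stated |t|. The value M = 1 and the sample radii are assumptions for illustration.

```python
import math

def orbit_check(t, r, M=1.0):
    """Return (u.u, dttilde(u), dttilde.dttilde) for a circular equatorial
    orbit at radius r > 3M, with ttilde = t*sqrt(1 - 3M/r)."""
    f = 1.0 - 2.0 * M / r
    k = math.sqrt(1.0 - 3.0 * M / r)
    u_t = 1.0 / k
    u_phi = math.sqrt(M / r**3) / k
    u_norm = -f * u_t**2 + r**2 * u_phi**2   # equator: g_tt = -f, g_phiphi = r^2
    d_t = k                                  # coefficient of dt in d(ttilde)
    d_r = 1.5 * M * t / (r * r * k)          # coefficient of dr in d(ttilde)
    rate = d_t * u_t
    grad_norm = -d_t**2 / f + f * d_r**2     # g^tt = -1/f, g^rr = f
    return u_norm, rate, grad_norm

def t_null(r, M=1.0):
    """|t| at which the gradient of ttilde becomes null."""
    return (2.0 / 3.0) * (r * r / M) * (1.0 - 3.0 * M / r) / (1.0 - 2.0 * M / r)

for r in (3.01, 4.0, 6.0, 100.0):
    tn = t_null(r)
    u_norm, rate, grad_norm = orbit_check(tn, r)
    print(f"r={r}: u.u={u_norm:+.6f}, dttilde(u)={rate:.6f}, "
          f"t_null/M={tn:.4g}, norm there={grad_norm:+.1e}")
```

The printed values of t_null shrink towards zero as r approaches 3M and grow roughly like r^2 at large radius, matching the limits described above.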

In conclusion, proper time hypersurfaces can become timelike:

  • quickly, for high acceleration of the worldlines
  • quickly, for high vorticity of the worldlines
  • slowly, for mild but sustained acceleration or vorticity, a cumulative effect

This investigation was sparked by a lunchtime conversation with Pierre Mourier and Prof. David Wiltshire today, at the University of Canterbury in Christchurch, New Zealand. My forthcoming research paper “Time, black holes, and infinity” will discuss simultaneity in Schwarzschild spacetime.

Update (next day): Mourier clarifies that proper time hypersurfaces have been studied, but often in the zero-vorticity case. Hence they remain orthogonal to the worldlines (in a cosmological context it is taken for granted that the worldlines are geodesics). So the issue of turning timelike does not come up. See perhaps §6.6.1 of Relativistic Cosmology as cited above, or look up “synchronous coordinates”. Mourier has also looked at rotation in Minkowski spacetime, different from the rigid rotation example above, and found similar effects. Wiltshire comments he and collaborators have looked at rotation effects in Lemaître-Tolman-Bondi models and van Stockum dust.

Duality of a basis is not duality of individual vectors

Difficulty level:   ★ ★ ★

Suppose we have a coordinate system (x^\alpha). This defines a coordinate basis (\partial_\alpha), where for each \alpha the basis vector has components

    \[(\partial_\alpha)^\mu = \delta_\alpha^\mu\]

in these coordinates. We also have the coordinate dual basis (dx^\beta) where each dual vector or “1-form” has components

    \[(dx^\beta)_\mu = \delta^\beta_\mu.\]

Now while these bases are dual in the sense of bases:

    \[dx^\beta(\partial_\alpha) = \delta^\beta_\alpha\]

(by definition of dual basis), the individual vectors are not dual to the individual 1-forms in the sense of individual vectors. That is, for any given \alpha, we have \partial_\alpha and dx^\alpha are not dual in general.

Instead, recall indices are raised and lowered using the metric components (in a coordinate basis, that is). Possibly the result could be seen by inspection, but for clarity let’s write \mathbf e := \partial_\alpha for some chosen \alpha. This vector has components e^\nu = \delta^\nu_\alpha, hence the corresponding 1-form has components e_\mu = g_{\mu\nu}e^\nu = g_{\mu\nu}\delta^\nu_\alpha = g_{\mu\alpha}. By the meaning of components this says the 1-form is g_{\mu\alpha}dx^\mu. This is not dx^\alpha, in general! In “musical isomorphism” notation, the result is:

    \[(\partial_\alpha)^\flat = g_{\mu\alpha}dx^\mu\]


    \[(dx^\alpha)^\sharp = g^{\mu\alpha}\partial_\mu.\]

To show the result another way, recall the metric defines the dual to our vector \partial_\alpha to be \mathbf g(\partial_\alpha,\cdot). To examine this 1-form, feed it a vector (specifically, basis vectors \partial_\mu) and see how it acts on it:

    \[(\partial_\alpha)^\flat(\partial_\mu) := g(\partial_\alpha,\partial_\mu) = g_{\alpha\mu},\]

which says (\partial_\alpha)^\flat = g_{\alpha\mu}dx^\mu, as before.
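A concrete illustration may help. The Python sketch below lowers the index on coordinate basis vectors for two familiar metrics and compares the result with the dual basis 1-forms. The helper names, the sample point r = 2, and the choice of example metrics are assumptions for illustration only.

```python
def flat_of_basis_vector(g, alpha):
    """Components of (partial_alpha)^flat, namely g_{mu alpha}."""
    return [g[mu][alpha] for mu in range(len(g))]

def dual_basis_form(n, alpha):
    """Components of dx^alpha, namely delta^alpha_mu."""
    return [1.0 if mu == alpha else 0.0 for mu in range(n)]

# Example 1: Euclidean plane in polar coordinates (r, phi), at r = 2.
r = 2.0
g_polar = [[1.0, 0.0],
           [0.0, r * r]]

# Example 2: 1+1 Minkowski in null coordinates (p, q), with ds^2 = -dp dq.
g_null = [[0.0, -0.5],
          [-0.5, 0.0]]

for name, g in (("polar", g_polar), ("null", g_null)):
    for alpha in range(2):
        lowered = flat_of_basis_vector(g, alpha)
        dual = dual_basis_form(2, alpha)
        verdict = "equal" if lowered == dual else "different"
        print(f"{name}, alpha={alpha}: flat={lowered}, dx^alpha={dual} -> {verdict}")
```

In polar coordinates only \partial_r happens to match dr, since g_{rr} = 1 and the metric is diagonal. In the null coordinates each lowered basis vector is proportional to the other coordinate differential.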

In closing, another reason we cannot have (\partial_\alpha)^\flat = dx^\alpha in general is that the coordinate basis vector \partial_\alpha is not defined in terms of x^\alpha alone, but also all the other coordinates chosen. More on that next.

(Schutz (2009, §3.3, §3.5) makes a superb background to this discussion, and while the cited sections are for special relativity, in this case you can simply replace the Minkowski metric \boldsymbol\eta with an arbitrary curved metric \mathbf g.)

Mimicking a black hole in flat spacetime

Difficulty level:   ★ ★ ★

For a Schwarzschild-Droste black hole, the curvature of 3-dimensional space is often depicted as a funnel shape (Flamm 1916 ). As I emphasise in forthcoming papers, this assumes the static slicing of spacetime, whereas other slicings yield different embedding diagrams. This leads to the question, could we slice flat spacetime in such a way that we get a similar funnel, or mimic other properties of a black hole? While this cannot of course change the fact the 4-dimensional spacetime is flat, the point is there is much flexibility in defining the 3-space, because it depends only on the chosen slicing or observers.

Spoiler: Yes you can! This is an embedding diagram for our “fake black hole”, representing an unusual spatial slice of Minkowski spacetime. This looks more like a spinning top than a funnel.

Let’s start with Minkowski spacetime in spherical coordinates:

    \[ds^2 = -dt^2 + dr^2 + r^2(d\theta^2+\sin^2\theta\,d\phi^2)\]

This defines an inertial frame. Now suppose spacetime is filled with test particles moving radially, relative to the coordinate origin. Take coordinate speed dr/d\tau = \pm\sqrt{2M/r}, by analogy with the Schwarzschild and even Newtonian cases (choose one sign and stick with it). The 4-velocity is then:

    \[u^\mu = \bigg(\sqrt{1+\frac{2M}{r}},\pm\sqrt\frac{2M}{r},0,0\bigg)\]

which follows from normalisation \mathbf u\cdot\mathbf u=-1. Next we define a new time coordinate. A natural first attempt is to try the proper time of the particles. This may be obtained via local Lorentz boosts, or equivalently by a neat trick of lowering the index on the 4-velocity vector then taking its negative:

    \[-\mathbf u^\flat = \sqrt{1+\frac{2M}{r}}dt \mp\sqrt\frac{2M}{r}dr\]

(I explain this approach in a forthcoming paper, but it is inspired by Martel & Poisson 2001  and ultimately based on Frobenius’ theorem: see the variant for 1-forms described in de Felice & Clarke §2.12.) Expressing the dual velocity this way, as an explicit sum of the coordinate dual basis vectors dt and dr, is suggestive of a total differential which we would hope is the proper time d\tau. Unfortunately the expression is not a total differential, as seen by examining the coefficient of dt. But from inspection we can use an integrating factor: divide through by \sqrt{1+2M/r}, simplify, and define the resulting expression as the differential of a new time coordinate T:

    \[dT := dt \mp\frac{1}{\sqrt{1+r/2M}}dr\]

(Incidentally, this easily integrates to T = t \mp 4M\sqrt{1+r/2M} plus a constant of integration.) While T is not the proper time, its level sets T=\textrm{const} coincide with the 3-space of the observers as shown next, which is sufficient for our embedding diagram. Since \mathbf u is by definition orthogonal to its local 3-space, the dual vector \mathbf u^\flat is also normal to this 3-space. But dT is parallel to \mathbf u^\flat, hence they are normal to the same 3-space, and any gradient dT is normal to the level sets T=\textrm{const}, which proves the claim.

This is analogous to static observers in Schwarzschild spacetime. While the Schwarzschild t-coordinate is not their proper time, setting t=\textrm{const} still determines the same 3-space as these observers. Also we cannot replace the t-coordinate with proper time while still retaining the coordinates r, \theta, and \phi. The derivative dT/d\tau = 1/\sqrt{1+2M/r} for our fake black hole is also reminiscent of static observers in Schwarzschild spacetime.

Rearrange the earlier expression for dT and substitute into the line element to obtain:

    \[ds^2 = -dT^2 \mp \frac{2}{\sqrt{1+r/2M}}dT\,dr + \bigg(1+\frac{2M}{r}\bigg)^{-1}dr^2\]

plus the 2-sphere metric r^2(d\theta^2+\sin^2\theta\,d\phi^2). These coordinates have no issue at r=2M, and while there is a coordinate singularity at r=0, the metric was degenerate there even in our initial spherical coordinates. The Riemann tensor is zero, as it must be since this is still flat spacetime. Since g^{TT}=-(1+2M/r)^{-1} the coordinate T is timelike everywhere. The 4-velocity in the new coordinates is u^\mu=(1/\sqrt{1+2M/r},\pm\sqrt{2M/r},0,0). Integrating dT/dr = u^T/u^r gives the travel time T(r), which is well behaved, unlike the Schwarzschild t(r), which diverges as r\to 2M. The radial proper distance for our test particle observers is (1+2M/r)^{-1/2}dr, which gets very small for r\ll 2M compared to the inertial frame which measures radial distance dr everywhere.
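
Two of these statements are easy to check numerically. The following is a minimal sketch in plain Python, not taken from any CAS session. The function names and the convention s=+1 for outgoing and s=-1 for ingoing particles are labels chosen for illustration. Only the (T,r) block of the metric is needed because the motion is radial.

```python
import math

def metric_Tr(r, M=1.0, s=-1):
    """(T, r) block of the metric; s=+1 outgoing, s=-1 ingoing (assumed labels)."""
    g_Tr = -s / math.sqrt(1 + r / (2 * M))   # half the coefficient of dT dr
    g_rr = 1 / (1 + 2 * M / r)
    return [[-1.0, g_Tr], [g_Tr, g_rr]]

def four_velocity(r, M=1.0, s=-1):
    """Components (u^T, u^r) in the new coordinates."""
    return [1 / math.sqrt(1 + 2 * M / r), s * math.sqrt(2 * M / r)]

def norm_squared(r, M=1.0, s=-1):
    """g_{mu nu} u^mu u^nu, which should be -1 for a 4-velocity."""
    g = metric_Tr(r, M, s)
    u = four_velocity(r, M, s)
    return sum(g[i][j] * u[i] * u[j] for i in range(2) for j in range(2))

def g_upper_TT(r, M=1.0, s=-1):
    """TT component of the inverse metric, from the 2x2 inverse formula."""
    (a, b), (_, c) = metric_Tr(r, M, s)
    det = a * c - b * b
    return c / det

# Sample radii inside, at, and outside r = 2M (with M = 1)
for r in (0.1, 2.0, 50.0):
    print(r, norm_squared(r), g_upper_TT(r), -1 / (1 + 2 / r))
```

The last two printed columns should agree, and nothing special happens at r=2M.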

A typical isometric embedding diagram for a spherically symmetric spacetime takes a slice of constant “time”, here T=\textrm{const}, through the equator \theta=\pi/2. This is matched isometrically with a surface z=z(r) in a 3-dimensional flat space. The flat space is taken to be Euclidean or Minkowski space, with the metric dr^2+r^2d\phi^2\pm dz^2 in cylindrical coordinates (the sign is unrelated to our previous sign choice). Our case requires the minus sign for Minkowski space since g_{rr}<1. It follows z = 4M\sqrt{1+r/2M}, which may be plotted in a scale-invariant way as z/M against r/M.
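
To spell out the matching step (a sketch of the standard procedure): on the slice T=\textrm{const}, \theta=\pi/2 the induced metric is (1+2M/r)^{-1}dr^2+r^2d\phi^2, while the surface z=z(r) in Minkowski space has induced metric (1-(dz/dr)^2)dr^2+r^2d\phi^2. Equating the coefficients of dr^2 gives

    \[\bigg(\frac{dz}{dr}\bigg)^2 = 1-\frac{1}{1+2M/r} = \frac{1}{1+r/2M}, \qquad z = \int\frac{dr}{\sqrt{1+r/2M}} = 4M\sqrt{1+\frac{r}{2M}}\]

up to an overall sign and an additive constant. This is the same integral that appeared when integrating dT.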

Embedding diagram for a fake black hole, an unusual choice of spatial slice in Minkowski spacetime
The same embedding diagram from a lower viewpoint and with further comments. The diagram is the same for both ingoing and outgoing particles / observers. This surface extends to the origin, unlike Flamm’s paraboloid for Schwarzschild space which corresponds to static observers and hence is only defined outside r=2M.

The particles must be accelerating, as their motion is not caused by gravity. In the new coordinates ingoing particles have 4-acceleration (0,-M/r^2,0,0), outgoing particles have a different expression, but both have magnitude a=(1+2M/r)^{-1/2}M/r^2. Again these expressions are reminiscent of static observers in Schwarzschild spacetime. Each particle has a “Rindler” horizon at distance 1/a as measured in the instantaneous comoving frame, so in the original inertial frame this is contracted by the Lorentz factor \gamma=\sqrt{1+2M/r} and occurs at position r\mp r^2/M (simultaneous in the instantaneous comoving frame).

The kinematic decomposition of the particle worldlines yields zero vorticity, which is fortunate because by Frobenius’ theorem this is the condition for the local 3-spaces to all patch together consistently. The expansion tensor, expressed in the frame of the particles (different frames for ingoing and outgoing), is \mp\sqrt{M/2r^3} in the radial direction, and \pm\sqrt{2M/r^3} in the tangential directions. The shear is twice this amount in the radial direction, and half this amount in the tangential directions.
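
As a consistency check on these components, here is a sketch working in the original inertial spherical coordinates, where \sqrt{-g}=r^2\sin\theta and only u^r contributes to the divergence. Writing \Theta for the expansion scalar (my notation, to avoid a clash with the angle \theta):

    \[\Theta = \nabla_\mu u^\mu = \frac{1}{r^2}\frac{\partial}{\partial r}\big(r^2u^r\big) = \pm\frac{1}{r^2}\frac{d}{dr}\sqrt{2Mr^3} = \pm 3\sqrt{\frac{M}{2r^3}}\]

This equals the sum of the radial component and the two tangential components, \mp\sqrt{M/2r^3}\pm 2\sqrt{2M/r^3}. Subtracting one third of this trace from each component reproduces the shear: \mp 2\sqrt{M/2r^3} radially and \pm\sqrt{M/2r^3} in each tangential direction, which sum to zero as they must for a trace-free tensor.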

In the new coordinates the lapse is \sqrt{1+2M/r} and the shift (\mp 2M/r\cdot\sqrt{1+r/2M},0,0). The extrinsic curvature (of the 3D spatial slices inside 4D Minkowski spacetime, not the 2D embedded slice) is \pm\sqrt{2M/r^3} times -dr^2/2(1+2M/r)+r^2(d\theta^2+\sin^2\theta\,d\phi^2). Taking the trace with the inverse of the spatial metric, whose radial component is 1+2M/r, gives K = \pm 3\sqrt{M/2r^3}, which agrees with the sum of the three expansion components above.
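
For readers wanting to see where the lapse and shift come from, here is a sketch using the standard 3+1 form of a metric, ds^2 = -N^2dT^2 + h_{ij}(dx^i+\beta^i dT)(dx^j+\beta^j dT). The symbols N, \beta^i and h_{ij} are my labels and do not appear above. Comparing with the line element gives h_{rr}=(1+2M/r)^{-1} and \beta_r = g_{Tr} = \mp(1+r/2M)^{-1/2}, hence

    \[\beta^r = h^{rr}\beta_r = \mp\frac{1+2M/r}{\sqrt{1+r/2M}} = \mp\frac{2M}{r}\sqrt{1+\frac{r}{2M}}, \qquad N^2 = \beta_r\beta^r - g_{TT} = \frac{2M}{r}+1\]

in agreement with the lapse and shift quoted.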

Finally, Flamm’s paraboloid is an iconic image, and I defend visualisations and metaphors in general as helpful and intuitive. But one should understand the limitations, in contrast to Painlevé 1921  for example who found a slicing of Schwarzschild spacetime into Euclidean 3-spaces \mathbb R^3, but drew some overly zealous conclusions from this (thanks to Andrew Hamilton for discussion on this point). Admittedly the static slicing in Schwarzschild spacetime is a natural choice, while my “fake black hole” slicing is contrived. But still, the reproduction of a funnel-shaped embedding in flat spacetime shows the need for caution in interpreting Flamm’s paraboloid as gravity.

How to convert between frame and coordinate bases

Difficulty level:   ★ ★ ★

This article describes how to transform components of vectors or other tensors between a coordinate basis and an arbitrary frame / tetrad. This process is more general than the transformation between two coordinate bases as found in any introductory general relativity course. Some frames are “non-holonomic”, meaning they do not arise from any set of coordinate basis vectors. There may also be situations in which a coordinate representation is inconvenient or not known. I also outline how to implement the transformations in a computer algebra system (CAS).

Effectively we only work in a single tangent space on the manifold, so it turns out to be just a linear algebra problem. My description is based on Carroll (§J) and de Felice & Clarke (§4.2), who assume the frame is orthonormal. However, I simply assume it is a basis: that it spans the tangent space and is linearly independent. So suppose we have coordinates x^\mu, and a frame (\mathbf e_a) with components e_a^{\hphantom a\mu}:=(e_a)^\mu in the coordinate system, that is:

    \[\mathbf e_a = e_a^{\hphantom a\mu}\boldsymbol\partial_\mu\]

in terms of coordinate basis vectors. I use Latin indices to specify vectors in the tetrad frame, and add a hat for orthonormal frames. I use Greek indices for coordinate components, for example (e_0)^\mu for the vector \mathbf e_0. (In place of our e_a^{\hphantom a\mu}, de Felice & Clarke write \lambda_{\hat a}^{\hphantom ai}, and Carroll swaps the index order to e^\mu_{\hphantom\mu a}.) In a CAS we can implement the frame as a 4\times 4 array / matrix called “\texttt{frame}” say, reading the indices of e_a^{\hphantom a\mu} from left to right but ignoring their up-or-down placement. This ordering conveniently gives an array of “vectors”:

    \[\texttt{frame} := \big((e_0^{\hphantom 00},\ldots,e_0^{\hphantom 03}),(\cdots),(\cdots),(\cdots)\big)\]

However there is a tradeoff that vectors are placed in rows instead of the more standard column vector representation, because matrix indices refer to the row first and column second. We also define quantities (e^b_{\hphantom b\mu}) implemented as a matrix “\texttt{dualframe}”, which give the coordinate basis vectors in terms of the new frame:

    \[\partial_\mu = e^b_{\hphantom b\mu}\mathbf e_b\]

It follows from linear independence that e_a^{\hphantom a\mu}e^b_{\hphantom b\mu}=\delta_a^b, hence as matrices: \texttt{dualframe}=(\texttt{frame}^\top)^{-1}. The transpose is required because of the index summation order, since the convention for matrix multiplication is (AB)_{ij}:=\sum A_{ik}B_{kj}. This point could easily be missed when references call it “inverse” with more general index summation in mind. Note summing over the Latin indices also returns the identity: e_a^{\hphantom a\mu}e^a_{\hphantom a\nu}=\delta^\mu_\nu.
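
As a minimal sketch of this step, not tied to any particular CAS, the following plain Python builds \texttt{dualframe} from \texttt{frame} and checks both identities. The example frame (with \mathbf e_1 = \boldsymbol\partial_0+\boldsymbol\partial_1) and the helper names `transpose`, `matmul` and `inverse` are illustrative assumptions. The frame is deliberately neither orthonormal nor symmetric as a matrix, so the transpose genuinely matters.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Standard convention: (AB)_ij = sum_k A_ik B_kj."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def inverse(A):
    """Gauss-Jordan elimination in exact rational arithmetic.
    Raises StopIteration if A is singular, i.e. the 'frame' is not a basis."""
    n = len(A)
    # Augment A with the identity matrix on the right
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

# Hypothetical example frame: row a holds the components e_a^mu
frame = [[1, 0, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]

dualframe = inverse(transpose(frame))   # row b holds the components e^b_mu

identity = [[int(i == j) for j in range(4)] for i in range(4)]
# Sum over the Greek index: e_a^mu e^b_mu = delta_a^b
latin_delta = matmul(frame, transpose(dualframe))
# Sum over the Latin index: e_a^mu e^a_nu = delta^mu_nu
greek_delta = matmul(transpose(frame), dualframe)

print(dualframe[0], dualframe[1])
print(latin_delta == identity, greek_delta == identity)
```

For this frame the plain inverse of \texttt{frame}, without the transpose, is a different matrix, which illustrates the point about index summation order.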

Now suppose a vector \mathbf q is specified by its coordinate basis components q^\mu, which we implement as a 4-element array \texttt{Q}. Since \mathbf q=q^\mu\boldsymbol\partial_\mu, substituting the previous expression for \partial_\mu and using linear independence gives the components in the new frame (note the Latin index) as: q^b = e^b_{\hphantom b\mu}q^\mu. Programmatically this is the matrix multiplication \texttt{dualframe*Q}, at least for my CAS, which does not distinguish between a row and column vector but automatically matches the dimensions. Now suppose we have m different vectors, stored in an m\times 4 matrix \texttt{QQ} say (typically m=4). These are processed in a batch operation by converting to column vectors, applying the transformation, then transposing back, so the components are: (\texttt{dualframe*QQ}^\top)^\top = \texttt{QQ*dualframe}^\top, in the new frame.
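
Here is a small sketch of both the single-vector and batch conversions in plain Python. The example frame, its dual (entered by hand as (\texttt{frame}^\top)^{-1}), the sample vectors and the helper names are all assumptions for illustration. Plain Python lists have no row / column distinction, so the single-vector case uses an explicit `matvec` helper.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Hypothetical example: row a of frame holds e_a^mu,
# row b of dualframe holds e^b_mu, with dualframe = (frame^T)^{-1}
frame = [[1, 0, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]
dualframe = [[1, -1, 0, 0],
             [0, 1, 0, 0],
             [0, 0, 1, 0],
             [0, 0, 0, 1]]

# Single vector: q^b = e^b_mu q^mu
Q = [3, 2, 0, 5]
Q_frame = matvec(dualframe, Q)

# Batch of m = 3 vectors stored as rows: QQ * dualframe^T
QQ = [[3, 2, 0, 5],
      [1, 1, 0, 0],   # this is e_1 itself
      [0, 0, 2, 0]]
QQ_frame = matmul(QQ, transpose(dualframe))

# Converting back: q^mu = q^b e_b^mu, i.e. QQ_frame * frame
QQ_back = matmul(QQ_frame, frame)

print(Q_frame)
print(QQ_frame)
print(QQ_back == QQ)
```

The second row of \texttt{QQ} is the frame vector \mathbf e_1, so its frame components come out as (0,1,0,0), a useful sanity check on the index placement.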

Now consider the dual bases. In the coordinate dual basis, the vector dual to \mathbf q has components q_\mu = g_{\mu\nu}q^\nu. These components can be implemented as an array \texttt{dualQ} = \texttt{G*Q} where \texttt{G} is the matrix (g_{\mu\nu}) and the row / column vector distinction is ignored as before. Again we can lower multiple vectors in one step via (\texttt{G*QQ}^\top)^\top = \texttt{QQ*G}, where the last step uses the symmetry of \texttt{G}.

The dual to the new frame satisfies \mathbf e^b(\mathbf e_a) = \delta^b_a by definition, hence

    \[\mathbf e^b = e^b_{\hphantom b\mu}dx^\mu\]

which may be validated by substitution, and these components are just \texttt{dualframe} again. Similarly

    \[dx^\mu = e_a^{\hphantom a\mu}\mathbf e^a\]

which are the components \texttt{frame} again. Carroll’s description for an orthonormal frame is true for any frame:

The vielbeins [e^b_{\hphantom b\mu}] thus serve double duty as the components of the coordinate basis vectors in terms of the orthonormal basis vectors, and as components of the orthonormal basis one-forms in terms of the coordinate basis one-forms; while the inverse vielbeins serve as the components of the orthonormal basis vectors in terms of the coordinate basis, and as components of the coordinate basis one-forms in terms of the orthonormal basis.

Likewise Schutz’ (§3.3) description of Lorentz transformations holds more generally:

…components of one-forms transform in exactly the same manner as basis vectors and in the opposite manner to components of vectors.
[…Whereas basis one-forms transform] the same as for components of a vector, and opposite that for components of a one-form.

We may also define (de Felice & Clarke, eqn. 4.2.5):

    \[e_{a\mu} := \mathbf e_a \cdot \partial_\mu\]

which I interpret as a definition. This evaluates to g_{\mu\nu}e_a^{\hphantom a\nu} (or \texttt{frame*G}), hence g^{\mu\nu}e_{a\nu} = e_a^{\hphantom a\mu}. Define also

    \[e^{b\nu} := \mathbf e^b \cdot dx^\nu\]

which evaluates to g^{\mu\nu}e^b_{\hphantom b\mu} (or \texttt{dualframe*Ginv}, where \texttt{Ginv} is the matrix (g^{\mu\nu})), hence g_{\mu\nu}e^{b\nu} = e^b_{\hphantom b\mu}. Thus Greek indices are raised and lowered in the familiar way (using the metric components in the coordinate basis). On the other hand the metric components in the new frame are

    \[g_{ab} := \mathbf e_a\cdot\mathbf e_b = g_{\mu\nu}e_a^{\hphantom a\mu}e_b^{\hphantom b\nu}\]

which can be implemented as \texttt{Gframe} := \texttt{frame*G*frame}^\top. In the particular case of an orthonormal frame g_{\hat a\hat b}=\eta_{\hat a\hat b}, so in this case Latin indices are raised and lowered with the Minkowski metric. The metric in the dual frame is

    \[g^{ab} := \mathbf e^a\cdot\mathbf e^b = g^{\mu\nu}e^a_{\hphantom a\mu}e^b_{\hphantom b\nu}\]

so define \texttt{Gdualframe} := \texttt{dualframe*Ginv*dualframe}^\top. These are matrix inverses: g_{ab}g^{bc}=\delta_a^c. We can show Latin indices are raised or lowered using this frame metric, so for example e_b^{\hphantom b\mu} = g_{ab}e^{a\mu}.
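
The following plain-Python sketch evaluates these matrix expressions for a concrete case. The Minkowski metric in Cartesian coordinates and the non-orthonormal example frame are assumptions chosen for illustration, so that \texttt{Gframe} is visibly not \eta. The helper names are mine.

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Assumed example: Minkowski metric, signature (-,+,+,+); it is its own inverse
G = [[-1, 0, 0, 0],
     [0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1]]
Ginv = G

# Hypothetical non-orthonormal frame and its dual, dualframe = (frame^T)^{-1}
frame = [[1, 0, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]
dualframe = [[1, -1, 0, 0],
             [0, 1, 0, 0],
             [0, 0, 1, 0],
             [0, 0, 0, 1]]

# g_ab = g_{mu nu} e_a^mu e_b^nu
Gframe = matmul(matmul(frame, G), transpose(frame))
# g^ab = g^{mu nu} e^a_mu e^b_nu
Gdualframe = matmul(matmul(dualframe, Ginv), transpose(dualframe))

identity = [[int(i == j) for j in range(4)] for i in range(4)]
print(Gframe)
print(Gdualframe)
print(matmul(Gframe, Gdualframe) == identity)

# Lowering a Latin index: e_b^mu = g_ab e^{a mu}, with e^{a mu} = dualframe*Ginv
e_upper = matmul(dualframe, Ginv)
print(matmul(Gframe, e_upper) == frame)
```

In this example g_{11}=0, so \mathbf e_1 is a null vector, which an orthonormal frame would never contain. The two frame metrics are nevertheless matrix inverses, and lowering the Latin index on e^{a\mu} recovers \texttt{frame}.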

With all these definitions of components as metric inner products between quantities, we may wonder if the original frame components can also be expressed this way. Indeed they can: e_a^{\hphantom a\mu} = \mathbf e_a(dx^\mu) and e^a_{\hphantom a\mu} = \mathbf e^a(\partial_\mu), where the vectors and dual vectors act on one another. The metric is implicit in the summation, because as a (1,1)-tensor it is just the identity. But the (1,1)-tensor e_b^{\hphantom b\nu}\mathbf e^b\otimes\partial_\nu made from the “frame” components is also just the identity (see Carroll), so it and the metric tensor are equal. Input \mathbf e_a and dx^\mu into this tensor and it indeed returns e_a^{\hphantom a\mu}. We can do similarly with the dual frame.

For higher rank tensors, their components are expressed in the new frame as e.g. (de Felice & Clarke, Hartle §20.3, §21.2)

    \[T^{ab}_{\hphantom{ab}cd} = T^{\mu\nu}_{\hphantom{\mu\nu}\sigma\tau} e^a_{\hphantom a\mu} e^b_{\hphantom b\nu} e_c^{\hphantom c\sigma} e_d^{\hphantom d\tau}\]

My CAS multiplies higher rank “matrices” \texttt{A*B} by contracting the last index of \texttt{A} with the first index of \texttt{B}. Hence we can only change two indices of T by this method, short of reordering the indices halfway through. There is another inbuilt method “TensorContract” which I will describe sometime later. Of course you could just program in the sum manually, but I am seeking an elegant solution for aesthetic satisfaction, and also because inbuilt operations are probably more optimised. Finally, you can continue to mix Greek and Latin (coordinate and frame) indices; see Carroll. I will add an example later.
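
For completeness, here is what programming the sum manually might look like, as a brute-force sketch in plain Python rather than the elegant solution sought above. The example frame, the function name `to_frame` and the test tensors are assumptions for illustration. A tensor is stored as a nested list indexed in the order T[mu][nu][sigma][tau].

```python
from itertools import product

# Hypothetical example frame (rows e_a^mu) and its dual (rows e^b_mu)
frame = [[1, 0, 0, 0],
         [1, 1, 0, 0],
         [0, 0, 1, 0],
         [0, 0, 0, 1]]
dualframe = [[1, -1, 0, 0],
             [0, 1, 0, 0],
             [0, 0, 1, 0],
             [0, 0, 0, 1]]
n = 4
idx = range(n)

def to_frame(T):
    """T^{ab}_{cd} = T^{mu nu}_{sigma tau} e^a_mu e^b_nu e_c^sigma e_d^tau.
    Upper indices use dualframe, lower indices use frame."""
    out = [[[[0] * n for _ in idx] for _ in idx] for _ in idx]
    for a, b, c, d in product(idx, repeat=4):
        out[a][b][c][d] = sum(
            T[m][v][s][t]
            * dualframe[a][m] * dualframe[b][v]
            * frame[c][s] * frame[d][t]
            for m, v, s, t in product(idx, repeat=4))
    return out

# Test tensor 1: delta^mu_sigma delta^nu_tau, which has the same components in any frame
T_delta = [[[[int(m == s) * int(v == t) for t in idx] for s in idx]
            for v in idx] for m in idx]
T_delta_frame = to_frame(T_delta)

# Test tensor 2: an outer product q^mu q^nu p_sigma p_tau
q = [3, 2, 0, 5]      # vector components
p = [1, 0, 0, 0]      # one-form components (this is dt)
T_outer = [[[[q[m] * q[v] * p[s] * p[t] for t in idx] for s in idx]
            for v in idx] for m in idx]
T_outer_frame = to_frame(T_outer)

print(T_delta_frame == T_delta)
print(T_outer_frame[0][1][1][0])   # q^0 q^1 p_1 p_0 in the frame
```

The outer product gives an independent check: its frame components should factorise into the separately transformed q^a and p_c. The cost grows as n^8 for a rank-4 tensor, which is fine for n=4 but is one reason to prefer inbuilt contraction routines.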