Roger Penrose wins Nobel Prize

Roger Penrose has been awarded (half) the 2020 Nobel Prize in Physics, “for the discovery that black hole formation is a robust prediction of the general theory of relativity”. Penrose is a remarkable figure, known for his technical brilliance, communication of science to the general public, and unique views. His main expertise is general relativity:

More than any other individual, it was Roger Penrose who originated the concepts, insights and techniques that have shaped Einstein’s general relativity as we understand and practise it today.

That is from a “biographical sketch” by Werner Israel (which appears as the last 2 pages of research paper ). It accompanies a republication of Penrose’ classic 1969 review on gravitational collapse. At that time, consensus had been building that black holes really are a thing, see for example §7.9 of   for history. I will write more on this another time.

Penrose is artistic. The review cited above contains a full-page drawing of people on rigid platforms, lowering ropes toward an event horizon. It makes a fun background slide during a talk! He also drew some optical illusion “impossible figures”, and corresponded with artist M. C. Escher. He worked on tiling problems, accessible to anyone, yet important mathematically and with physical applications too (quasicrystals). Penrose’ “conformal diagrams” depict spacetime in a way which clearly illustrates its overall structure. His graphical notation for tensors has spread into other fields including quantum computing.

Penrose is a mathematician, so it is natural to wonder how many other mathematicians have won the physics Nobel Prize. Max Born is one example apparently, though he was also a physicist. On the other hand, string theorist Ed Witten is said to be the only physicist to win a Fields Medal, the preeminent prize in mathematics. Penrose has broad interests, and seems to know a lot of physics, based on the topics in his 1100 page The Road to Reality (2007).

I do hear critique of Penrose from quantum physicists, specifically about his model with Lajos Diósi. Indeed one recent paper, coauthored by Diósi curiously enough, says the theory is largely ruled out experimentally. I will trust the consensus of specialists over the lone genius outside their main field. A different claim by Penrose and collaborators concerns circles in the cosmic microwave background, as evidence for his “conformal cyclic cosmology” model. But astrophysicists are skeptical (I heard John Barrow asked about this at one conference). I’ll lean towards Penrose when it comes to a theoretical general relativistic cosmology, but not for statistical analysis of observational data from our universe. Penrose is also criticised for his views on consciousness. Israel is euphemistic: “His views on gravitational interactions as a trigger for quantum state reduction, and on the non-algorithmic character of human intelligence have generated much discussion.”

Yet, with the above acknowledged, we should focus on Penrose’ strengths and main areas of expertise. Israel calls him a “wholly original non-conformist”, which I would not have picked from his demeanour, but helps explain the combination of his towering strengths along with more speculative ideas. One of the many things I have omitted is Penrose’ twistor theory. Personally, I am still trying to understand spinors, a prior concept, but the fact an individual can come up with their own quantum gravity theory which is admired by their peers is a huge achievement. I look forward to reading part of The Road to Reality sometime.

Notes on Bricmont 2016, “Making sense of quantum mechanics”

The book Making sense of quantum mechanics (2016) overviews de Broglie-Bohmian mechanics, and examines its broader implications for quantum mechanics as a whole. I found it a gripping read. The author, philosopher-physicist Jean Bricmont, makes clear and mostly-convincing arguments, refuting many misconceptions about the meaning of quantum mechanics. I was led to the book by the online Stanford Encyclopedia of Philosophy, which recommended it as “a very good discussion”.

The de Broglie-Bohm (dBB) theory uses the wavefunction determined by Schrödinger’s equation, as in ordinary quantum mechanics (QM). It also assumes particles have definite positions and velocities at all times. These trajectories follow the probability current determined from the wavefunction. This contrasts with the standard interpretation of QM, where particles have no definite properties until a measurement or observation is made (except for eigenstates). Both interpretations make identical predictions about the outcomes of experiments, hence there is no experimental test that can distinguish between them. However, the conceptual implications are very important.
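For concreteness, here is a sketch of what “follow the probability current” means in the simplest case of a single spinless, non-relativistic particle of mass m. This setting and notation are assumptions made for illustration, not taken from the book:

    \[\frac{d\mathbf x}{dt} = \frac{\mathbf j}{|\psi|^2} = \frac{\hbar}{m}\,\mathrm{Im}\,\frac{\nabla\psi}{\psi}, \qquad \mathbf j := \frac{\hbar}{2mi}\big(\psi^*\nabla\psi - \psi\nabla\psi^*\big).\]

Since |\psi|^2 and \mathbf j satisfy the continuity equation, an ensemble of particles initially distributed according to |\psi|^2 remains so distributed, which is the usual route to agreement with the standard predictions.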

In §7, Bricmont sets up a populist history of QM, where Einstein and Schrödinger are dismissed as out of touch, and von Neumann and John Bell prove ‘hidden variables’ theories cannot exist. Then a cheeky dismissal: “all of the above is historically wrong.” Rather, Einstein was more concerned with non-locality than indeterminism. The view that QM is not complete should not be dismissed, but is a respectable position. Also von Neumann’s proof is overstated, and the community didn’t check it (see Pinch 1977 for the “sorry history”). Yet Bell “saw the impossible done” in the dBB model, and became its strongest proponent. Hence he clearly didn’t think it contradicted his theorem: Bell’s theorem doesn’t rule out hidden variables, only local hidden variables, or something weirder (?).

The Heisenberg uncertainty principle only concerns measurement outcomes, hence does not conflict with dBB (§5.1.8). Bricmont points out “all measurements can in the end be reduced to position measurements.” For example, a Stern-Gerlach device measures spin by whether particles move up or down. Similarly, momentum can be measured by comparing the position at two different times (§5.1.4). Yet even in dBB, many properties including spin cannot have hidden variables (§5.3.4).

Challenges for the theory include uniqueness, locality, and relativity. Concerning uniqueness, there are stochastic theories with random trajectories, and also an infinite number of theories with deterministic trajectories. While all concur with experiment, dBB is claimed as the most natural (§5.4.1). Concerning locality, Bricmont states that since Bell showed “the world is nonlocal, then the nonlocality of the de Broglie–Bohm theory is a quality, not a defect.” (§7.8) “Moreover, the nonlocality is of the right type… to reproduce Bell’s results, but not more, where ‘more’ might be a nonlocal theory allowing the transmission of messages.” (§5.2.1) But non-locality is a problem for relativity. The “nonlocal causal connections proven by Bell” occur instantaneously in QM, but in relativity simultaneity is relative, so which instant should be used? However this is a problem for quantum physics generally, not just for dBB: (§5.2.2)

…the problem of a genuine Lorentz invariance… in the face of EPR–Bell experiments is probably the biggest problem that theoretical physics faces today…

It is “the deepest unrecognized problem”, at least (§5.4.1; c.f. §8.4). One attempt at a solution is to introduce a preferred foliation (§5.2.2). [I have an idea on this, but it is early days…] At least ‘delayed choice’ experiments are not an issue, because in dBB “there is no sense in which our present choices affect the past.” (§5.1.4) There do exist Bohmian quantum field theories, though uniqueness is a challenge (§5.2.2).

Bricmont provocatively claims dBB “is a theory, while ordinary quantum mechanics is not” (§5.1.9); it is “not a physical theory” since it only predicts measurement outcomes (§5.3.5). Apparently, many philosophers require a scientific theory to be explanatory as well as descriptive. This includes realists (§3.3). Hence Bricmont calls dBB “the missing theory behind the quantum algorithm.” (§5.3.5) The Copenhagen interpretation, which emphasises the outcome of experiments, is influenced by positivism. However a strong “version of ‘logical positivism’… is almost universally rejected by philosophers of science nowadays (in part, because of the imprecision of the word ‘observable’)…” (§7.8; c.f. §8.4). This was news to me. Personally, I put much effort into conceptual understanding, so it is affirming to learn that trends are at least not opposed to this. I appreciate that dBB offers an underlying mechanism behind QM. But this does not imply it is the only deep explanation of the principles of QM, most obviously if ‘reality’ does not in fact work this way.  🙂

The mere existence of dBB refutes some popular claims about QM, says Bricmont: that QM ends determinism, that observers are special, and that QM can’t be understood. Yet “its main virtue is to clarify our ideas.” (§5.4.2) Bell wrote:

Should it not be taught, not as the only way, but as an antidote to the prevailing complacency? To show us that vagueness, subjectivity, and indeterminism, are not forced on us by experimental facts, but by deliberate theoretical choice?

Personally, my biggest motivation for learning de Broglie-Bohm theory was its definite velocities. In relativity, physical measurements depend on the observer. An observer’s velocity determines how to split spacetime, along with any tensors on it, into separate space and time parts. However the quantum hydrodynamics formulation also involves velocities, and is closer to the mainstream interpretation of QM than dBB (§5.4.1 cites some references). I also wonder how gravity might couple to each approach. dBB naturally suggests a particle’s exact location might gravitate, whereas the hydrodynamics view might suggest the entire wavefunction gravitates. Then, the theories would predict different experimental outcomes after all. Either way, quantum mechanics is now feeling less mysterious and more accessible, so Bell and Bricmont would be pleased.

Quantum and relativity are not completely dissimilar

The historical connections between relativity and quantum mechanics are stronger than I had realised.

Most physicists have heard of Louis de Broglie, the Nobel Prize winner who was an inspiration for the celebrated Schrödinger equation, the basic equation in quantum mechanics. However he achieved more than just the de Broglie wavelength (which one lecturer described as “the distance at which the wavelike nature of particles becomes apparent”). Requiring consistency with relativity (in technical terms, Lorentz covariance) implies that de Broglie waves with a group velocity v must have a phase velocity c^2/v, where c is the speed of light. Hence de Broglie waves, which are a precursor of Schrödinger’s wavefunction, are solidly grounded in relativity. Tsamparlis 2019 §18.11.2 motivates them clearly.
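As a brief sketch of where the two velocities come from, one can combine the Planck-Einstein relations E = \hbar\omega and p = \hbar k with the relativistic energy E = \gamma mc^2 and momentum p = \gamma mv of a particle of mass m. These are standard results, but the route below is my own summary rather than a quotation from the reference:

    \[v_\textrm{phase} = \frac{\omega}{k} = \frac{E}{p} = \frac{\gamma mc^2}{\gamma mv} = \frac{c^2}{v}, \qquad v_\textrm{group} = \frac{d\omega}{dk} = \frac{dE}{dp} = \frac{pc^2}{E} = v.\]

The group velocity step uses E^2 = p^2c^2 + m^2c^4. So the group velocity matches the particle’s velocity, while the phase velocity exceeds c but carries no signal.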

Another point, which is well known, is that Schrödinger first tried a relativistic wave equation: the Klein-Gordon equation. Not having success, he settled for the Schrödinger equation, which is a non-relativistic limit (slow speeds). Today, the Klein-Gordon equation is viewed as an accurate description of some particles, but subject to strong limitations. Greiner 1990  §1 is an oft-cited textbook here. [Edit: I am referring to the single-particle interpretation.] (Personally, I wonder if some of those limitations can be pushed back…)

Another connection is that Einstein made greater contributions to quantum mechanics than most people had realised. It is relativity that Einstein is justly famous for, however he should also be credited as being one of the founders of quantum physics, apparently. (I must read the popular history book Einstein and the quantum .) OK, so his Nobel prize was actually for a quantum effect. But the typical view has been that Einstein didn’t really understand quantum physics, and that he was out of touch: clinging to an obsolete view of reality. But now some are revising this view, claiming Einstein’s concerns about non-locality, determinism, and whether quantum mechanics is complete or needs additions, are respectable intellectual positions. Irrespective of whether one agrees with him, Einstein had a unique and insightful perspective.

Historically, special relativity was published in 1905, whereas quantum mechanics was developed in the 1920s, so it is not unexpected the former influenced the latter. (Today, the influence should be a two-way street, of course.) The historical connections mentioned above suggest the two theories are more similar than I had realised, or at least, less dissimilar. This gives me increased hope that quantum physics and general relativity can be more fully reconciled, which has been the physics dream for a century now.

Relative velocity in general relativity

Suppose we have two 4-velocity vectors \mathbf u and \mathbf v at the same point in curved spacetime. (This avoids complications such as parallel transport. Physically, think of the two objects as not necessarily overlapping, but close enough that we can neglect curvature etc.) We can calculate the relative velocity each determines of the other, which is not \mathbf u-\mathbf v.

Consider firstly inertial frames in Minkowski spacetime. Using coordinates (t,x,y,z) corresponding to some observer \mathbf v, the components of a different observer \mathbf u satisfy:

    \[u^\mu = \frac{dx^\mu}{d\tau} = \frac{dt}{d\tau}\frac{dx^\mu}{dt} = \gamma(1,\beta_x,\beta_y,\beta_z).\]

Here \tau is \mathbf u’s proper time, \gamma := -\mathbf u\cdot\mathbf v := -g_{\mu\nu}u^\mu v^\nu is the Lorentz factor as I have discussed previously, and the \beta_i are the relative speeds in the coordinate directions. This calculation is inspired by Tsamparlis 2019 §6.2.

With a view to generalisation, we re-express the displayed formula above using vectors in place of coordinate components: \mathbf u = \gamma(\mathbf v+\mathbf u_\textrm{rel}). This is more elegant, and explicitly tensorial. The reader may find better notation than \mathbf u_\textrm{rel}, but this is the relative velocity of \mathbf u from \mathbf v’s frame. Rearranging,

    \[\boxed{\mathbf u_\textrm{rel} = \gamma^{-1}\mathbf u - \mathbf v.}\]

This vector lies in the local 3-space of \mathbf v, since \mathbf v\cdot\mathbf u_\textrm{rel} = 0, so in particular \mathbf u_\textrm{rel} is spatial. It has length \beta, which is the overall relative speed, and satisfies \gamma = (1-\beta^2)^{-1/2}. If you want, for \beta\ne 0 there is also a decomposition \mathbf u_\textrm{rel} = \beta\hat{\mathbf n}, where \hat{\mathbf n} is a unit vector. Conversely, the relative velocity of \mathbf v with respect to \mathbf u is \mathbf v_\textrm{rel} = \gamma^{-1}\mathbf v - \mathbf u. This also has length \beta, but lies in \mathbf u’s 3-space. Why is \mathbf u_\textrm{rel} \ne -\mathbf v_\textrm{rel} unlike in Newtonian physics, aside from the trivial case \mathbf u = \mathbf v? They are in different frames (Tsamparlis §6.4). But with the appropriate Lorentz boost map, such an identity is recovered (Jantzen+ 1992 §4).
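The properties just listed are easy to check numerically. Below is a minimal sketch in Minkowski coordinates with c = 1 and signature -+++. The two sample 3-velocities and all function names are hypothetical choices for illustration only.

```python
import math


def minkowski_dot(a, b):
    """Inner product with signature -+++, components ordered (t, x, y, z)."""
    return -a[0] * b[0] + sum(x * y for x, y in zip(a[1:], b[1:]))


def four_velocity(beta):
    """4-velocity of an observer with coordinate 3-velocity beta (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - sum(b * b for b in beta))
    return [gamma] + [gamma * b for b in beta]


def relative_velocity(u, v):
    """Return (u_rel, gamma), where u_rel = u/gamma - v and gamma = -u.v."""
    gamma = -minkowski_dot(u, v)
    return [ui / gamma - vi for ui, vi in zip(u, v)], gamma


# Two sample observers (illustrative values), moving in different directions.
u = four_velocity([0.6, 0.0, 0.0])
v = four_velocity([0.0, 0.5, 0.0])

u_rel, gamma = relative_velocity(u, v)
v_rel, _ = relative_velocity(v, u)
beta = math.sqrt(minkowski_dot(u_rel, u_rel))  # relative speed

print("v . u_rel =", minkowski_dot(v, u_rel))   # orthogonal to v, up to rounding
print("gamma     =", gamma, "vs", 1.0 / math.sqrt(1.0 - beta**2))
print("u_rel + v_rel =", [a + b for a, b in zip(u_rel, v_rel)])  # not zero
```

The last line illustrates that \mathbf u_\textrm{rel} and -\mathbf v_\textrm{rel} differ as 4-vectors, even though both have length \beta.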

All the vector formulae above transfer unchanged to curved spacetime, for 4-velocities at the same event, including worldlines with acceleration. This can be justified using local inertial coordinates. While the formulae do appear in the literature (Jantzen+ 1992; Bini 2014  §6, etc), the topic of observer measurements in general is not widely promoted. I recall two separate conversations with senior relativists who were unfamiliar with use of the Lorentz factor in a curved spacetime context.

In contrast, one quantity which should not be naively ported across from special relativity is acceleration. In curved spacetime, the 4-acceleration \nabla_{\mathbf u}\mathbf u of a particle requires the covariant derivative, which depends on curvature. Relative acceleration between observers is more complicated, as it depends on one’s choice of affine connection, for which there are various natural options. For instance, Fermi-Walker transport, or co-rotation with an observer’s frame, see e.g. Jantzen, Carini & Bini (Jantzen+ 1992; Jantzen+ 1995 ; Jantzen+ 2013  draft).

[Finally, the expression dt/d\tau = \gamma given near the start bothered me at first, because time-dilation is mutual, so one might equally argue a case for \gamma^{-1}. But the key point is, the derivative occurs along the direction of \mathbf u, not \mathbf v. Another way to check the expression is to write dt/d\tau = dt(\mathbf u) = -\mathbf v^\flat(\mathbf u) = \gamma. This is a contraction of the 1-form dt with the vector \mathbf u, as I explained previously. The “flat” symbol just means dt is the 1-form dual to -\mathbf v. Conversely, along \mathbf v we have d\tau/dt = -\mathbf u^\flat(\mathbf v) = \gamma, and this is not a contradiction!]

Derivative as contraction of a 1-form and vector

Suppose you seek the derivative of a quantity along a curve, such as the rate of change of a scalar by proper time along a worldline: d\Phi/d\tau, or perhaps the rate of change of pressure by proper distance along a given spatial direction: dp/ds. These derivatives are conveniently expressed as a contraction between a 1-form (the gradient of the scalar) and a tangent vector to the curve. For the first example,

    \[\frac{d\Phi}{d\tau} = \frac{\partial\Phi}{\partial x^\mu} \frac{dx^\mu}{d\tau} = (d\Phi)_\mu u^\mu = d\Phi(\mathbf u),\]

where (x^\mu) is some coordinate system, d\Phi is the 1-form with components (d\Phi)_\mu = \Phi_{,\mu} = \partial\Phi/\partial x^\mu, and u^\mu = dx^\mu/d\tau is the 4-velocity. d\Phi(\mathbf u) is the contraction of the vector and 1-form, yielding a scalar. Schutz 2009  §3.3 gives this derivation.

A spacelike path can be parametrised by proper distance. Then d\Phi/ds = d\Phi(\boldsymbol\xi), where \xi^\mu := dx^\mu/ds is the unit tangent vector. An example of a paper which uses this is Gibbons 1972 , for the change in stress along a rigid cable, see the line after his Equation 4.

For a null path there is no natural parameter, at least not without additional context. But for any chosen parameter \lambda, we have d\Phi/d\lambda = d\Phi(\boldsymbol\xi) as before, where \xi^\mu := dx^\mu/d\lambda is the tangent vector. Of course this applies to the other cases as well. Note all these calculations occur within a single tangent space.
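As a sanity check of d\Phi/d\tau = d\Phi(\mathbf u), the sketch below compares a finite-difference derivative along a worldline with the contraction of the gradient and the 4-velocity. The scalar field, the inertial worldline in Minkowski coordinates, and the function names are all hypothetical, chosen only so the example is self-contained.

```python
import math


def phi(x):
    """Hypothetical scalar field Phi(t, x, y, z), for illustration only."""
    t, xs, y, z = x
    return math.sin(t) * xs**2 + y * z


def grad_phi(x):
    """Components (dPhi)_mu = partial Phi / partial x^mu, computed by hand."""
    t, xs, y, z = x
    return [math.cos(t) * xs**2, 2.0 * math.sin(t) * xs, z, y]


def worldline(tau, beta=0.6):
    """Inertial worldline x^mu(tau), moving along x with speed beta (c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [gamma * tau, 1.0 + gamma * beta * tau, 2.0, 3.0]


def four_velocity(beta=0.6):
    """u^mu = dx^mu/dtau for the worldline above."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [gamma, gamma * beta, 0.0, 0.0]


tau0, h = 0.7, 1e-6

# Left-hand side: central finite difference of Phi along the worldline.
finite_difference = (phi(worldline(tau0 + h)) - phi(worldline(tau0 - h))) / (2 * h)

# Right-hand side: contraction (dPhi)_mu u^mu at the same event.
contraction = sum(g * u for g, u in zip(grad_phi(worldline(tau0)), four_velocity()))

print(finite_difference, contraction)  # the two values agree closely
```

Only the gradient at one event and the tangent vector there enter the right-hand side, consistent with the remark that the calculation occurs within a single tangent space.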

Kinematic decomposition: expansion + shear + vorticity

Difficulty:   ★★★☆☆   undergraduate

Worldlines which collectively exhibit expansion, shear, and vorticity over time.

Suppose you know the motion of some particles / fluid / observers over time, as in the diagram. At each point the gradient of the motion can be decomposed into: expansion, shear, and vorticity. This is known as the kinematic decomposition, and is an important tool in relativity.

Write \mathbf u for the 4-velocity field, then lower its index and take the total covariant derivative: \nabla\mathbf u^\flat (that’s a “flat” symbol not the letter ‘b’), which has components \nabla_a u_b or equivalently u_{b;a}. This is the gradient of the motion, expressed as a (0,2)-tensor. Now apply the spatial projection tensor P^a_{\hphantom ab} := g^a_{\hphantom ab}+u^a u_b to get the purely spatial part of the velocity gradient, meaning the part orthogonal to \mathbf u, and which we label \mathbf B:

    \[B_{ab} := P^c_{\hphantom c a} P^d_{\hphantom d b} \nabla_c u_d = \nabla_a u_b + u_a\dot u_b.\]

Here \dot u_b is the (dual) 4-acceleration \nabla_{\mathbf u}\mathbf u^\flat, which has components u^a\nabla_a u_b. The projectors remove the time-time, time-space, and space-time parts of the velocity gradient. However only the time-space part is nonvanishing, being -u_a\dot u_b. This follows from substituting the vector \mathbf u into the first slot of \nabla\mathbf u^\flat, which represents the direction of differentiation. Caution: many authors define this as the second slot, so the term would be \cdots\dot u_a u_b instead.
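To see where the second equality for B_{ab} comes from, one can expand the projectors. This is a short check in the index convention used here, where the first slot of \nabla_a u_b is the direction of differentiation:

    \[P^c_{\hphantom c a} P^d_{\hphantom d b} \nabla_c u_d = \nabla_a u_b + u_a u^c\nabla_c u_b + u_b u^d\nabla_a u_d + u_a u_b u^c u^d\nabla_c u_d = \nabla_a u_b + u_a\dot u_b.\]

The last two terms of the expansion vanish because u^d\nabla_c u_d = \frac{1}{2}\nabla_c(u^d u_d) = 0, given the normalisation u^d u_d = -1.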

Now the symmetric part of \mathbf B is the expansion tensor \theta_{ab} = \frac{1}{2}(B_{ab}+B_{ba}) =: B_{(ab)}, and the antisymmetric part is the vorticity tensor \omega_{ab} = \frac{1}{2}(B_{ab}-B_{ba}) =: B_{[ab]}. These quantities will be explained shortly. The expansion tensor itself splits into “trace” and “trace-free” parts: \theta_{ab} = \frac{1}{3}\theta P_{ab} + \sigma_{ab}. Here \theta = g^{ab}\theta_{ab} = u^a_{\hphantom{a};a} is the expansion scalar; it is the trace of the expansion tensor, and the divergence of the 4-velocity field. It gives the rate of proportional expansion over time. \sigma_{ab} is the shear tensor. There are alternative formulae but this approach, which is largely inspired by Ellis 1971 , seems most efficient for computer algebra. In summary, the kinematic decomposition is:

    \[\nabla_a u_b = \frac{1}{3}\theta P_{ab} + \sigma_{ab} + \omega_{ab} - u_a\dot u_b.\]

So what is the physical meaning of the quantities? Expansion means the particles move apart over time, or contract in the case of negative expansion. More precisely it is the proportional expansion per unit time. (In this article all quantities are understood as measured in the fluid’s frame, in particular “time” means the proper time along the worldline(s).) The expansion scalar gives the proportional change in volume over time: \theta = V^{-1}\,dV/d\tau. A familiar example is the Lemaître-Hubble parameter H = \theta/3, but in general expansion is both position and direction-dependent. Shear (by itself) involves expansion in some directions but contraction in others. Again, this is a proportional change over time. Shear by itself does not change the volume. The eigenvectors of the shear tensor are the principal axes of shear, and since \sigma_{ab} is real and symmetric one can find an orthogonal basis of eigenvectors. Some potentially misleading language is that the expansion tensor also includes the shear; one can emphasise the isotropic (part of) expansion to distinguish \frac{1}{3}\theta P_{ab} specifically. Finally, vorticity is microscopic rotation, known as curl in 3-dimensions. At each point, it describes the rotation within an “infinitesimal” region around that point. This is distinct from macroscopic rotation, meaning an overall rotation of some extended body, as another website nicely visualises. Vorticity by itself is rigid, so does not change lengths or volume.

Define also the shear and vorticity scalars \sigma^2 = \frac{1}{2}\sigma_{ab}\sigma^{ab} and \omega^2 = \frac{1}{2}\omega_{ab}\omega^{ab}. These are positive-definite measures: \sigma \ge 0 with \sigma = 0 if and only if \sigma_{ab} = 0, and similarly for \omega. There is also a vorticity vector \omega^a which is the axis of local rotation. Much more could be said. There are formulae giving the rates of change of relative distance and direction from a given vantage point, see Ehlers 1961  for instance. There are elegant formulae using the exterior derivative, see Jantzen, Carini & Bini 1992  §2. Note we have only described the kinematics and said nothing of its causes, in particular the Einstein field equations are not assumed.
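For intuition, here is a minimal sketch of the analogous Newtonian decomposition of a 3×3 velocity gradient \partial_i v_j in Euclidean space. It is not the relativistic calculation: there is no projection or acceleration term, and the sample matrix and function names are hypothetical. The first index labels the direction of differentiation, matching the convention in this article.

```python
def decompose_velocity_gradient(grad):
    """Split grad[i][j] = d_i v_j into (theta, shear, vorticity).

    theta is the trace, shear is the trace-free symmetric part, and
    vorticity is the antisymmetric part.
    """
    n = 3
    theta = sum(grad[i][i] for i in range(n))
    shear = [[0.5 * (grad[i][j] + grad[j][i]) - (theta / 3.0 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
    vorticity = [[0.5 * (grad[i][j] - grad[j][i]) for j in range(n)]
                 for i in range(n)]
    return theta, shear, vorticity


def half_square(tensor):
    """Scalar (1/2) T_ij T^ij; indices are raised trivially in Euclidean space."""
    return 0.5 * sum(c * c for row in tensor for c in row)


# Hypothetical example: uniform expansion plus a simple shear flow v_y = 2x.
grad = [[1.0, 2.0, 0.0],
        [0.0, 1.0, 0.0],
        [0.0, 0.0, 1.0]]

theta, shear, vorticity = decompose_velocity_gradient(grad)
sigma_sq = half_square(shear)      # shear scalar squared
omega_sq = half_square(vorticity)  # vorticity scalar squared

# Reassemble the gradient from its parts as a consistency check.
rebuilt = [[(theta / 3.0 if i == j else 0.0) + shear[i][j] + vorticity[i][j]
            for j in range(3)] for i in range(3)]

print(theta, sigma_sq, omega_sq)
```

In this example the off-diagonal term contributes equally to shear and vorticity, which is typical of a simple shear flow.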

As for literature, the best textbook presentations I am aware of are Ellis, Maartens & MacCallum 2012  §4.6; and Poisson 2004  §2.3. Two classic papers are Ehlers 1961  §2.1 and Ellis 1971  §2. Translators of Ehlers (1993 , linked earlier) described it as an “outstanding review paper” and that “[d]espite its age, it remains one of the best reviews available in this area.” Ellis was republished in 2009 , also linked previously, along with an editorial note  which reviews applications, and states “[f]ew papers in relativistic cosmology have been as influential and as frequently cited”, despite being “primarily a synthesis… of earlier results”. Newtonian fluid dynamics has a similar decomposition of the velocity gradient \partial_i v_j, see perhaps Ellis or Poisson §2.2. Wainwright & Ellis, eds., 1997  is one source which gives further applications. I have assumed timelike worldlines, but Poisson §2.4, 2.6 treats the null case, for which only some of the kinematic quantities remain unambiguous. Everything I have said here assumes the fluid / particles’ frame of reference, but Larena+ 2011  §2.1 investigate other frames; see also the Jantzen+  paper, or §2.4.8, 3.3.5 of their book draft .

[Updated May 2021 to use the “del” convention for index ordering: \nabla_a u_b.]

Cosmic Cable poster

Here is my recently completed “Cosmic Cable” poster. You can also download a PDF version. Its first appearance is at the GR22 conference in Valencia, Spain, this week.

This work investigates the mechanics of a rigid 1-dimensional object in an arbitrary static, spherically symmetric spacetime. Others have applied such ropes / cables / strings to thermodynamics of black holes (for example scooping up Hawking radiation in a box), or to harvest energy in an expanding universe. It is an interesting exercise in topics which don’t receive a lot of attention, such as extended rigid objects in relativity.

I review the case of static cables, which show a fascinating “redshift” of force effect. I then generalise to a simple case of moving cables, and solve for the kinematics, tension, and also the power that can be generated (loosely, this is from a loss of gravitational potential). In case this stuff sounds simple, it is not at all obvious, indeed many papers fail already at the kinematics step. The frame dependence of the quantities is a conceptual challenge.

If you’d like more details I have a proceedings (forthcoming) from the Marcel Grossmann conference of 2018. I will expand this into a longer paper. I would also like to write a pedagogical paper explaining Gibbons (1972)  — his 1-page paper is beautifully concise, yet is hard to understand, has an error (as others have pointed out), and lots of typos.

Difference between special and general relativity?

What is the distinction between special relativity (SR) and general relativity (GR)?

It is sometimes said SR can only handle inertial frames, but enough commentators call this a misconception that I must go along with them. A pedagogical paper  on the arXiv today is one example. Also Carroll (2004, §1.2)  writes,

The notion of acceleration in special relativity has a bad reputation, for no good reason. Of course we were careful, in setting up inertial coordinates, to make sure that particles at rest in such coordinates are unaccelerated. However, once we’ve set up such coordinates, we are free to consider any sort of trajectories for physical particles, whether accelerated or not.

This seems a good definition to me: SR is the use of Minkowski coordinates in Minkowski spacetime. You can describe acceleration, but only from within an inertial frame. For example the classic SR textbook Taylor & Wheeler (1992, §2.4)  states, “special relativity is limited to free-float frames”. But from within such frames, they do analyse accelerating particles, see e.g. §3.2. Similarly Misner, Thorne & Wheeler (1973)  even title their section §6.1, “Accelerated observers can be analyzed using special relativity”.

Another definition could be: SR is what you learn in an SR course. In high school I learned about the Lorentz factor, Lorentz transformations, length-contraction, time-dilation, and composition of boosts in the same spatial direction. Undergraduate SR courses presumably have more content, but the term “SR” would still exclude more advanced material, such as Christoffel symbols perhaps, under this definition.

However some textbooks disagree. Misner, Thorne & Wheeler have a solid presentation of 1-forms (§2), and include Fermi-Walker transported tetrads (§6), both in an “SR” context. Gourgoulhon (2013)  takes it to a whole other level, including self-described “rather advanced topics”. He allows not only arbitrary coordinates but non-coordinate bases (§15.4.3), after all the textbook is titled, “Special relativity in general frames”. Gourgoulhon discusses the stress-energy tensor (§19), relativistic hydrodynamics (§21), and even gravitation via historical scalar field theories on flat spacetime (§22). (Of course the stress-energy tensor doesn’t couple to spacetime curvature in this context, so the Einstein field equations are not satisfied.) Personally I would call all this “Minkowski spacetime” rather than “special relativity”! Then again, it could be a publisher’s decision for marketing purposes.

Finally, another definition of SR would be historical, limited to the scope of early papers including Einstein (1905)  and by Minkowski.

In conclusion, I am happy with the definition of SR as Minkowski spacetime using only global inertial frames. Minkowski coordinates would certainly be included, with the metric -dt^2+dx^2+dy^2+dz^2, and even simple alternatives such as spherical coordinates -dt^2+dr^2+r^2(d\theta^2+\sin^2\theta\,d\phi^2), so long as covariant derivatives are not required for a given context. Another time I will discuss the application of SR results in global inertial frames to local orthonormal frames in GR.

Forgotten mechanics from the 1800s

Are there historical areas of physics we have forgotten about? I have been reading a little of Klein & Sommerfeld’s The theory of the top (volume one, 1897). The spinning top might seem just a cute problem. However their work forms a detailed 4-volume set, which took over a decade to complete, and Felix Klein was a leading mathematician. As the translators of a recent English edition (Nagem & Sandri, 2008 ) point out, the book contains one of the earliest occurrences of spinors, applied to the instantaneous position of the top (see #31 of their Translator’s Notes). Also I can’t help but share a quote from Herschel (1851 ), who found a spinning top the best demonstration of the precession of Earth’s rotational axis. This child’s toy:

…becomes an elegant philosophical instrument, and exhibits, in the most beautiful manner, the whole phenomenon.

Nagem & Sandri comment, in #57 of their Translator’s Notes:

It was a great surprise for the translators to find that so many prominent nineteenth-century mathematicians devoted their attention to the statics of rigid bodies, and developed it to such an extent. The subject is now neglected entirely in physics and mathematics, and covered only superficially in engineering curricula.

Two sources they mention are Möbius’ Lehrbuch der statik (“Statics textbook”, 1837 ) which analyses forces on rigid bodies, and Ball’s The theory of screws (1876 ).

You could emphasise that modern physics (relativity and quantum) is revolutionary, and of course that is true. But I prefer to emphasise continuity with earlier physics. In the present we make constant usage of Lagrangian and Hamiltonian mechanics, even though these are approximately 200 years old. The enduring relevance of Newtonian mechanics should need no introduction. Also I was a little amused to see Archimedes’ principle mentioned in a modern quantum + relativistic context: Unruh & Wald (1982)  discuss lowering a box, which contains thermal radiation, on a rope towards a black hole horizon:

The energy delivered to the black hole is minimized when the box is dropped from its “equilibrium point,” i.e., when the tension in the rope is zero. By the Archimedes principle… this occurs when the energy of the box equals the energy of the displaced acceleration radiation.

This buoyancy effect is due to Hawking radiation. This finally resolved a paradox by Bekenstein (1972) , who proposed using a black hole to seemingly convert energy into work with 100% efficiency, which would violate the 2nd law of thermodynamics. For a historical overview, see Israel (1987, §7.10).

So, good work Archimedes! Two millennia and going strong. But which research have we forgotten today?

Hypersurfaces of constant proper time

Suppose a given spacetime has a region filled with worldlines (a timelike congruence), and a foliation defined by hypersurfaces of constant proper time along these worldlines. As a boundary condition, all times can be set to zero on a given initial hypersurface. The question is, will the proper time hypersurfaces remain spacelike? I investigate this for two straightforward examples: static observers in Schwarzschild spacetime, and the rotating disc in Minkowski spacetime.

George Ellis mentions the possibility of the hypersurfaces becoming timelike, in a 2014 paper on his “evolving block universe” interpretation. The context is cosmology, and the worldlines are (in principle) some coarse-grained flow of matter:

The flow lines are not necessarily orthogonal to the surfaces of constant time. This does not matter: no physical phenomena are directly determined by simultaneity in the usual sense. More than that, the surfaces determined in this way are not even necessarily space-like in an inhomogeneous spacetime. In that case, the implied initial value problem will locally be time-like, and the way it works will need to be rethought.

Perhaps the possibility of proper time hypersurfaces becoming timelike has not been investigated in detail. Presumably Ellis’ superb earlier publication Relativistic Cosmology (2012), coauthored with Maartens and MacCallum, would not discuss it either.

Recall a congruence is proper time synchronisable if and only if it is geodesic and vorticity-free, by Frobenius’ theorem. If so, the gradient of a proper time coordinate T say, is given by the dual vector to the 4-velocity:

    \[dT := -\mathbf u^\flat\]

where the minus sign compensates for the metric signature choice -+++. Now consider the congruence of static observers in Schwarzschild spacetime. These have 4-velocity parallel to the Killing vector field which is timelike at infinity, hence are defined for all r>2M. The dual-velocity is

    \[-\mathbf u^\flat = \sqrt{1-\frac{2M}{r}}\,dt\]

in terms of the Schwarzschild t-coordinate. This is clearly not integrable, as expected because the static observers are accelerating. But it still suggests the proper time coordinate T := t\sqrt{1-2M/r}. Then the gradient covector is

    \[dT = \sqrt{1-\frac{2M}{r}}\,dt + \frac{Mt}{r^2}\Big(1-\frac{2M}{r}\Big)^{-1/2}dr\]

In general this is not orthogonal to the worldlines, and one interpretation is a non-standard simultaneity convention, as discussed in my forthcoming paper “Time, black holes, and infinity”. However, T is still proper time along the worldlines, because dT(\mathbf u) = 1. Using the inverse metric,

    \[dT\cdot dT = -1+\frac{M^2t^2}{r^4}\]

so dT is timelike for |t|/M < (r/M)^2, and since dT is normal to the hypersurfaces, they are spacelike in the same region. The figure below shows three examples on a Penrose diagram. The hypersurfaces are spacelike for sufficiently large r, but become null at |t|/M = (r/M)^2, which is the dotted red line in the diagram.
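
As a sketch of the intermediate step, assuming the standard inverse Schwarzschild metric components g^{tt} = -(1-2M/r)^{-1} and g^{rr} = 1-2M/r (not written out above), the norm is built from the dt and dr components of dT:

    \[dT\cdot dT = g^{tt}\Big(1-\frac{2M}{r}\Big) + g^{rr}\,\frac{M^2t^2}{r^4}\Big(1-\frac{2M}{r}\Big)^{-1} = -1+\frac{M^2t^2}{r^4}\]

The factors of 1-2M/r cancel in both terms, and setting the result to zero gives M|t| = r^2.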

Hypersurfaces of constant proper time for static observers in Schwarzschild spacetime

This makes sense intuitively. Near the horizon, the static observers are heavily gravitationally time-dilated, so a proper time of e.g. T = 1 occurs well into the “future”. This is seen from the curves bending upwards in the diagram for t>0, and bending downwards for t<0, near the horizon. The claim of being in the “future” has some dependence on one’s choice of simultaneity convention; however, once the red line is crossed it is an unambiguous statement, because the T = \textrm{const} events are timelike separated. Incidentally T=0 at t=0, but this is just an initial condition, and in general one could define T := t\sqrt{1-2M/r} + h(r,\theta,\phi) for any function h, which is also proper time along the worldlines.
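
The Schwarzschild expressions can be spot-checked numerically. The following is a minimal sketch in plain Python, not part of the original calculation. The function names (`dT_components`, `dT_norm`, `proper_time_rate`, `null_time`) and the sample values M = 1, r = 4 are illustrative assumptions, and the offset h is taken to be zero.

```python
import math

def dT_components(M, t, r):
    """Components (dT_t, dT_r) of the gradient of T = t*sqrt(1 - 2M/r), for r > 2M."""
    f = 1.0 - 2.0 * M / r
    return math.sqrt(f), M * t / (r**2 * math.sqrt(f))

def dT_norm(M, t, r):
    """dT.dT using the inverse Schwarzschild metric: g^tt = -1/f, g^rr = f."""
    f = 1.0 - 2.0 * M / r
    dT_t, dT_r = dT_components(M, t, r)
    return -dT_t**2 / f + f * dT_r**2

def proper_time_rate(M, t, r):
    """dT(u) for a static observer, whose only nonzero component is u^t = 1/sqrt(f)."""
    f = 1.0 - 2.0 * M / r
    dT_t, _ = dT_components(M, t, r)
    return dT_t / math.sqrt(f)

def null_time(M, r):
    """Schwarzschild |t| at which the T = const hypersurface turns null at radius r."""
    return r**2 / M

if __name__ == "__main__":
    M, r = 1.0, 4.0  # illustrative values only
    for t in (0.0, 8.0, null_time(M, r), 24.0):
        print(f"t={t:5.1f}  dT.dT={dT_norm(M, t, r):+.4f}  "
              f"dT(u)={proper_time_rate(M, t, r):.4f}")
```

The sign of `dT_norm` changes from negative (spacelike hypersurface) to positive (timelike hypersurface) as |t| passes `null_time(M, r)`, while `proper_time_rate` stays at 1 throughout.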

Now consider a rigidly rotating disc in 2+1-dimensional Minkowski spacetime. Using polar coordinates (t,r,\phi), the “4”-velocity of each particle on the disc is

    \[u^\mu = \Big(\frac{1}{\sqrt{1-r^2\Omega^2}},0,\frac{\Omega}{\sqrt{1-r^2\Omega^2}}\Big)\]

where \Omega := d\phi/dt \ge 0 parametrises rotation speed. The previous procedure of deriving a time coordinate wasn’t fully general. Here we expect a proper time coordinate to depend on r and t but not \phi. The proper time runs more slowly (compared to t) towards the edge of the disc; note that the disc is bounded by r < 1/\Omega for timelike motion. Also, it is well known that there is a “time-lag” when trying to define simultaneity around a circle r=\textrm{const}. However, one can use non-standard simultaneity (i.e. constant “time” hypersurfaces not orthogonal to the worldlines) to avoid this problem: see Relativity in Rotating Frames (2004), particularly the chapter by Rizzi & Serafini.

Based on u^t \equiv dt/d\tau above, define

    \[\bar t := t\sqrt{1-r^2\Omega^2}\]

This deliberately avoids any angular dependence. The gradient is

    \[d\bar t = \sqrt{1-r^2\Omega^2}\,dt -\frac{\Omega^2 tr}{\sqrt{1-r^2\Omega^2}}\,dr\]

One can check d\bar t(\mathbf u) = 1, so this is a proper time coordinate. From the above expression one can quantify an implied non-standard simultaneity convention, but I will avoid this here. The hypersurfaces turn null at |t| = (1-r^2\Omega^2)/(r\Omega^2).
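
For completeness, here is a sketch of both checks. It assumes the flat metric in polar coordinates, ds^2 = -dt^2 + dr^2 + r^2 d\phi^2 (so g^{tt} = -1 and g^{rr} = 1), and takes d\bar t to have dt component \sqrt{1-r^2\Omega^2} and dr component -\Omega^2 tr/\sqrt{1-r^2\Omega^2}. Since u^r = 0, only the dt component contributes to the contraction with \mathbf u:

    \[d\bar t(\mathbf u) = \sqrt{1-r^2\Omega^2}\cdot\frac{1}{\sqrt{1-r^2\Omega^2}} = 1, \qquad d\bar t\cdot d\bar t = -(1-r^2\Omega^2) + \frac{\Omega^4t^2r^2}{1-r^2\Omega^2}\]

The norm vanishes when \Omega^2|t|r = 1-r^2\Omega^2, which rearranges to the null condition quoted above.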

The spacetime diagram below represents hypersurfaces of constant proper time \bar t, and is independent of rotation rate due to scaling of the coordinates. The dotted red line is where the hypersurfaces are null; to the left of it they are spacelike.

Hypersurfaces of constant proper time for particles on a rotating disc in Minkowski spacetime

As \Omega r\rightarrow 1 the hypersurfaces quickly turn null. For small \Omega r, they turn null at \Omega|t|\approx 1/(r\Omega). Thus for small rotation / acceleration, the desynchronisation is slow but cumulative.
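
The two limits, and the claim that the picture is independent of rotation rate once times are scaled by \Omega, can be illustrated numerically. This is a minimal sketch in plain Python; the function names `disc_norm` and `disc_null_time` and the sample rotation rate are illustrative assumptions, not taken from the original calculation.

```python
def disc_norm(Omega, t, r):
    """Norm of the gradient of tbar = t*sqrt(1 - r^2 Omega^2) in flat polar
    coordinates (g^tt = -1, g^rr = 1), valid for r*Omega < 1."""
    s = 1.0 - (r * Omega) ** 2
    return -s + (Omega**2 * t * r) ** 2 / s

def disc_null_time(Omega, r):
    """Coordinate time |t| at which the tbar = const hypersurface turns null at radius r."""
    return (1.0 - (r * Omega) ** 2) / (r * Omega**2)

if __name__ == "__main__":
    Omega = 0.5  # illustrative rotation rate; the scaled output does not depend on it
    for x in (0.01, 0.1, 0.5, 0.9, 0.99):  # x = Omega * r
        r = x / Omega
        scaled = Omega * disc_null_time(Omega, r)
        print(f"Omega*r={x:4.2f}  Omega*|t|_null={scaled:9.4f}  "
              f"small-x estimate 1/x={1 / x:9.4f}")
```

The scaled null time \Omega|t| depends only on the product \Omega r. It approaches 1/(\Omega r) for small \Omega r and drops to zero as \Omega r approaches 1.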

The disc particles are accelerated, so for variety let’s choose an example with vorticity but no acceleration. Take Schwarzschild spacetime, with circular orbits on the coordinate equator \theta = \pi/2. These exist anywhere outside the photon sphere at r = 3M, not merely outside the ISCO at r = 6M. The 4-velocity is:

    \[u^\mu = \Bigg( \frac{1}{\sqrt{1-\frac{3M}{r}}},0,0,\frac{\sqrt{\frac{M}{r^3}}}{\sqrt{1-\frac{3M}{r}}} \Bigg)\]

in Schwarzschild coordinates, which suggests a new coordinate \tilde t := t\sqrt{1-3M/r}. Its hypersurfaces of constant \tilde t turn null at

    \[\frac{|t|}{M} = \frac{2}{3}\Big(\frac{r}{M}\Big)^2 \frac{1-\frac{3M}{r}}{1-\frac{2M}{r}}\]

which occurs instantly in the limit r\rightarrow 3M, and slowly for r\gg 3M.
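
As with the earlier examples, this can be spot-checked numerically. The sketch below is plain Python under stated assumptions: the function names are hypothetical, the inverse metric components g^{tt} = -(1-2M/r)^{-1} and g^{rr} = 1-2M/r are the standard Schwarzschild ones, and the gradient of \tilde t is computed by differentiating t\sqrt{1-3M/r} directly.

```python
import math

def four_velocity_norm(M, r):
    """u.u for the equatorial circular orbit above; should equal -1 for r > 3M."""
    k = 1.0 - 3.0 * M / r
    u_t = 1.0 / math.sqrt(k)
    u_phi = math.sqrt(M / r**3) / math.sqrt(k)
    return -(1.0 - 2.0 * M / r) * u_t**2 + r**2 * u_phi**2

def orbit_norm(M, t, r):
    """Norm of the gradient of ttilde = t*sqrt(1 - 3M/r), using g^tt = -1/f, g^rr = f."""
    f = 1.0 - 2.0 * M / r
    k = 1.0 - 3.0 * M / r
    d_t = math.sqrt(k)                          # dt component
    d_r = 1.5 * M * t / (r**2 * math.sqrt(k))   # dr component
    return -d_t**2 / f + f * d_r**2

def orbit_null_time(M, r):
    """Schwarzschild |t| at which the ttilde = const hypersurface turns null."""
    return (2.0 / 3.0) * (r**2 / M) * (1.0 - 3.0 * M / r) / (1.0 - 2.0 * M / r)

if __name__ == "__main__":
    M = 1.0  # illustrative mass
    for r in (3.01, 4.0, 6.0, 20.0, 100.0):
        t_null = orbit_null_time(M, r)
        print(f"r={r:7.2f}  u.u={four_velocity_norm(M, r):+.4f}  "
              f"|t|_null={t_null:10.3f}  norm there={orbit_norm(M, t_null, r):+.1e}")
```

The null time shrinks towards zero as r approaches 3M and grows roughly like (2/3) r^2/M at large r, in line with the limits stated above.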

In conclusion, proper time hypersurfaces can become timelike:

  • quickly, for high acceleration of the worldlines
  • quickly, for high vorticity of the worldlines
  • slowly, for mild but sustained acceleration or vorticity, a cumulative effect

This investigation was sparked by a lunchtime conversation with Pierre Mourier and Prof. David Wiltshire today, at the University of Canterbury in Christchurch, New Zealand. My forthcoming paper “Time, black holes, and infinity” will discuss simultaneity in Schwarzschild spacetime.

Update (next day): Mourier clarifies that proper time hypersurfaces have been studied, but often in the zero-vorticity case. Hence they remain orthogonal to the worldlines (in a cosmological context it is taken for granted that the worldlines are geodesics). So the issue of turning timelike does not come up. See perhaps §6.6.1 of Relativistic Cosmology as cited above, or look up “synchronous coordinates”. Mourier has also looked at rotation in Minkowski spacetime, different from the rigid rotation example above, and found similar effects. Wiltshire comments he and collaborators have looked at rotation effects in Lemaître-Tolman-Bondi models and van Stockum dust.