# Hypothesis testing with gas rich galaxies

This Thanksgiving, I’d highlight something positive. Recently, Bob Sanders wrote a paper pointing out that gas rich galaxies are strong tests of MOND. The usual fit parameter, the stellar mass-to-light ratio, is effectively negligible when gas dominates. The MOND prediction follows straight from the gas distribution, for which there is no equivalent freedom. We understand the 21 cm spin-flip transition well enough to relate observed flux directly to gas mass.

In any human endeavor, there are inevitably unsung heroes who carry enormous amounts of water but seem to get no credit for it. Sanders is one of those heroes when it comes to the missing mass problem. He was there at the beginning, and has a valuable perspective on how we got to where we are. I highly recommend his books.

In bright spiral galaxies, stars are usually 80% or so of the mass, gas only 20% or less. But in many dwarf galaxies,  the mass ratio is reversed. These are often low surface brightness and challenging to observe. But it is a worthwhile endeavor, as their rotation curve is predicted by MOND with extraordinarily little freedom.

Though gas rich galaxies do indeed provide an excellent test of MOND, nothing in astronomy is perfectly clean. The stellar mass-to-light ratio is an irreducible need-to-know parameter. We also need to know the distance to each galaxy, as we do not measure the gas mass directly, but rather the flux of the 21 cm line. The gas mass scales with flux and the square of the distance (see equation 7E7), so to get the gas mass right, we must first get the distance right. We also need to know the inclination of a galaxy as projected on the sky in order to get the rotation to which we’re fitting right, as the observed line of sight Doppler velocity is only sin(i) of the full, in-plane rotation speed. The 1/sin(i) correction becomes increasingly sensitive to errors as i approaches zero (face-on galaxies).
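To make these two dependencies concrete, here is a minimal Python sketch. It assumes the standard optically thin conversion from integrated 21 cm flux to neutral hydrogen mass, M_HI = 2.36×10⁵ D² F (solar masses, with D in Mpc and F in Jy km/s). The function names, the flux, and the velocities are made up for illustration, and the correction for helium is left out.

```python
import math

def hi_mass_msun(flux_jy_kms, distance_mpc):
    """Neutral hydrogen mass in solar masses, assuming optically thin 21 cm emission."""
    return 2.36e5 * distance_mpc**2 * flux_jy_kms

def rotation_speed(v_los_kms, inclination_deg):
    """In-plane rotation speed from the line-of-sight velocity (i = 0 is face-on)."""
    return v_los_kms / math.sin(math.radians(inclination_deg))

flux = 100.0  # hypothetical integrated flux in Jy km/s, not a measured value

# Same flux, distance 10% larger: the inferred mass grows as the square.
mass_ratio = hi_mass_msun(flux, 4.4) / hi_mass_msun(flux, 4.0)
print(f"10% larger distance -> mass larger by a factor of {mass_ratio:.2f}")

# The same 5 degree inclination error matters far more near face-on.
for i_deg in (25.0, 65.0):
    shift = rotation_speed(50.0, i_deg) / rotation_speed(50.0, i_deg + 5.0) - 1.0
    print(f"i = {i_deg:.0f} vs {i_deg + 5:.0f} deg: velocity changes by {100 * shift:.1f}%")
```

The point of the loop is only the contrast: a 5 degree error is a small correction for a fairly edge-on galaxy and a large one for a nearly face-on galaxy.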

The mass-to-light ratio is a physical fit parameter that tells us something meaningful about the amount of stellar mass that produces the observed light. In contrast, for our purposes here, distance and inclination are “nuisance” parameters. These nuisance parameters can be, and generally are, measured independently from mass modeling. However, these measurements have their own uncertainties, so one has to be careful about taking these measured values as-is. One of the powerful aspects of Bayesian analysis is the ability to account for these uncertainties to allow for the distance to be a bit off the measured value, so long as it is not too far off, as quantified by the measurement uncertainties. This is what current graduate student Pengfei Li did in Li et al. (2018). The constraints on MOND are so strong in gas rich galaxies that often the nuisance parameters cannot be ignored, even when they’re well measured.
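As a rough illustration of how a measured nuisance parameter can be allowed to float without floating freely, here is a minimal sketch. It assumes Gaussian priors centered on the measured distance and inclination; the priors actually used in Li et al. may differ, and the function names and numbers are hypothetical.

```python
def log_gaussian_prior(value, measured, sigma):
    """Log of an unnormalized Gaussian prior centered on the measured value."""
    return -0.5 * ((value - measured) / sigma) ** 2

def log_posterior(chi2, distance, inclination, d_meas, d_err, i_meas, i_err):
    """Log posterior up to a constant: data term plus the two nuisance priors."""
    log_likelihood = -0.5 * chi2
    return (log_likelihood
            + log_gaussian_prior(distance, d_meas, d_err)
            + log_gaussian_prior(inclination, i_meas, i_err))

# Hypothetical galaxy: D = 4.0 +/- 0.2 Mpc, i = 60 +/- 5 degrees.
fixed = log_posterior(40.0, 4.0, 60.0, 4.0, 0.2, 60.0, 5.0)    # nuisances held at measured values
nudged = log_posterior(20.0, 4.2, 60.0, 4.0, 0.2, 60.0, 5.0)   # distance moved by 1 sigma
far_off = log_posterior(20.0, 5.0, 60.0, 4.0, 0.2, 60.0, 5.0)  # distance moved by 5 sigma
print(fixed, nudged, far_off)
```

A one sigma shift in distance costs little and is easily repaid by a better match to the rotation curve, while a five sigma shift is penalized heavily no matter how good the resulting fit looks.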

To illustrate what I’m talking about, let’s look at one famous example, DDO 154. This galaxy is over 90% gas. The stars (pictured above) just don’t matter much. If the distance and inclination are known, the MOND prediction for the rotation curve follows directly. Here is an example of a MOND fit from a recent paper:

This is terrible! The MOND fit – essentially a parameter-free prediction – misses all of the data. MOND is falsified. If one is inclined to hate MOND, as many seem to be, then one stops here. No need to think further.

If one is familiar with the ups and downs in the history of astronomy, one might not be so quick to dismiss it. Indeed, one might notice that the shape of the MOND prediction closely tracks the shape of the data. There’s just a little difference in scale. That’s kind of amazing for a theory that is wrong, especially when it is amplifying the green line to predict the red one: it needn’t have come anywhere close.

Here is the fit to the same galaxy using the same data [already] published in Li et al.:

Now we have a good fit, using the same data! How can this be so?

I have not checked what Ren et al. did to obtain their MOND fits, but having done this exercise myself many times, I recognize the slight offset they find as a typical consequence of holding the nuisance parameters fixed. What if the measured distance is a little off?

Distance estimates to DDO 154 in the literature range from 3.02 Mpc to 6.17 Mpc. The formally most accurate distance measurement is 4.04 ± 0.08 Mpc. In the fit shown here, we obtained 3.87 ± 0.16 Mpc. The error bars on these distances overlap, so they are the same number, to measurement accuracy. These data do not falsify MOND. They demonstrate that it is sensitive enough to tell the difference between 3.8 and 4.1 Mpc.
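The comparison of the two distances can be checked with a few lines of Python. This sketch assumes the two uncertainties are independent and Gaussian, so that they add in quadrature; the function name is mine.

```python
import math

def sigma_difference(a, a_err, b, b_err):
    """Difference between two measurements in units of their combined uncertainty."""
    return abs(a - b) / math.hypot(a_err, b_err)

# Measured distance versus the distance preferred by the MOND fit, both in Mpc.
n_sigma = sigma_difference(4.04, 0.08, 3.87, 0.16)
print(f"the two distances differ by {n_sigma:.2f} sigma")
```

Under that assumption the two values sit less than one combined standard deviation apart.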

One will never notice this from a dark matter fit. Ren et al. also make fits with self-interacting dark matter (SIDM). The nifty thing about SIDM is that it makes quasi-constant density cores in dark matter halos. Halos of this form are not predicted by “ordinary” cold dark matter (CDM), but often give better fits than either MOND or the NFW halos of dark matter-only CDM simulations. For this galaxy, Ren et al. obtain the following SIDM fit.

This is a great fit. Goes right through the data. That makes it better, right?

Not necessarily. In addition to the mass-to-light ratio (and the nuisance parameters of distance and inclination), dark matter halo fits have [at least] two additional free parameters to describe the dark matter halo, such as its mass and core radius. These parameters are highly degenerate – one can obtain equally good fits for a range of mass-to-light ratios and core radii: one makes up for what the other misses. Parameter degeneracy of this sort is usually a sign that there is too much freedom in the model. In this case, the data are adequately described by one parameter (the MOND fit M*/L, not counting the nuisances in common), so using three (M*/L, Mhalo, Rcore) is just an exercise in fitting a French curve. There is ample freedom to fit the data. As a consequence, you’ll never notice that one of the nuisance parameters might be a tiny bit off.

In other words, you can fool a dark matter fit, but not MOND. Erwin de Blok and I demonstrated this 20 years ago. A common myth at that time was that “MOND is guaranteed to fit rotation curves.” This seemed patently absurd to me, given how it works: once you stipulate the distribution of baryons, the rotation curve follows from a simple formula. If the two don’t match, they don’t match. There is no guarantee that it’ll work. Instead, it can’t be forced.

As an illustration, Erwin and I tried to trick it. We took two galaxies that are identical in the Tully-Fisher plane (NGC 2403 and UGC 128) and swapped their mass distribution and rotation curve. These galaxies have the same total mass and the same flat velocity in the outer part of the rotation curve, but the detailed distribution of their baryons differs. If MOND can be fooled, this closely matched pair ought to do the trick. It does not.

Our failure to trick MOND should not surprise anyone who bothers to look at the math involved. There is a one-to-one relation between the distribution of the baryons and the resulting rotation curve. If there is a mismatch between them, a fit cannot be obtained.
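For readers who want to see the math, here is a minimal sketch of that one-to-one mapping. It assumes the so-called simple interpolation function, μ(x) = x/(1+x), and a0 = 1.2×10⁻¹⁰ m/s/s; other interpolation functions are in use, and the baryonic rotation curve below is invented for illustration, not taken from any of the galaxies discussed.

```python
import math

A0 = 1.2e-10      # m/s^2, a commonly quoted value of the MOND acceleration scale
KPC = 3.0857e19   # meters per kiloparsec

def mond_acceleration(g_newton, a0=A0):
    """Total acceleration from the Newtonian baryonic acceleration.

    Solves g * mu(g / a0) = g_newton for g with mu(x) = x / (1 + x).
    """
    return g_newton * (0.5 + math.sqrt(0.25 + a0 / g_newton))

def mond_velocity(v_baryon_kms, radius_kpc, a0=A0):
    """Predicted rotation speed (km/s) from the Newtonian baryonic speed at one radius."""
    r = radius_kpc * KPC
    g_newton = (v_baryon_kms * 1e3) ** 2 / r
    return math.sqrt(mond_acceleration(g_newton, a0) * r) / 1e3

# Hypothetical baryonic rotation curve: (radius in kpc, Newtonian baryon speed in km/s).
baryons = [(1.0, 10.0), (3.0, 17.0), (6.0, 20.0)]
for radius, v_bar in baryons:
    v_mond = mond_velocity(v_bar, radius)
    print(f"R = {radius:.0f} kpc: V_bar = {v_bar:.0f} km/s -> V_MOND = {v_mond:.1f} km/s")
```

There is no halo term to adjust here: each baryonic acceleration maps to exactly one total acceleration, so a mismatched mass distribution produces a mismatched curve.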

We also attempted to play this same trick on dark matter. The standard dark matter halo fitting function at the time was the pseudo-isothermal halo, which has a constant density core. It is very similar to the halos of SIDM and to the cored dark matter halos produced by baryonic feedback in some simulations. Indeed, that is the point of those efforts: they  are trying to capture the success of cored dark matter halos in fitting rotation curve data.

Dark matter halos with a quasi-constant density core do indeed provide good fits to rotation curves. Too good. They are easily fooled, because they have too many degrees of freedom. They will fit pretty much any plausible data that you throw at them. This is why the SIDM fit to DDO 154 failed to flag distance as a potential nuisance. It can’t. You could double (or halve) the distance and still find a good fit.

This is why parameter degeneracy is bad. You get lost in parameter space. Once lost there, it becomes impossible to distinguish between successful, physically meaningful fits and fitting epicycles.

Astronomical data are always subject to improvement. For example, the THINGS project obtained excellent data for a sample of nearby galaxies. I made MOND fits to all the THINGS (and other) data for the MOND review Famaey & McGaugh (2012). Here’s the residual diagram, which has been on my web page for many years:

These are, by and large, good fits. The residuals have a well defined peak centered on zero. DDO 154 was one of the THINGS galaxies; let’s see what happens if we use those data.

The first thing one is likely to notice is that the THINGS data are much better resolved than the previous generation used above. The first thing I noticed was that THINGS had assumed a distance of 4.3 Mpc. This was prior to the measurement of 4.04, so let’s just start over from there. That gives the MOND prediction shown above.

And it is a prediction. I haven’t adjusted any parameters yet. The mass-to-light ratio is set to the mean I expect for a star forming stellar population, 0.5 in solar units in the Spitzer 3.6 micron band. D=4.04 Mpc and i=66 as tabulated by THINGS. The result is pretty good considering that no parameters have been harmed in the making of this plot. Nevertheless, MOND overshoots a bit at large radii.

Constraining the inclinations for gas rich dwarf galaxies like DDO 154 is a bit of a nightmare. Literature values range from 20 to 70 degrees. Seriously. THINGS itself allows the inclination to vary with radius; 66 is just a typical value. Looking at the fit Pengfei obtained, i=61. Let’s try that.

The fit is now satisfactory. One tweak to the inclination, and we’re done. This tweak isn’t even a fit to these data; it was adopted from Pengfei’s fit to the above data. This tweak to the inclination is comfortably within any plausible assessment of the uncertainty in this quantity. The change in sin(i) corresponds to a mere 4% in velocity. I could probably do a tiny bit better with further adjustment – I have left both the distance and the mass-to-light ratio fixed – but that would be a meaningless exercise in statistical masturbation. The result just falls out: no muss, no fuss.
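The size of that tweak is easy to verify. A minimal sketch, with a function name of my own choosing:

```python
import math

def velocity_rescale(i_old_deg, i_new_deg):
    """Factor by which deprojected rotation speeds change when the inclination is revised."""
    return math.sin(math.radians(i_old_deg)) / math.sin(math.radians(i_new_deg))

# Inclination revised from the THINGS value of 66 degrees to 61 degrees.
factor = velocity_rescale(66.0, 61.0)
print(f"velocities scale by {factor:.3f} ({100 * (factor - 1):.1f}% change)")
```

Lowering the inclination raises every deprojected velocity by the same few percent, which is all it takes to bring the outer points up to the prediction.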

Hence the point Bob Sanders makes. Given the distribution of gas, the rotation curve follows. And it works, over and over and over, within the bounds of the uncertainties on the nuisance parameters.

One cannot do the same exercise with dark matter. It has ample ability to fit rotation curve data, once those are provided, but zero power to predict it. If all had been well with ΛCDM, the rotation curves of these galaxies would look like NFW halos. Or any number of other permutations that have been discussed over the years. In contrast, MOND makes one unique prediction (that was not at all anticipated in dark matter), and that’s what the data do. Out of the huge parameter space of plausible outcomes from the messy hierarchical formation of galaxies in ΛCDM, Nature picks the one that looks exactly like MOND.

It is a bad sign for a theory when it can only survive by mimicking its alternative. This is the case here: ΛCDM must imitate MOND. There are now many papers asserting that it can do just this, but none of those were written before the data were provided. Indeed, I consider it to be problematic that clever people can come up with ways to imitate MOND with dark matter. What couldn’t it imitate? If the data had all looked like technicolor space donkeys, we could probably find a way to make that so as well.

Cosmologists will rush to say “microwave background!” I have some sympathy for that, because I do not know how to explain the microwave background in a MOND-like theory. At least I don’t pretend to, even if I had more predictive success there than their entire community. But that would be a much longer post.

For now, note that the situation is even worse for dark matter than I have so far made it sound. In many dwarf galaxies, the rotation velocity exceeds that attributable to the baryons (with Newton alone) at practically all radii. By a lot. DDO 154 is a very dark matter dominated galaxy. The baryons should have squat to say about the dynamics. And yet, all you need to know to predict the dynamics is the baryon distribution. The baryonic tail wags the dark matter dog.

But wait, it gets better! If you look closely at the data, you will note a kink at about 1 kpc, another at 2, and yet another around 5 kpc. These kinks are apparent in both the rotation curve and the gas distribution. This is an example of Sancisi’s Law: “For any feature in the luminosity profile there is a corresponding feature in the rotation curve and vice versa.” This is a general rule, as Sancisi observed, but it makes no sense when the dark matter dominates. The features in the baryon distribution should not be reflected in the rotation curve.

The observed baryons orbit in a disk with nearly circular orbits confined to the same plane. The dark matter moves on eccentric orbits oriented every which way to provide pressure support to a quasi-spherical halo. The baryonic and dark matter occupy very different regions of phase space, the six dimensional volume of position and momentum. The two are not strongly coupled, communicating only by the weak force of gravity in the standard CDM paradigm.

One of the first lessons of galaxy dynamics is that galaxy disks are subject to a variety of instabilities that grow bars and spiral arms. These are driven by disk self-gravity. The same features do not appear in elliptical galaxies because they are pressure supported, 3D blobs. They don’t have disks so they don’t have disk self-gravity, much less the features that lead to the bumps and wiggles observed in rotation curves.

Elliptical galaxies are a good visual analog for what dark matter halos are believed to be like. The orbits of dark matter particles are unable to sustain features like those seen in  baryonic disks. They are featureless for the same reasons as elliptical galaxies. They don’t have disks. A rotation curve dominated by a spherical dark matter halo should bear no trace of the features that are seen in the disk. And yet they’re there, often enough for Sancisi to have remarked on it as a general rule.

It gets worse still. One of the original motivations for invoking dark matter was to stabilize galactic disks: a purely Newtonian disk of stars is not a stable configuration, yet the universe is chock full of long-lived spiral galaxies. The cure was to place them in dark matter halos.

The problem for dwarfs is that they have too much dark matter. The halo stabilizes disks by  suppressing the formation of structures that stem from disk self-gravity. But you need some disk self-gravity to have the observed features. That can be tuned to work in bright spirals, but it fails in dwarfs because the halo is too massive. As a practical matter, there is no disk self-gravity in dwarfs – it is all halo, all the time. And yet, we do see such features. Not as strong as in big, bright spirals, but definitely present. Whenever someone tries to analyze this aspect of the problem, they inevitably come up with a requirement for more disk self-gravity in the form of unphysically high stellar mass-to-light ratios (something I predicted would happen). In contrast, this is entirely natural in MOND (see, e.g., Brada & Milgrom 1999 and Tiret & Combes 2008), where it is all disk self-gravity since there is no dark matter halo.

The net upshot of all this is that it doesn’t suffice to mimic the radial acceleration relation as many simulations now claim to do. That was not a natural part of CDM to begin with, but perhaps it can be done with smooth model galaxies. In most cases, such models lack the resolution to see the features seen in DDO 154 (and in NGC 1560 and in IC 2574, etc.) If they attain such resolution, they better not show such features, as that would violate some basic considerations. But then they wouldn’t be able to describe this aspect of the data.

Simulators by and large seem to remain sanguine that this will all work out. Perhaps I have become too cynical, but I recall hearing that 20 years ago. And 15. And ten… basically, they’ve always assured me that it will work out even though it never has. Maybe tomorrow will be different. Or would that be the definition of insanity?

# Dwarf Satellite Galaxies. II. Non-equilibrium effects in ultrafaint dwarfs

I have been wanting to write about dwarf satellites for a while, but there is so much to tell that I didn’t think it would fit in one post. I was correct. Indeed, it was worse than I thought, because my own experience with low surface brightness (LSB) galaxies in the field is a necessary part of the context for my perspective on the dwarf satellites of the Local Group. These are very different beasts – satellites are pressure supported, gas poor objects in orbit around giant hosts, while field LSB galaxies are rotating, gas rich galaxies that are among the most isolated known. However, so far as their dynamics are concerned, they are linked by their low surface density.

Where we left off with the dwarf satellites, circa 2000, Ursa Minor and Draco remained problematic for MOND, but the formal significance of these problems was not great. Fornax, which had seemed more problematic, was actually a predictive success: MOND returned a low mass-to-light ratio for Fornax because it was full of young stars. The other known satellites, Carina, Leo I, Leo II, Sculptor, and Sextans, were all consistent with MOND.

The Sloan Digital Sky Survey resulted in an explosion in the number of satellite galaxies discovered around the Milky Way. These were both fainter and lower surface brightness than the classical dwarfs named above. Indeed, they were often invisible as objects in their own right, being recognized instead as groupings of individual stars that shared the same position in space and – critically – velocity. They weren’t just in the same place, they were orbiting the Milky Way together. To give short shrift to a long story, these came to be known as ultrafaint dwarfs.

Ultrafaint dwarf satellites have fewer than 100,000 stars. That’s tiny for a stellar system. Sometimes they had only a few hundred. Most of those stars are too faint to see directly. Their existence is inferred from a handful of red giants that are actually observed. Where there are a few red giants orbiting together, there must be a source population of fainter stars. This is a good argument, and it is likely true in most cases. But the statistics we usually rely on become dodgy for such small numbers of stars: some of the ultrafaints that have been reported in the literature are probably false positives. I have no strong opinion on how many that might be, but I’d be really surprised if it were zero.

Nevertheless, assuming the ultrafaint dwarfs are self-bound galaxies, we can ask the same questions as before. I was encouraged to do this by Joe Wolf, a clever grad student at UC Irvine. He had a new mass estimator for pressure supported dwarfs that we decided to apply to this problem. We used the Baryonic Tully-Fisher Relation (BTFR) as a reference, and looked at it every which-way. Most of the text is about conventional effects in the dark matter picture, and I encourage everyone to read the full paper. Here I’m gonna skip to the part about MOND, because that part seems to have been overlooked in more recent commentary on the subject.

For starters, we found that the classical dwarfs fall along the extrapolation of the BTFR, but the ultrafaint dwarfs deviate from it.

The deviation is not subtle, at least not in terms of mass. The ultrafaints had characteristic circular velocities typical of systems 100 times their mass! But the BTFR is steep. In terms of velocity, the deviation is the difference between the 8 km/s typically observed, and the ~3 km/s needed to put them on the line. There are a large number of systematic errors that might arise, and all act to inflate the characteristic velocity. See the discussion in the paper if you’re curious about such effects; for our purposes here we will assume that the data cannot simply be dismissed as the result of systematic errors, though one should bear in mind that they probably play a role at some level.
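The steepness is what turns a modest velocity offset into a huge mass offset. Here is a minimal sketch, assuming a BTFR slope of exactly 4 (mass proportional to the fourth power of velocity, which is the MOND value; a fitted slope may differ somewhat). The function names are mine.

```python
BTFR_SLOPE = 4.0  # assumed: mass proportional to velocity to the fourth power

def mass_factor(v_observed, v_expected, slope=BTFR_SLOPE):
    """Factor by which the implied mass exceeds the actual mass."""
    return (v_observed / v_expected) ** slope

def velocity_for_mass_factor(v_observed, factor, slope=BTFR_SLOPE):
    """Velocity that would put an object on the relation, given the mass offset."""
    return v_observed / factor ** (1.0 / slope)

print(f"8 km/s vs 3 km/s: implied mass too high by a factor of {mass_factor(8.0, 3.0):.0f}")
print(f"a factor of 100 in mass corresponds to {velocity_for_mass_factor(8.0, 100.0):.1f} km/s")
```

With this slope, a factor of 100 in mass is only a factor of about 3 in velocity, which is why the round numbers quoted above hang together.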

Taken at face value, the ultrafaint dwarfs are a huge problem for MOND. An isolated system should fall exactly on the BTFR. These are not isolated systems, being very close to the Milky Way, so the external field effect (EFE) can cause deviations from the BTFR. However, these are predicted to make the characteristic internal velocities lower than the isolated case. This may in fact be relevant for the red points that deviate a bit in the plot above, but we’ll return to that at some future point. The ultrafaints all deviate to velocities that are too high, the opposite of what the EFE predicts.

The ultrafaints falsify MOND! When I saw this, all my original confirmation bias came flooding back. I had pursued this stupid theory to ever lower surface brightness and luminosity. Finally, I had found where it broke. I felt like Darth Vader in the original Star Wars:

The first draft of my paper with Joe included a resounding renunciation of MOND. No way could it escape this!

But…

I had this nagging feeling I was missing something. Darth should have looked over his shoulder. Should I?

Surely I had missed nothing. Many people are unaware of the EFE, just as we had been unaware that Fornax contained young stars. But not me! I knew all that. Surely this was it.

Nevertheless, the nagging feeling persisted. One part of it was sociological: if I said MOND was dead, it would be well and truly buried. But did it deserve to be? The scientific part of the nagging feeling was that maybe there had been some paper that addressed this, maybe a decade before… perhaps I’d better double check.

Indeed, Brada & Milgrom (2000) had run numerical simulations of dwarf satellites orbiting around giant hosts. MOND is a nonlinear dynamical theory; not everything can be approximated analytically. When a dwarf satellite is close to its giant host, the external acceleration of the dwarf falling towards its host can exceed the internal acceleration of the stars in the dwarf orbiting each other – hence the EFE. But the EFE is not a static thing; it varies as the dwarf orbits about, becoming stronger on closer approach. At some point, this variation becomes too fast for the dwarf to remain in equilibrium. This is important, because the assumption of dynamical equilibrium underpins all these arguments. Without it, it is hard to know what to expect short of numerically simulating each individual dwarf. There is no reason to expect them to remain on the equilibrium BTFR.

Brada & Milgrom suggested a measure to gauge the extent to which a dwarf might be out of equilibrium. It boils down to a matter of timescales. If the stars inside the dwarf have time to adjust to the changing external field, a quasi-static EFE approximation might suffice. So the figure of merit becomes the ratio of internal orbits per external orbit. If the stars inside a dwarf are swarming around many times for every time it completes an orbit around the host, then they have time to adjust. If the orbit of the dwarf around the host is as quick as the internal motions of the stars within the dwarf, not so much. At some point, a satellite becomes a collection of associated stars orbiting the host rather than a self-bound object in its own right.

Brada & Milgrom provide the formula to compute the ratio of orbits, shown in the figure above. The smaller the ratio, the less chance an object has to adjust, and the more subject it is to departures from equilibrium. Remarkably, the amplitude of deviation from the BTFR – the problem I could not understand initially – correlates with the ratio of orbits. The more susceptible a dwarf is to disequilibrium effects, the farther it deviated from the BTFR.

This completely inverted the MOND interpretation. Instead of falsifying MOND, the data now appeared to corroborate the non-equilibrium prediction of Brada & Milgrom. The stronger the external influence, the more a dwarf deviated from the equilibrium expectation. In conventional terms, it appeared that the ultrafaints were subject to tidal stirring: their internal velocities were being pumped up by external influences. Indeed, the originally problematic cases, Draco and Ursa Minor, fall among the ultrafaint dwarfs in these terms. They can’t be in equilibrium in MOND.

If the ultrafaints are out of equilibrium, they might show some independent evidence of this. Stars should leak out, distorting the shape of the dwarf and forming tidal streams. Can we see this?

A definite maybe:

The dwarfs that are more subject to external influence tend to be more elliptical in shape. A pressure supported system in equilibrium need not be perfectly round, but one departing from equilibrium will tend to get stretched out. And indeed, many of the ultrafaints look Messed Up.

I am not convinced that all this requires MOND. But it certainly doesn’t falsify it. Tidal disruption can happen in the dark matter context, but it happens differently. The stars are buried deep inside protective cocoons of dark matter, and do not feel tidal effects much until most of the dark matter is stripped away. There is no reason to expect the MOND measure of external influence to apply (indeed, it should not), much less that it would correlate with indications of tidal disruption as seen above.

This seems to have been missed by more recent papers on the subject. Indeed, Fattahi et al. (2018) have reconstructed very much the chain of thought I describe above. The last sentence of their abstract states “In many cases, the resulting velocity dispersions are inconsistent with the predictions from Modified Newtonian Dynamics, a result that poses a possibly insurmountable challenge to that scenario.” This is exactly what I thought. (I have you now.) I was wrong.

Fattahi et al. are wrong for the same reasons I was wrong. They are applying equilibrium reasoning to a non-equilibrium situation. Ironically, the main point of their paper is that many systems can’t be explained with dark matter, unless they are tidally stripped – i.e., the result of a non-equilibrium process. Oh, come on. If you invoke it in one dynamical theory, you might want to consider it in the other.

To quote the last sentence of our abstract from 2010, “We identify a test to distinguish between the ΛCDM and MOND based on the orbits of the dwarf satellites of the Milky Way and how stars are lost from them.” In ΛCDM, the sub-halos that contain dwarf satellites are expected to be on very eccentric orbits, with all the damage from tidal interactions with the host accruing during pericenter passage. In MOND, substantial damage may accrue along lower eccentricity orbits, leading to the expectation of more continuous disruption.

Gaia is measuring proper motions for stars all over the sky. Some of these stars are in the dwarf satellites. This has made it possible to estimate orbits for the dwarfs, e.g., work by Amina Helmi (et al!) and Josh Simon. So far, the results are definitely mixed. There are more dwarfs on low eccentricity orbits than I had expected in ΛCDM, but there are still plenty that are on high eccentricity orbits, especially among the ultrafaints. Which dwarfs have been tidally affected by interactions with their hosts is far from clear.

In short, reality is messy. It is going to take a long time to sort these matters out. These are early days.

# Astronomical Acceleration Scales

A quick note to put the acceleration discrepancy in perspective.

The acceleration discrepancy, as Bekenstein called it, more commonly called the missing mass or dark matter problem, is the deviation of dynamics from those of Newton and Einstein. The quantity D is the amplitude of the discrepancy, basically the ratio of total mass to that which is visible. The need for dark matter – the discrepancy – only manifests at very low accelerations, of order 10⁻¹⁰ m/s/s. That’s one part in 10¹¹ of what you feel standing on the Earth.

Astronomical data span enormous, indeed, astronomical, ranges. This is why astronomers so frequently use logarithmic plots. The abscissa in the plot above spans 25 orders of magnitude, from the lowest accelerations measured in the outskirts of galaxies to the highest conceivable on the surface of a neutron star on the brink of collapse into a black hole. If we put this on a linear scale, you’d see one point (the highest) and all the rest would be crammed into x=0.

Galileo established that we live in a regime where the acceleration due to gravity is effectively constant; g = 9.8 m/s/s. This suffices to describe the trajectories of projectiles (like baseballs) familiar to everyday experience. At least it suffices to describe the gravity; air resistance plays a non-negligible role as well. But you don’t need Newton’s Universal Law of Gravity; you just need to know everything experiences a downward acceleration of one gee.

As we move to higher altitude and on into space, this ceases to suffice. As Newton taught us, the strength of the gravitational attraction between two bodies decreases as the distance between them increases. The constant acceleration recognized by Galileo was a special case of a more general phenomenon. The surface of the Earth is a [very nearly] constant distance from its center, so gee is [very nearly] constant. Get off the Earth, and that changes.

In the plot above, the acceleration we experience here on the surface of the Earth lands pretty much in the middle of the range known to astronomical observation. This is normal to us. The orbits of the planets in the solar system stretch to lower accelerations: the surface gravity of the Earth exceeds the centripetal acceleration it takes to keep Earth in its orbit around the sun. This decreases outward in the solar system, with Neptune experiencing less than 10⁻⁵ m/s/s in its orbit.
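These solar system numbers follow from Newton's inverse square law. A minimal sketch, treating the orbits as circles and using rounded values for the Sun's gravitational parameter, the astronomical unit, and Neptune's orbital radius:

```python
GM_SUN = 1.327e20   # m^3/s^2, gravitational parameter of the Sun (rounded)
AU = 1.496e11       # meters per astronomical unit (rounded)
G_EARTH = 9.8       # m/s^2, surface gravity of the Earth

def orbital_acceleration(distance_au):
    """Acceleration toward the Sun at a given distance, in m/s^2."""
    return GM_SUN / (distance_au * AU) ** 2

# Orbital radii in AU; 30.1 is an approximate value for Neptune.
for name, radius_au in (("Earth", 1.0), ("Neptune", 30.1)):
    a = orbital_acceleration(radius_au)
    print(f"{name}: {a:.2e} m/s^2, or {a / G_EARTH:.1e} gee")
```

Even Neptune, at the edge of the classical solar system, is still several orders of magnitude above the acceleration scale at which the discrepancy appears in galaxies.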

We understand the gravity in the solar system extraordinarily well. We’ve been watching the planets orbit for ages. The inner planets, in particular, are so well known that subtle effects have been known for ages. Most famous is the tiny excess precession of the perihelion of the orbit of Mercury, first noted by Le Verrier in 1859 but not satisfactorily* explained until Einstein applied General Relativity to the problem in 1916.

The solar system probes many decades of acceleration accurately, but there are many decades of phenomena beyond the reach of the solar system, both to higher and lower accelerations. Two objects orbiting one another intensely enough for the energy loss due to the emission of gravitational waves to have a measurable effect on their orbit are the two neutron stars that compose the binary pulsar of Hulse & Taylor. Their orbit is highly eccentric, pulling an acceleration of about 270 m/s/s at periastron (closest passage). The gravitational dynamics of the system are extraordinarily well understood, and Hulse & Taylor were awarded the 1993 Nobel prize in physics for this observation that indirectly corroborated the existence of gravitational waves.

Direct detection of gravitational waves was first achieved by LIGO in 2015 (the 2017 Nobel prize). The source of these waves was the merger of a binary pair of black holes, a calamity so intense that it converted the equivalent of 3 solar masses into the energy carried away as gravitational waves. Imagine two 30 solar mass black holes orbiting each other a few hundred km apart 75 times per second just before merging – that equates to a centripetal acceleration of nearly 10¹¹ m/s/s.
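As an order of magnitude check on that picture, here is a purely Newtonian sketch for two equal 30 solar mass bodies on a circular orbit at 75 revolutions per second. Relativistic corrections are ignored, which is a serious simplification this close to merger, and the function names are mine.

```python
import math

G = 6.674e-11      # m^3 kg^-1 s^-2
M_SUN = 1.989e30   # kg

def binary_separation(total_mass_kg, orbital_frequency_hz):
    """Separation of a circular binary from Kepler's third law (Newtonian)."""
    omega = 2.0 * math.pi * orbital_frequency_hz
    return (G * total_mass_kg / omega**2) ** (1.0 / 3.0)

def centripetal_acceleration(separation_m, orbital_frequency_hz):
    """Acceleration of one of two equal masses, each circling at half the separation."""
    omega = 2.0 * math.pi * orbital_frequency_hz
    return omega**2 * separation_m / 2.0

separation = binary_separation(60.0 * M_SUN, 75.0)
acceleration = centripetal_acceleration(separation, 75.0)
print(f"separation ~ {separation / 1e3:.0f} km, acceleration ~ {acceleration:.1e} m/s^2")
```

The Newtonian estimate lands at a separation of a few hundred km and an acceleration of a few times 10¹⁰ m/s/s, the same ballpark as the figure quoted above.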

We seem to understand gravity well in this regime.

The highest acceleration illustrated in the figure above is the maximum surface gravity of a neutron star, which is just a hair under $10^{13}$ m/s/s. Anything more than this collapses to a black hole. The surface of a neutron star is not a place that suffers large mountains to exist, even if by “large” you mean “ant sized.” Good luck walking around in an exoskeleton there! Micron scale crustal adjustments correspond to monster starquakes.

High-end gravitational accelerations are 20 orders of magnitude removed from where the acceleration discrepancy appears. Dark matter is a problem restricted to the regime of tiny accelerations, of order 1 Angstrom/s/s. That isn’t much, but it is roughly what holds a star in its orbit within a galaxy. Sometimes less.

Galaxies show a large and clear acceleration discrepancy. The mob of black points is the radial acceleration relation, compressed to fit on the same graph with the high acceleration phenomena. Whatever happens, happens suddenly at this specific scale.

I also show clusters of galaxies, which follow a similar but offset acceleration relation. The discrepancy sets in a little earlier for them (and with more scatter, but that may simply be a matter of lower precision). This offset from galaxies is a small matter on the scale considered here, but it is a serious one if we seek to modify dynamics at a universal acceleration scale. Depending on how one chooses to look at this aspect of the problem, the data for clusters are either tantalizingly close to the [far superior] data for galaxies, or they are impossibly far removed. Regardless of which attitude proves to be less incorrect, it is clear that the missing mass phenomenon is restricted to low accelerations. Everything is normal until we reach the lowest decade or two of accelerations probed by current astronomical data – and extragalactic data are the only data that test gravity in this regime.

We have no other data that probe the very low acceleration regime. The lowest acceleration probe we have with solar system accuracy is from the Pioneer spacecraft. These suffer an anomalous acceleration whose source was debated for many years. Was it some subtle asymmetry in the photon pressure due to thermal radiation from the spacecraft? Or new physics?

Though the effect is tiny (it is shown in the graph above, but can you see it?), it would be enormous for a MOND effect. MOND asymptotes to Newton at high accelerations. Despite the many AU Pioneer has put between itself and home, it is still in a regime 4 orders of magnitude above where MOND effects kick in. This would only be perceptible if the asymptotic approach to the Newtonian regime were incredibly slow. So slow, in fact, that it should be perceptible in the highly accurate data for the inner planets. Nowadays, the hypothesis of asymmetric photon pressure is widely accepted, which just goes to show how hard it is to construct experiments to test MOND. Not only do you have to get far enough away from the sun to probe the MOND regime (about a tenth of a light-year), but you have to control for how hard itty-bitty photons push on your projectile.
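A quick sketch shows where these distances come from. It assumes the commonly quoted MOND acceleration scale of about $1.2 \times 10^{-10}$ m/s/s (the post only says "of order 1 Angstrom/s/s") and takes 70 AU as an illustrative distance for Pioneer; both values are my assumptions.

```python
import math

GM_SUN = 1.32712e20    # gravitational parameter of the sun, m^3/s^2
A0 = 1.2e-10           # assumed MOND acceleration scale, m/s^2
AU = 1.495979e11       # astronomical unit, m
LIGHT_YEAR = 9.4607e15 # light-year, m

def transition_radius(gm, a0=A0):
    """Distance (m) at which the Newtonian acceleration gm/r^2 drops to a0."""
    return math.sqrt(gm / a0)

r_mond = transition_radius(GM_SUN)

# Newtonian solar acceleration at an illustrative Pioneer distance of 70 AU,
# expressed in units of a0.
pioneer_ratio = GM_SUN / (70.0 * AU) ** 2 / A0

print(f"transition radius: {r_mond / AU:.0f} AU = {r_mond / LIGHT_YEAR:.2f} light-years")
print(f"acceleration at 70 AU: {pioneer_ratio:.0f} x a0")
```

With these inputs the sun's pull only drops to the MOND scale at roughly 7000 AU, about a tenth of a light-year, while at 70 AU it is still about four orders of magnitude larger.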

That said, it’d still be a great experiment. Send a bunch of test particles out of the solar system at high speed on a variety of ballistic trajectories. They needn’t be much more than bullets with beacons to track them by. It would take a heck of a rocket to get them going fast enough to return an answer within a lifetime, but rocket scientists love a challenge to go real fast.

*Le Verrier suggested that the effect could be due to a new planet, dubbed Vulcan, that orbited the sun interior to the orbit of Mercury. In the half century prior to Einstein settling the issue, there were many claims to detect this Victorian form of dark matter.

# Dwarf Satellite Galaxies and Low Surface Brightness Galaxies in the Field. I.

The Milky Way and its nearest giant neighbor Andromeda (M31) are surrounded by a swarm of dwarf satellite galaxies. Aside from relatively large beasties like the Large Magellanic Cloud or M32, the majority of these are the so-called dwarf spheroidals. There are several dozen examples known around each giant host, like the Fornax dwarf pictured above.

Dwarf Spheroidal (dSph) galaxies are ellipsoidal blobs devoid of gas that typically contain a million stars, give or take an order of magnitude. Unlike globular clusters, that may have a similar star count, dSphs are diffuse, with characteristic sizes of hundreds of parsecs (vs. a few pc for globulars). This makes them among the lowest surface brightness systems known.

This subject has a long history, and has become a major industry in recent years. In addition to the “classical” dwarfs that have been known for decades, there have also been many comparatively recent discoveries, often of what have come to be called “ultrafaint” dwarfs. These are basically dSphs with luminosities less than 100,000 suns, sometimes comprising as few as a few hundred stars. New discoveries are being made still, and there is reason to hope that the LSST will discover many more. Summed up, the known dwarf satellites are proverbial drops in the bucket compared to their giant hosts, which contain hundreds of billions of stars. Dwarfs could rain in for a Hubble time and not perturb the mass budget of the Milky Way.

Nevertheless, tiny dwarf spheroidals are excellent tests of theories like CDM and MOND. Going back to the beginning, in the early ’80s, Milgrom was already engaged in a discussion about the predictions of his then-new theory (before it was even published) with colleagues at the IAS, where he had developed the idea during a sabbatical visit. They were understandably skeptical, preferring – as many still do – to believe that some unseen mass was the more conservative hypothesis. Dwarf spheroidals came up even then, as their very low surface brightness meant low acceleration in MOND. This in turn meant large mass discrepancies. If you could measure their dynamics, they would have large mass-to-light ratios. Larger than could be explained by stars conventionally, and larger than the discrepancies already observed in bright galaxies like Andromeda.

This prediction of Milgrom’s – there from the very beginning – is important because of how things change (or don’t). At that time, Scott Tremaine summed up the contrasting expectation of the conventional dark matter picture:

“There is no reason to expect that dwarfs will have more dark matter than bright galaxies.” *

This was certainly the picture I had in my head when I first became interested in low surface brightness (LSB) galaxies in the mid-80s. At that time I was ignorant of MOND; my interest was piqued by the argument of Disney that there could be a lot of as-yet undiscovered LSB galaxies out there, combined with my first observing experiences with the then-newfangled CCD cameras which seemed to have a proclivity for making clear otherwise hard-to-see LSB features. At the time, I was interested in finding LSB galaxies. My interest in what made them rotate came later.

The first indication, to my knowledge, that dSph galaxies might have large mass discrepancies was provided by Marc Aaronson in 1983. This tentative discovery was hugely important, but the velocity dispersion of Draco (one of the “classical” dwarfs) was based on only 3 stars, so was hardly definitive. Nevertheless, by the end of the ’90s, it was clear that large mass discrepancies were a defining characteristic of dSphs. Their conventionally computed M/L went up systematically as their luminosity declined. This was not what we had expected in the dark matter picture, but was, at least qualitatively, in agreement with MOND.

My own interests had focused more on LSB galaxies in the field than on dwarf satellites like Draco. Greg Bothun and Jim Schombert had identified enough of these to construct a long list of LSB galaxies that served as targets for my Ph.D. thesis. Unlike the pressure-supported ellipsoidal blobs of stars that are the dSphs, the field LSBs we studied were gas rich, rotationally supported disks – mostly late type galaxies (Sd, Sm, & Irregulars). Regardless of composition, gas or stars, low surface density means that MOND predicts low acceleration. This need not be true conventionally, as the dark matter can do whatever the heck it wants. Though I was blissfully unaware of it at the time, we had constructed the perfect sample for testing MOND.

Having studied the properties of our sample of LSB galaxies, I developed strong ideas about their formation and evolution. Everything we had learned – their blue colors, large gas fractions, and low star formation rates – suggested that they evolved slowly compared to higher surface brightness galaxies. Star formation gradually sputtered along, having a hard time gathering enough material to make stars in their low density interstellar media. Perhaps they even formed late, an idea I took a shining to in the early ’90s. This made two predictions: field LSB galaxies should be less strongly clustered than bright galaxies, and should spin slower at a given mass.

The first prediction follows because the collapse time of dark matter halos correlates with their larger scale environment. Dense things collapse first and tend to live in dense environments. If LSBs were low surface density because they collapsed late, it followed that they should live in less dense environments.

I didn’t know how to test this prediction. Fortunately, fellow postdoc and office mate in Cambridge at the time, Houjun Mo, did. It came true. The LSB galaxies I had been studying were clustered like other galaxies, but not as strongly. This was exactly what I expected, and I thought sure we were on to something. All that remained was to confirm the second prediction.

At the time, we did not have a clear idea of what dark matter halos should be like. NFW halos were still in the future. So it seemed reasonable that late forming halos should have lower densities (lower concentrations in the modern terminology). More importantly, the sum of dark and luminous density was certainly less. Dynamics follow from the distribution of mass as $\mathrm{Velocity}^2 \propto \mathrm{Mass}/\mathrm{Radius}$. For a given mass, low surface brightness galaxies had a larger radius, by construction. Even if the dark matter didn’t play along, the reduction in the concentration of the luminous mass should lower the rotation velocity.
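The size of the expected effect is easy to sketch. The snippet below applies the scaling of velocity squared with mass over radius as if all the mass were a point at the center, which is a deliberate oversimplification, and the masses and radii are made-up illustrative numbers, not data for any real galaxy.

```python
import math

# Gravitational constant in galactic units: kpc (km/s)^2 per solar mass.
G_GALACTIC = 4.30091e-6

def circular_velocity(mass_msun, radius_kpc):
    """Naive circular speed (km/s) from V^2 = G M / R (point-mass approximation)."""
    return math.sqrt(G_GALACTIC * mass_msun / radius_kpc)

mass = 1.0e10                              # hypothetical mass, solar masses
v_compact = circular_velocity(mass, 5.0)   # hypothetical high surface brightness size
v_diffuse = circular_velocity(mass, 20.0)  # same mass spread over 4x the radius

print(f"compact: {v_compact:.1f} km/s, diffuse: {v_diffuse:.1f} km/s")
print(f"ratio  : {v_diffuse / v_compact:.2f}")
```

Quadrupling the radius at fixed mass halves the expected speed. That is the kind of shift this line of reasoning anticipated.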

Indeed, the standard explanation of the Tully-Fisher relation was just this. Aaronson, Huchra, & Mould had argued that galaxies obeyed the Tully-Fisher relation because they all had essentially the same surface brightness (Freeman’s law) thereby taking variation in the radius out of the equation: galaxies of the same mass all had the same radius. (If you are a young astronomer who has never heard of Freeman’s law, you’re welcome.) With our LSB galaxies, we had a sample that, by definition, violated Freeman’s law. They had large radii for a given mass. Consequently, they should have lower rotation velocities.

Up to that point, I had not taken much interest in rotation curves. In contrast, colleagues at the University of Groningen were all about rotation curves. Working with Thijs van der Hulst, Erwin de Blok, and Martin Zwaan, we set out to quantify where LSB galaxies fell in relation to the Tully-Fisher relation. I confidently predicted that they would shift off of it – an expectation shared by many at the time. They did not.

I was flummoxed. My prediction was wrong. That of Aaronson et al. was wrong. Poking about the literature, everyone who had made a clear prediction in the conventional context was wrong. It made no sense.

I spent months banging my head against the wall. One quick and easy solution was to blame the dark matter. Maybe the rotation velocity was set entirely by the dark matter, and the distribution of luminous mass didn’t come into it. Surely that’s what the flat rotation velocity was telling us? All about the dark matter halo?

Problem is, we measure the velocity where the luminous mass still matters. In galaxies like the Milky Way, it matters quite a lot. It does not work to imagine that the flat rotation velocity is set by some property of the dark matter halo alone. What matters to what we measure is the combination of luminous and dark mass. The luminous mass is important in high surface brightness galaxies, and progressively less so in lower surface brightness galaxies. That should leave some kind of mark on the Tully-Fisher relation, but it doesn’t.

I worked long and hard to understand this in terms of dark matter. Every time I thought I had found the solution, I realized that it was a tautology. Somewhere along the line, I had made an assumption that guaranteed that I got the answer I wanted. It was a hopeless fine-tuning problem. The only way to satisfy the data was to have the dark matter contribution scale up as that of the luminous mass scaled down. The more stretched out the light, the more compact the dark – in exact balance to maintain zero shift in Tully-Fisher.

This made no sense at all. Over twenty years on, I have yet to hear a satisfactory conventional explanation. Most workers seem to assert, in effect, that “dark matter does it” and move along. Perhaps they are wise to do so.

As I was struggling with this issue, I happened to hear a talk by Milgrom. I almost didn’t go. “Modified gravity” was in the title, and I remember thinking, “why waste my time listening to that nonsense?” Nevertheless, against my better judgement, I went. Not knowing that anyone in the audience worked on either LSB galaxies or Tully-Fisher, Milgrom proceeded to derive the MOND prediction:

“The asymptotic circular velocity is determined only by the total mass of the galaxy: $V_f^4 = a_0 G M$.”

In a few lines, he derived rather trivially what I had been struggling to understand for months. The lack of surface brightness dependence in Tully-Fisher was entirely natural in MOND. It falls right out of the modified force law, and had been explicitly predicted over a decade before I struggled with the problem.
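As a small illustration of why surface brightness drops out, the sketch below evaluates $V_f = (a_0 G M)^{1/4}$ directly. The value of $a_0$ is the commonly quoted $1.2 \times 10^{-10}$ m/s/s, which is my assumption here, and the masses are illustrative. The key point is that no radius appears anywhere in the function.

```python
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
A0 = 1.2e-10       # assumed MOND acceleration scale, m/s^2
M_SUN = 1.989e30   # solar mass, kg

def flat_velocity_kms(mass_msun):
    """Asymptotic flat rotation speed (km/s) from V_f^4 = a0 * G * M.
    Depends only on the total mass, not on how it is distributed."""
    return (A0 * G * mass_msun * M_SUN) ** 0.25 / 1.0e3

v_small = flat_velocity_kms(1.0e10)   # illustrative mass
v_large = flat_velocity_kms(1.6e11)   # 16 times more mass

print(f"M = 1e10 Msun  -> V_f ~ {v_small:.0f} km/s")
print(f"M = 1.6e11 Msun -> V_f ~ {v_large:.0f} km/s (ratio {v_large / v_small:.2f})")
```

A factor of 16 in mass gives a factor of 2 in velocity, whatever the size of the galaxy.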

I scraped my jaw off the floor, determined to examine this crazy theory more closely. By the time I got back to my office, cognitive dissonance had already started to set in. Couldn’t be true. I had more pressing projects to complete, so I didn’t think about it again for many moons.

When I did, I decided I should start by reading the original MOND papers. I was delighted to find a long list of predictions, many of them specifically to do with surface brightness. We had just collected fresh data on LSB galaxies, which provided a new window on the low acceleration regime. I had the data to finally falsify this stupid theory.

Or so I thought. As I went through the list of predictions, my assumption that MOND had to be wrong was challenged by each item. It was barely an afternoon’s work: check, check, check. Everything I had struggled for months to understand in terms of dark matter tumbled straight out of MOND.

I was faced with a choice. I knew this would be an unpopular result. I could walk away and simply pretend I had never run across it. That’s certainly how it had been up until then: I had been blissfully unaware of MOND and its perniciously successful predictions. No need to admit otherwise.

Had I realized just how unpopular it would prove to be, maybe that would have been the wiser course. But even contemplating such a course felt criminal. I was put in mind of Paul Gerhardt’s admonition for intellectual honesty:

“When a man lies, he murders some part of the world.”

Ignoring what I had learned seemed tantamount to just that. So many predictions coming true couldn’t be an accident. There was a deep clue here; ignoring it wasn’t going to bring us closer to the truth. Actively denying it would be an act of wanton vandalism against the scientific method.

Still, I tried. I looked long and hard for reasons not to report what I had found. Surely there must be some reason this could not be so?

Indeed, the literature provided many papers that claimed to falsify MOND. To my shock, few withstood critical examination. Commonly a straw man representing MOND was falsified, not MOND itself. At a deeper level, it was implicitly assumed that any problem for MOND was an automatic victory for dark matter. This did not obviously follow, so I started re-doing the analyses for both dark matter and MOND. More often than not, I found either that the problems for MOND were greatly exaggerated, or that the genuinely problematic cases were a problem for both theories. Dark matter has more flexibility to explain outliers, but outliers happen in astronomy. All too often the temptation was to refuse to see the forest for a few trees.

The first MOND analysis of the classical dwarf spheroidals provides a good example. Completed only a few years before I encountered the problem, these were low surface brightness systems that were deep in the MOND regime. These were gas poor, pressure supported dSph galaxies, unlike my gas rich, rotating LSB galaxies, but the critical feature was low surface brightness. This was the most directly comparable result. Better yet, the study had been made by two brilliant scientists (Ortwin Gerhard & David Spergel) whom I admire enormously. Surely this work would explain how my result was a mere curiosity.

Indeed, reading their abstract, it was clear that MOND did not work for the dwarf spheroidals. Whew: LSB systems where it doesn’t work. All I had to do was figure out why, so I read the paper.

As I read beyond the abstract, the answer became less and less clear. The results were all over the map. Two dwarfs (Sculptor and Carina) seemed unobjectionable in MOND. Two dwarfs (Draco and Ursa Minor) had mass-to-light ratios that were too high for stars, even in MOND. That is, there still appeared to be a need for dark matter even after MOND had been applied. On the flip side, Fornax had a mass-to-light ratio that was too low for the old stellar populations assumed to dominate dwarf spheroidals. Results all over the map are par for the course in astronomy, especially for a pioneering attempt like this. What were the uncertainties?

Milgrom wrote a rebuttal. By then, there were measured velocity dispersions for two more dwarfs. Of these seven dwarfs, he found that

“within just the quoted errors on the velocity dispersions and the luminosities, the MOND M/L values for all seven dwarfs are perfectly consistent with stellar values, with no need for dark matter.”

Well, he would say that, wouldn’t he? I determined to repeat the analysis and error propagation.

The net result: they were both right. M/L was still too high for Draco and Ursa Minor, and still too low for Fornax. But this was only significant at the 2σ level, if that – hardly enough to condemn a theory. Carina, Leo I, Leo II, Sculptor, and Sextans all had fairly reasonable mass-to-light ratios. The voting is different now. Instead of going 2 for 5 as Gerhard & Spergel found, MOND was now 5 for 8. One could choose to obsess about the outliers, or one could choose to see a more positive pattern. Either a positive or a negative spin could be put on this result. But it was clearly more positive than the first attempt had indicated.

The mass estimator in MOND scales as the fourth power of velocity (or velocity dispersion in the case of isolated dSphs), so the too-high M*/L of Draco and Ursa Minor didn’t disturb me too much. A small overestimation of the velocity dispersion would lead to a large overestimation of the mass-to-light ratio. Just about every systematic uncertainty one can think of pushes in this direction, so it would be surprising if such an overestimate didn’t happen once in a while.
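The leverage of that fourth power is worth seeing in numbers. This is just the arithmetic of the scaling, with hypothetical overestimates of the velocity dispersion chosen for illustration.

```python
def mass_bias(velocity_bias):
    """Fractional overestimate of a mass that scales as velocity^4,
    given a fractional overestimate of the velocity (0.10 means 10%)."""
    return (1.0 + velocity_bias) ** 4 - 1.0

for bias in (0.05, 0.10, 0.20):
    print(f"velocity dispersion high by {bias:.0%} -> mass high by {mass_bias(bias):.0%}")
```

A dispersion that is 10% too high inflates the inferred mass, and hence the mass-to-light ratio, by about 46%, and a 20% overestimate more than doubles it.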

Given this, I was more concerned about the low M*/L of Fornax. That was weird.

Up until that point (1998), we had been assuming that the stars in dSphs were all old, like those in globular clusters. That corresponds to a high M*/L, maybe 3 in solar units in the V-band. Shortly after this time, people started to look closely at the stars in the classical dwarfs with the Hubble Space Telescope. Lo and behold, the stars in Fornax were surprisingly young. That means a low M*/L, 1 or less. In retrospect, MOND was trying to tell us that: it returned a low M*/L for Fornax because the stars there are young. So what was taken to be a failing of the theory was actually a predictive success.

Hmm.

And Gee. This is a long post. There is a lot more to tell, but enough for now.

*I have a long memory, but it is not perfect. I doubt I have the exact wording right, but this does accurately capture the sentiment from the early ’80s when I was an undergraduate at MIT and Scott Tremaine was on the faculty there.

# A Precise Milky Way

The Milky Way Galaxy in which we live seems to be a normal spiral galaxy. But it can be hard to tell. Our perspective from within it precludes a “face-on” view like the picture above, which combines some real data with a lot of artistic liberty. Some local details we can measure in extraordinary detail, but the big picture is hard. Just how big is the Milky Way? The absolute scale of our Galaxy has always been challenging to measure accurately from our spot within it.

For some time, we have had a remarkably accurate measurement of the angular speed of the sun around the center of the Galaxy provided by the proper motion of Sagittarius A*. Sgr A* is the radio source associated with the supermassive black hole at the center of the Galaxy. By watching how it appears to move across the sky, Reid & Brunthaler found our relative angular speed to be 6.379 milliarcseconds/year. That’s a pretty amazing measurement: a milliarcsecond is one one-thousandth of one arcsecond, which is one sixtieth of one arcminute, which is one sixtieth of a degree. A pretty small angle.

The proper motion of an object depends on the ratio of its speed to its distance. So this high precision measurement does not itself tell us how big the Milky Way is. We could be far from the center and moving fast, or close and moving slow. Close being a relative term when our best estimates of the distance to the Galactic center hover around 8 kpc (26,000 light-years), give or take half a kpc.

This situation has recently improved dramatically thanks to the Gravity collaboration. They have observed the close passage of a star (S2) past the central supermassive black hole Sgr A*. Their chief interest is in the resulting relativistic effects: gravitational redshift and Schwarzschild precession, which provide a test of General Relativity. Unsurprisingly, it passes with flying colors.

As a consequence of their fitting process, we get for free some other interesting numbers. The mass of the central black hole is 4.1 million solar masses, and the distance to it is 8.122 kpc. The quoted uncertainty is only 31 pc. That’s parsecs, not kiloparsecs. Previously, I had seen credible claims that the distance to the Galactic center was 7.5 kpc. Or 7.9. Or 8.3. Or 8.5. There was a time when it was commonly thought to be about 10 kpc, i.e., we weren’t even sure what column the first digit belonged in. Now we know it to several decimal places. Amazing.

Knowing both the Galactocentric distance and the proper motion of Sgr A* nails down the relative speed of the sun: 245.6 km/s. Of this, 12.2 km/s is “solar motion,” which is how much the sun deviates from a circular orbit. Correcting for this gives us the circular speed of an imaginary test particle orbiting at the sun’s location: 233.3 km/s, accurate to 1.4 km/s.
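The conversion from proper motion and distance to speed is a one-liner once the units are sorted out. Here is a minimal sketch using the numbers quoted above; the unit conversion constants are standard values, and the function name is mine.

```python
import math

MAS_TO_RAD = math.pi / (180.0 * 3600.0 * 1000.0)  # milliarcseconds to radians
KM_PER_PC = 3.0856776e13                          # kilometers per parsec
SECONDS_PER_YEAR = 3.15576e7                      # Julian year in seconds

def tangential_speed_kms(mu_mas_per_yr, distance_pc):
    """Speed across the line of sight (km/s) from proper motion and distance."""
    angular_rate = mu_mas_per_yr * MAS_TO_RAD / SECONDS_PER_YEAR  # rad/s
    return angular_rate * distance_pc * KM_PER_PC

v_sun = tangential_speed_kms(6.379, 8122.0)  # proper motion of Sgr A*, distance
v_circ = v_sun - 12.2                        # remove the quoted solar motion

print(f"relative speed of the sun: {v_sun:.1f} km/s")
print(f"circular speed           : {v_circ:.1f} km/s")
```

This reproduces the 245.6 km/s quoted above. Subtracting the rounded 12.2 km/s gives about 233.4 km/s; the 0.1 km/s difference from the quoted 233.3 km/s presumably reflects rounding of the inputs.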

The distance and circular speed at the solar circle are the long sought Galactic Constants. These specify the scale of the Milky Way. Knowing them also pins down the rotation curve interior to the sun. This is well constrained by the “terminal velocities,” which provide a precise mapping of relative speeds, but need the Galactic Constants for an absolute scale.

A few years ago, I built a model Milky Way rotation curve that fit the terminal velocity data. What I was interested in then was to see if I could use the radial acceleration relation (RAR) to infer the mass distribution of the Galactic disk. The answer was yes. Indeed, it makes for a clear improvement over the traditional approach of assuming a purely exponential disk in the sense that the kinematically inferred bumps and wiggles in the rotation curve correspond to spiral arms known from star counts, as in external spiral galaxies.

Now that the Galactic constants are Known, it seems worth updating the model. This results in the surface density profile

with the corresponding rotation curve

The model data are available from the Milky Way section of my model pages.

Finding a model that matches both the terminal velocity and the highly accurate Galactic constants is no small feat. Indeed, I worried it was impossible: the speed at the solar circle is down to 233 km/s from a high of 249 km/s just a couple of kpc interior. This sort of variation is possible, but it requires a ring of mass outside the sun. This appears to be the effect of the Perseus spiral arm.

For the new Galactic constants and the current calibration of the RAR, the stellar mass of the Milky Way works out to just under 62 billion solar masses. The largest uncertainty in this is from the asymmetry in the terminal velocities, which are slightly different in the first and fourth quadrants. This is likely a real asymmetry in the mass distribution of the Milky Way. Treating it as an uncertainty, the range of variation corresponds to about 5% up or down in stellar mass.

With the stellar mass determined in this way, we can estimate the local density of dark matter. This is the critical number that is needed for experimental searches: just how much of the stuff should we expect? The answer is very precise: 0.257 GeV per cubic cm. This is a bit less than is usually assumed, which makes it a tiny bit harder on the hard-working experimentalists.

The accuracy of the dark matter density is harder to assess. The biggest uncertainty is that in stellar mass. We know the total radial force very well now, but how much is due to stars, and how much to dark matter (or whatever)? The RAR provides a unique method for constraining the stellar contribution, and does so well enough that there is very little formal uncertainty in the dark matter density. This, however, depends on the calibration of the RAR, which itself is subject to systematic uncertainty at the 20% level. This is not as bad as it sounds, because a recalibration of the RAR changes its shape in a way that tends to trade off with stellar mass while not much changing the implied dark matter density. So even with these caveats, this is the most accurate measure of the dark matter density to date.

This is all about the radial force. One can also measure the force perpendicular to the disk. This vertical force implies about twice the dark matter density. This may be telling us something about the shape of the dark matter halo – rather than being spherical as usually assumed, it might be somewhat squashed. It is easy to say that, but it seems a strange circumstance: the stars provide most of the restoring force in the vertical direction, and apparently dominate the radial force. Subtracting off the stellar contribution is thus a challenging task: the total force isn’t much greater than that from the stars alone. Subtracting one big number from another to measure a small one is fraught with peril: the uncertainties tend to blow up in your face.

Returning to the Milky Way, it seems in all respects to be a normal spiral galaxy. With the stellar mass found here, we can compare it to other galaxies in scaling relations like Tully-Fisher. It does not stand out from the crowd: our home is a fairly normal place for this time in the Universe.

It is possible to address many more details with a model like this. See the original!

# The next cosmic frontier: 21cm absorption at high redshift

There are two basic approaches to cosmology: start at redshift zero and work outwards in space, or start at the beginning of time and work forward. The latter approach is generally favored by theorists, as much of the physics of the early universe follows a “clean” thermal progression, cooling adiabatically as it expands. The former approach is more typical of observers who start with what we know locally and work outwards in the great tradition of Hubble, Sandage, Tully, and the entire community of extragalactic observers that established the paradigm of the expanding universe and measured its scale. This work had established our current concordance cosmology, ΛCDM, by the mid-90s.*

Both approaches have taught us an enormous amount. Working forward in time, we understand the nucleosynthesis of the light elements in the first few minutes, followed after a few hundred thousand years by the epoch of recombination when the universe transitioned from an ionized plasma to a neutral gas, bequeathing us the cosmic microwave background (CMB) at the phenomenally high redshift of z=1090. Working outwards in redshift, large surveys like Sloan have provided a detailed map of the “local” cosmos, and narrower but much deeper surveys provide a good picture out to z = 1 (when the universe was half its current size, and roughly half its current age) and beyond, with the most distant objects now known above redshift 7, and maybe even at z > 11. JWST will provide a good view of the earliest (z ~ 10?) galaxies when it launches.

This is wonderful progress, but there is a gap from 10 < z < 1000. Not only is it hard to observe objects so distant that z > 10, but at some point they shouldn’t exist. It takes time to form stars and galaxies and the supermassive black holes that fuel quasars, especially when starting from the smooth initial condition seen in the CMB. So how do we probe redshifts z > 10?

It turns out that the universe provides a way. As photons from the CMB traverse the neutral intergalactic medium, they are subject to being absorbed by hydrogen atoms – particularly by the 21cm spin-flip transition. Long anticipated, this signal has recently been detected by the EDGES experiment. I find it amazing that the atomic physics of the early universe allows for this window of observation, and that clever scientists have figured out a way to detect this subtle signal.

So what is going on? First, a mental picture. In the image below, an observer at the left looks out to progressively higher redshift towards the right. The history of the universe unfolds from right to left.

Pritchard & Loeb give a thorough and lucid account of the expected sequence of events. As the early universe expands, it cools. Initially, the thermal photon bath that we now observe as the CMB has enough energy to keep atoms ionized. The mean free path that a photon can travel before interacting with a charged particle in this early plasma is very short: the early universe is opaque like the interior of a thick cloud. At z = 1090, the temperature drops to the point that photons can no longer break protons and electrons apart. This epoch of recombination marks the transition from an opaque plasma to a transparent universe of neutral hydrogen and helium gas. The path length of photons becomes very long; those that we see as the CMB have traversed the length of the cosmos mostly unperturbed.

Immediately after recombination follows the dark ages. Sources of light have yet to appear. There is just neutral gas expanding into the future. This gas is mostly but not completely transparent. As CMB photons propagate through it, they are subject to absorption by the spin-flip transition of hydrogen, a subtle but, in principle, detectable effect: one should see redshifted absorption across the dark ages.

After some time – perhaps a few hundred million years? – the gas has had enough time to clump up enough to start to form the first structures. This first population of stars ends the dark ages and ushers in cosmic dawn. The photons they release into the vast intergalactic medium (IGM) of neutral gas interact with it and heat it up, ultimately reionizing the entire universe. After this time the IGM is again a plasma, but one so thin (thanks to the expansion of the universe) that it remains transparent. Galaxies assemble and begin the long evolution characterized by the billions of years lived by the stars they contain.

This progression leads to the expectation of 21cm absorption twice: once during the dark ages, and again at cosmic dawn. There are three temperatures we need to keep track of to see how this happens: the radiation temperature Tγ, the kinetic temperature of the gas, Tk, and the spin temperature, TS. The radiation temperature is that of the CMB, and scales as (1+z). The gas temperature is what you normally think of as a temperature, and scales approximately as (1+z)^2. The spin temperature describes the occupation of the quantum levels involved in the 21cm hyperfine transition. If that makes no sense to you, don’t worry: all that matters is that absorption can occur when the spin temperature is less than the radiation temperature. In general, the spin temperature is bounded by Tk < TS < Tγ.

The radiation temperature and gas temperature both cool as the universe expands. Initially, the gas remains coupled to the radiation, and these temperatures remain identical until decoupling around z ~ 200. After this, the gas cools faster than the radiation. The radiation temperature is extraordinarily well measured by CMB observations, and is simply Tγ = (2.725 K)(1+z). The gas temperature is more complicated, requiring the numerical solution of the Saha equation for a hydrogen-helium gas. Clever people have written codes to do this, like the widely-used RECFAST. In this way, one can build a table of how both temperatures depend on redshift in any cosmology one cares to specify.
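To make these scalings concrete, here is a minimal Python sketch of the two temperatures. It assumes an idealized, instantaneous decoupling at z = 200; the real transition is gradual, which is why codes like RECFAST exist, so the numbers are illustrative only. The function names are my own, and the (1 - Tγ/TS) contrast factor is the standard textbook dependence of the 21cm signal on the spin temperature rather than anything derived in this post.

```python
T_CMB_TODAY = 2.725   # K, CMB temperature today (value quoted in the text)
Z_DECOUPLE = 200.0    # assumption: idealized, instantaneous thermal decoupling


def radiation_temperature(z):
    """CMB temperature at redshift z, in kelvin: T_gamma = 2.725 K * (1+z)."""
    return T_CMB_TODAY * (1.0 + z)


def gas_temperature(z, z_dec=Z_DECOUPLE):
    """Toy gas kinetic temperature at redshift z, in kelvin.

    Locked to the radiation at z >= z_dec, then cooling adiabatically
    as (1+z)^2 at lower redshift.
    """
    if z >= z_dec:
        return radiation_temperature(z)
    return radiation_temperature(z_dec) * ((1.0 + z) / (1.0 + z_dec)) ** 2


def max_absorption_contrast(z):
    """(1 - T_gamma/T_S) in the limit T_S -> T_k (strongest coupling to the gas).

    Negative values mean absorption against the CMB.
    """
    return 1.0 - radiation_temperature(z) / gas_temperature(z)


if __name__ == "__main__":
    for z in (1000, 200, 100, 50, 17):
        print(f"z={z:5d}  T_gamma={radiation_temperature(z):8.1f} K  "
              f"T_k={gas_temperature(z):8.2f} K  "
              f"contrast={max_absorption_contrast(z):7.2f}")
```

In this toy model the gas is colder than the radiation everywhere below z = 200, so absorption is possible whenever something couples the spin temperature to the gas.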

This may sound complicated if it is the first time you’ve encountered it, but the physics is wonderfully simple. It’s just the thermal physics of the expanding universe, and the atomic physics of a simple gas composed of hydrogen and helium in known amounts. Different cosmologies specify different expansion histories, but these have only a modest (and calculable) effect on the gas temperature.

Wonderfully, the atomic physics of the 21cm transition is such that it couples to both the radiation and gas temperatures in a way that matters in the early universe. It didn’t have to be that way – most transitions don’t. Perhaps this is fodder for people who worry that the physics of our universe is fine-tuned.

There are two ways in which the spin temperature couples to that of the gas. During the dark ages, the coupling is governed simply by atomic collisions. By cosmic dawn collisions have become rare, but the appearance of the first stars provides UV radiation that drives the Wouthuysen-Field effect. Consequently, we expect to see two absorption troughs: one around z ~ 20 at cosmic dawn, and another at still higher redshift (z ~ 100) during the dark ages.

Observation of this signal has the potential to revolutionize cosmology like detailed observations of the CMB did. The CMB is a snapshot of the universe during the narrow window of recombination at z = 1090. In principle, one can make the same sort of observation with the 21cm line, but at each and every redshift where absorption occurs: z = 16, 17, 18, 19 during cosmic dawn and again at z = 50, 100, 150 during the dark ages, with whatever frequency resolution you can muster. It will be like having the CMB over and over and over again, each redshift providing a snapshot of the universe at a different slice in time.

The information density available from the 21cm signal is in principle quite large. Before we can make use of any of this information, we have to detect it first. Therein lies the rub. This is an incredibly weak signal – we have to be able to detect that the CMB is a little dimmer than it would have been – and we have to do it in the face of much stronger foreground signals from the interstellar medium of our Galaxy and from man-made radio interference here on Earth. Fortunately, though much brighter than the signal we seek, these foregrounds have a different frequency dependence, so it should be possible to sort out, in principle.

Saying a thing can be done and doing it are two different things. This is already a long post, so I will refrain from raving about the technical challenges. Let’s just say it’s Real Hard.

Many experimentalists take that as a challenge, and there are a good number of groups working hard to detect the cosmic 21cm signal. EDGES appears to have done it, reporting the detection of the signal at cosmic dawn in February. Here some weasel words are necessary, as the foreground subtraction is a huge challenge, and we always hope to see independent confirmation of a new signal like this. Those words of caution noted, I have to add that I’ve had the chance to read up on their methods, and I’m really impressed. Unlike the BICEP claim to detect primordial gravitational waves that proved to be bogus after being rushed to press release before refereeing, the EDGES team have done all manner of conceivable cross-checks on their instrumentation and analysis. Nor did they rush to publish, despite the importance of the result. In short, I get exactly the opposite vibe from BICEP, whose foreground subtraction was obviously wrong as soon as I laid eyes on the science paper. If EDGES proves to be wrong, it isn’t for want of doing things right. In the meantime, I think we’re obliged to take their result seriously, and not just hope it goes away (which seems to be the first reaction to the impossible).

Here is what EDGES saw at cosmic dawn:

The unbelievable aspect of the EDGES observation is that it is too strong. Feeble as this signal is (a telescope brightness decrement of half a degree Kelvin), after subtracting foregrounds a thousand times stronger, it is twice as much as is possible in ΛCDM.

I made a quick evaluation of this, and saw that the observed signal could be achieved if the baryon fraction of the universe was high – basically, if cold dark matter did not exist. I have now had the time to make a more careful calculation, and publish some further predictions. The basic result from before stands: the absorption should be stronger without dark matter than with it.

The reason for this is simple. A universe full of dark matter decelerates rapidly at early times, before the acceleration of the cosmological constant kicks in. Without dark matter, the expansion more nearly coasts. Consequently, the universe is relatively larger from 10 < z < 1000, and the CMB photons have to traverse a larger path length to get here. They have to go about twice as far through the same density of hydrogen absorbers. It’s like putting on a second pair of sunglasses.

Quantitatively, the predicted absorption, both with dark matter and without, looks like:

The predicted absorption is consistent with the EDGES observation, within the errors, if there is no dark matter. More importantly, ΛCDM is not consistent with the data, at greater than 95% confidence. At cosmic dawn, I show the maximum possible signal. It could be weaker, depending on the spectra of the UV radiation emitted by the first stars. But it can’t be stronger. Taken at face value, the EDGES result is impossible in ΛCDM. If the observation is corroborated by independent experiments, ΛCDM as we know it will be falsified.

There have already been many papers trying to avoid this obvious conclusion. If we insist on retaining ΛCDM, the only way to modulate the strength of the signal is to alter the ratio of the radiation temperature to the gas temperature. Either we make the radiation “hotter,” or we make the gas cooler. If we allow ourselves this freedom, we can fit any arbitrary signal strength. This is ad hoc in the way that gives ad hoc a bad name.

We do not have this freedom – not really. The radiation temperature is measured in the CMB with great accuracy. Altering this would mess up the genuine success of ΛCDM in fitting the CMB. One could postulate an additional source, something that appears after recombination but before cosmic dawn to emit enough radio power throughout the cosmos to add to the radio brightness that is being absorbed. There is zero reason to expect such sources (what part of ‘cosmic dawn’ was ambiguous?) and no good way to make them at the right time. If they are primordial (as people love to imagine but are loath to provide viable models for) then they’re also present at recombination: anything powerful enough to have the necessary effect will likely screw up the CMB.

Instead of magically increasing the radiation temperature, we might decrease the gas temperature. This seems no more plausible. The evolution of the gas temperature is a straightforward numerical calculation that has been checked by several independent codes. It has to be right at the time of recombination, or again, we mess up the CMB. The suggestions that I have heard seem mostly to invoke interactions between the gas and dark matter that offload some of the thermal energy of the gas into the invisible sink of the dark matter. Given how shy dark matter has been about interacting with normal matter in the laboratory, it seems pretty rich to imagine that it is eager to do so at high redshift. Even advocates of this scenario recognize its many difficulties.

For those who are interested, I cite a number of the scientific papers that attempt these explanations in my new paper. They all seem like earnest attempts to come to terms with what is apparently impossible. Many of these ideas also strike me as a form of magical thinking that stems from ΛCDM groupthink. After all, ΛCDM is so well established, any unexpected signal must be a sign of exciting new physics (on top of the new physics of dark matter and dark energy) rather than an underlying problem with ΛCDM itself.

The more natural interpretation is that the expansion history of the universe deviates from that predicted by ΛCDM. Simply taking away the dark matter gives a result consistent with the data. Though it did not occur to me to make this specific prediction a priori for an experiment that did not yet exist, all the necessary calculations had been done 15 years ago.

Using the same model, I make a genuine a priori prediction for the dark ages. For the specific NoCDM model I built in 2004, the 21cm absorption in the dark ages should again be about twice as strong as expected in ΛCDM. This seems fairly generic, but I know the model is not complete, so I wouldn’t be upset if it were not bang on.

I would be upset if ΛCDM were not bang on. The only thing that drives the signal in the dark ages is atomic scattering. We understand this really well. ΛCDM is now so well constrained by Planck that, if right, the 21cm absorption during the dark ages must follow the red line in the inset in the figure. The amount of uncertainty is not much greater than the thickness of the line. If ΛCDM fails this test, it would be a clear falsification, and a sign that we need to try something completely different.

Unfortunately, detecting the 21cm absorption signal during the dark ages is even harder than it is at cosmic dawn. At these redshifts (z ~ 100), the 21cm line (1420 MHz on your radio dial) is shifted beyond the ionospheric cutoff of the Earth’s atmosphere at 30 MHz. Frequencies this low cannot be observed from the ground. Worse, we have made the Earth itself a bright foreground contaminant of radio frequency interference.
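The arithmetic behind this is just the redshifting of the rest frequency, sketched below in Python. The 30 MHz cutoff is the figure quoted above; the function names are mine, and the real ionospheric cutoff varies with time and location, so treat the boundary as approximate.

```python
REST_FREQ_MHZ = 1420.4          # rest frequency of the 21cm line
IONOSPHERE_CUTOFF_MHZ = 30.0    # cutoff quoted in the text (approximate)


def observed_frequency_mhz(z):
    """Frequency at which 21cm radiation from redshift z arrives today."""
    return REST_FREQ_MHZ / (1.0 + z)


def observable_from_ground(z, cutoff_mhz=IONOSPHERE_CUTOFF_MHZ):
    """True if the redshifted line lands above the ionospheric cutoff."""
    return observed_frequency_mhz(z) > cutoff_mhz


def max_ground_redshift(cutoff_mhz=IONOSPHERE_CUTOFF_MHZ):
    """Highest redshift whose 21cm line still clears the cutoff."""
    return REST_FREQ_MHZ / cutoff_mhz - 1.0


if __name__ == "__main__":
    for z in (17, 50, 100, 150):
        print(f"z={z:4d}  nu_obs={observed_frequency_mhz(z):6.1f} MHz  "
              f"ground-observable={observable_from_ground(z)}")
    print(f"cutoff redshift ~ {max_ground_redshift():.1f}")
```

Cosmic dawn at z ~ 17 lands near 79 MHz, comfortably above the cutoff, while z ~ 100 lands near 14 MHz, well below it.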

Undeterred, there are multiple proposals to measure this signal by placing an antenna in space – in particular, on the far side of the moon, so that the moon shades the instrument from terrestrial radio interference. This is a great idea. The mere detection of the 21cm signal from the dark ages would be an accomplishment on par with the original detection of the CMB. It appears that it might also provide a decisive new way of testing our cosmological model.

There are further tests involving the shape of the 21cm signal, its power spectrum (analogous to the power spectrum of the CMB), how structure grows in the early ages of the universe, and how massive the neutrino is. But that’s enough for now.

Most likely beer. Or a cosmo. That’d be appropriate. I make a good pomegranate cosmo.

*Note that a variety of astronomical observations had established the concordance cosmology before Type Ia supernovae detected cosmic acceleration and well-resolved observations of the CMB found a flat cosmic geometry.

# A brief history of the acceleration discrepancy

As soon as I wrote it, I realized that the title is much more general than anything that can be fit in a blog post. Bekenstein argued long ago that the missing mass problem should instead be called the acceleration discrepancy, because that’s what it is – a discrepancy that occurs in conventional dynamics at a particular acceleration scale. So in that sense, it is the entire history of dark matter. For that, I recommend the excellent book The Dark Matter Problem: A Historical Perspective by Bob Sanders.

Here I mean more specifically my own attempts to empirically constrain the relation between the mass discrepancy and acceleration. Milgrom introduced MOND in 1983, no doubt after a long period of development and refereeing. He anticipated essentially all of what I’m going to describe. But not everyone is eager to accept MOND as a new fundamental theory, and many suffer from a very human tendency to confuse fact and theory. So I have gone out of my way to demonstrate what is empirically true in the data – facts – irrespective of theoretical interpretation (MOND or otherwise).

What is empirically true, and now observationally established beyond a reasonable doubt, is that the mass discrepancy in rotating galaxies correlates with centripetal acceleration. The lower the acceleration, the more dark matter one appears to need. Or, as Bekenstein might have put it, the amplitude of the acceleration discrepancy grows as the acceleration itself declines.

Bob Sanders made the first empirical demonstration that I am aware of that the mass discrepancy correlates with acceleration. In a wide ranging and still relevant 1990 review, he showed that the amplitude of the mass discrepancy correlated with the acceleration at the last measured point of a rotation curve. It did not correlate with radius.

I was completely unaware of this when I became interested in the problem a few years later. I wound up reinventing the very same term – the mass discrepancy, which I defined as the ratio of dynamically measured mass to that visible in baryons: D = Mtot/Mbar. When there is no dark matter, Mtot = Mbar and D = 1.

My first demonstration of this effect was presented at a conference at Rutgers in 1998. This considered the mass discrepancy at every radius and every acceleration within all the galaxies that were available to me at that time. Though messy, as is often the case in extragalactic astronomy, the correlation was clear. Indeed, this was part of a broader review of galaxy formation; the title, abstract, and much of the substance remains relevant today.

I spent much of the following five years collecting more data, refining the analysis, and sweating the details of uncertainties and systematic instrumental effects. In 2004, I published an extended and improved version, now with over 5 dozen galaxies.

Here I’ve used a population synthesis model to estimate the mass-to-light ratio of the stars. This is the only unknown; everything else is measured. Note that the vast majority of galaxies land on top of each other. There are a few that do not, as you can perceive in the parallel sets of points offset from the main body. But that happens in only a few cases, as expected – no population model is perfect. Indeed, this one was surprisingly good, as the vast majority of the individual galaxies are indistinguishable in the pile that defines the main relation.

I explored how the estimation of the stellar mass-to-light ratio affected this mass discrepancy-acceleration relation in great detail in the 2004 paper. The details differ with the choice of estimator, but the bottom line was that the relation persisted for any plausible choice. The relation exists. It is an empirical fact.

At this juncture, further improvement was no longer limited by rotation curve data, which is what we had been working to expand through the early ’00s. Now it was the stellar mass. The measurement of stellar mass was based on optical measurements of the luminosity distribution of stars in galaxies. These are perfectly fine data, but it is hard to map the starlight that we measured to the stellar mass that we need for this relation. The population synthesis models were good, but they weren’t good enough to avoid the occasional outlier, as can be seen in the figure above.

One thing the models all agreed on (before they didn’t, then they did again) was that the near-infrared would provide a more robust way of mapping stellar mass than the optical bands we had been using up till then. This was the clear way forward, and perhaps the only hope for improving the data further. Fortunately, technology was keeping pace. Around this time, I became involved in helping the effort to develop the NEWFIRM near-infrared camera for the national observatories, and NASA had just launched the Spitzer space telescope. These were the right tools in the right place at the right time. Ultimately, the high accuracy of the deep images obtained from the dark of space by Spitzer at 3.6 microns was to prove most valuable.

Jim Schombert and I spent much of the following decade observing in the near-infrared. Many other observers were doing this as well, filling the Spitzer archive with useful data while we concentrated on our own list of low surface brightness galaxies. This paragraph cannot suffice to convey the long term effort and enormity of this program. But by the mid-teens, we had accumulated data for hundreds of galaxies, including all those for which we also had rotation curves and HI observations. The latter had been obtained over the course of decades by an entire independent community of radio observers, and represent an integrated effort that dwarfs our own.

On top of the observational effort, Jim had been busy building updated stellar population models. We have a sophisticated understanding of how stars work, but things can get complicated when you put billions of them together. Nevertheless, Jim’s work – and that of a number of independent workers – indicated that the relation between Spitzer’s 3.6 micron luminosity measurements and stellar mass should be remarkably simple – basically just a constant conversion factor for nearly all star forming galaxies like those in our sample.

Things came together when Federico Lelli joined Case Western as a postdoc in 2014. He had completed his Ph.D. in the rich tradition of radio astronomy, and was the perfect person to move the project forward. After a couple more years of effort, curating the rotation curve data and building mass models from the Spitzer data, we were in the position to build the relation for over a dozen dozen galaxies. With all the hard work done, making the plot was a matter of running a pre-prepared computer script.

Federico ran his script. The plot appeared on his screen. In a stunned voice, he called me into his office. We had expected an improvement with the Spitzer data – hence the decade of work – but we had also expected there to be a few outliers. There weren’t. Any.

All. the. galaxies. fell. right. on. top. of. each. other.

This plot differs from those above because we had decided to plot the measured acceleration against that predicted by the observed baryons so that the two axes would be independent. The discrepancy, defined as the ratio, depended on both. D is essentially the ratio of the y-axis to the x-axis of this last plot, dividing out the unity slope where D = 1.
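For readers who want to see the bookkeeping, here is a minimal Python sketch of how the two axes and the discrepancy relate at a single radius. It assumes circular orbits, so that each acceleration is V^2/R, and the function names and example velocities are hypothetical illustrations, not values from any galaxy in the sample.

```python
KPC_IN_M = 3.0857e19   # metres per kiloparsec
KM_IN_M = 1.0e3        # metres per kilometre


def centripetal_acceleration(v_kms, r_kpc):
    """Centripetal acceleration V^2/R in m/s^2, from V in km/s and R in kpc."""
    v = v_kms * KM_IN_M
    r = r_kpc * KPC_IN_M
    return v * v / r


def mass_discrepancy(v_obs_kms, v_bar_kms, r_kpc):
    """Return (g_obs, g_bar, D) at one radius.

    g_obs comes from the measured rotation speed, g_bar from the speed
    the observed baryons alone would produce; D = g_obs / g_bar.
    """
    g_obs = centripetal_acceleration(v_obs_kms, r_kpc)
    g_bar = centripetal_acceleration(v_bar_kms, r_kpc)
    return g_obs, g_bar, g_obs / g_bar


if __name__ == "__main__":
    # Hypothetical point: 100 km/s observed, 50 km/s from baryons, at 10 kpc.
    g_obs, g_bar, d = mass_discrepancy(100.0, 50.0, 10.0)
    print(f"g_obs = {g_obs:.2e} m/s^2, g_bar = {g_bar:.2e} m/s^2, D = {d:.1f}")
```

Because the radius cancels in the ratio, D reduces to (V_obs/V_bar)^2 at each point; plotting g_obs against g_bar instead keeps the measured and predicted quantities on separate axes.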

This was one of the most satisfactory moments of my long career, in which I have been fortunate to have had many satisfactory moments. It is right up there with the eureka moment I had that finally broke the long-standing loggerhead about the role of selection effects in Freeman’s Law. (Young astronomers – never heard of Freeman’s Law? You’re welcome.) Or the epiphany that, gee, maybe what we’re calling dark matter could be a proxy for something deeper. It was also gratifying that it was quickly recognized as such, with many of the colleagues I first presented it to saying it was the highlight of the conference where it was first unveiled.

Regardless of the ultimate interpretation of the radial acceleration relation, it clearly exists in the data for rotating galaxies. The discrepancy appears at a characteristic acceleration scale, g = 1.2 x 10^-10 m/s/s. That number is in the data. Why? is a deeply profound question.

It isn’t just that the acceleration scale is somehow fundamental. The amplitude of the discrepancy depends systematically on the acceleration. Above the critical scale, all is well: no need for dark matter. Below it, the amplitude of the discrepancy – the amount of dark matter we infer – increases systematically. The lower the acceleration, the more dark matter one infers.

The relation for rotating galaxies has no detectable scatter – it is a near-perfect relation. Whether this persists, and holds for other systems, is the interesting outstanding question. It appears, for example, that dwarf spheroidal galaxies may follow a slightly different relation. However, the emphasis here is on slightly. Very few of these data pass the same quality criteria that the SPARC data plotted above do. It’s like comparing mud pies with diamonds.

Whether the scatter in the radial acceleration relation is zero or merely very tiny is important. That’s the difference between a new fundamental force law (like MOND) and a merely spectacular galaxy scaling relation. For this reason, it seems to be controversial. It shouldn’t be: I was surprised at how tight the relation was myself. But I don’t get to report that there is lots of scatter when there isn’t. To do so would be profoundly unscientific, regardless of the wants of the crowd.

Of course, science is hard. If you don’t do everything right, from the measurements to the mass models to the stellar populations, you’ll find some scatter where perhaps there isn’t any. There are so many creative ways to screw up that I’m sure people will continue to find them. Myself, I prefer to look forward: I see no need to continuously re-establish what has been repeatedly demonstrated in the history briefly outlined above.