Dwarf Satellite Galaxies. III. The dwarfs of Andromeda

Like the Milky Way, our nearest giant neighbor, Andromeda (aka M31), has several dozen dwarf satellite galaxies. A few of these were known and had measured velocity dispersions at the time of my work with Joe Wolf, as discussed previously. Also like the Milky Way, the number of known objects has grown rapidly in recent years – thanks in this case largely to the PAndAS survey.

PAndAS imaged the area around M31 and M33, finding many individual red giant stars. These trace out the debris from interactions and mergers as small dwarfs are disrupted and consumed by their giant host. They also pointed up the existence of previously unknown dwarf satellites.

The PAndAS survey field. Dwarf satellites are circled.

As the PAndAS survey started reporting the discovery of new dwarf satellites around Andromeda, it occurred to me that this provided the opportunity to make genuine a priori predictions. These are the gold standard of the scientific method. We could use the observed luminosity and size of the newly discovered dwarfs to predict their velocity dispersions.

I tried to do this for both ΛCDM and MOND. I will not discuss the ΛCDM case much, because it can’t really be done. But it is worth understanding why this is.

In ΛCDM, the velocity dispersion is determined by the dark matter halo. This has only a tenuous connection to the observed stars, so just knowing how big and bright a dwarf is doesn’t provide much predictive power about the halo. This can be seen from this figure by Tollerud et al. (2011):

Virial mass of the dark matter halo as a function of galaxy luminosity. Dwarf satellites reside in the wide colored band of low luminosities.

This graph is obtained by relating the number density of galaxies (an observed quantity) to that of the dark matter halos in which they reside (a theoretical construct). It is highly non-linear, deviating strongly from the one-to-one line we expected early on. There is no reason to expect this particular relation; it is imposed on us by the fact that the observed luminosity function of galaxies is rather flat while the predicted halo mass function is steep. Nowadays, this is usually called the missing satellite problem, but this is a misnomer as it pervades the field.

Addressing the missing satellites problem would be another long post, so let’s just accept that the relation between mass and light has to follow something like that illustrated above. If a dwarf galaxy has a luminosity of a million suns, one can read off the graph that it should live in a dark halo with a mass of about 10¹⁰ M☉. One could use this to predict the velocity dispersion, but not very precisely, because there’s a big range corresponding to that luminosity (the bands in the figure). It could be as much as 10¹¹ M☉ or as little as 10⁹ M☉. This corresponds to a wide range of velocity dispersions. This wide range is unavoidable because of the difference in the luminosity function and halo mass function. Small variations in one lead to big variations in the other, and some scatter in dark halo properties is unavoidable.
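To get a feel for how wide that range is, here is a rough sketch that converts a halo mass into a characteristic virial velocity. It assumes the common M200 convention (mean density 200 times critical) and H0 = 70 km/s/Mpc, neither of which comes from the figure, and it returns a halo circular velocity rather than a stellar velocity dispersion. The function name is hypothetical.

```python
G_MSUN = 1.32712e20   # G times one solar mass, in m^3 s^-2
MPC = 3.0857e22       # metres per megaparsec

def v200_kms(m200_msun, h0_kms_mpc=70.0):
    """Virial velocity (km/s) of a halo of mass M200, from V^3 = 10 G H0 M200.

    Assumes the halo is defined at 200 times the critical density;
    h0_kms_mpc is an assumed Hubble constant, not a value from the post.
    """
    h0_si = h0_kms_mpc * 1.0e3 / MPC              # Hubble constant in s^-1
    v_cubed = 10.0 * G_MSUN * m200_msun * h0_si   # m^3 s^-3
    return v_cubed ** (1.0 / 3.0) / 1.0e3

# The range of halo masses quoted for a dwarf of a million solar luminosities.
for mass in (1.0e9, 1.0e10, 1.0e11):
    print(f"M200 = {mass:.0e} Msun -> V200 ~ {v200_kms(mass):.0f} km/s")
```

Because velocity scales only as the cube root of mass, a factor of 100 in halo mass is a factor of about 4.6 in velocity. That is still far too broad to count as a prediction for any individual dwarf.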

Consequently, we only have a vague range of expected velocity dispersions in ΛCDM. In practice, we never make this prediction. Instead, we compare the observed velocity dispersion to the luminosity and say “gee, this galaxy has a lot of dark matter” or “hey, this one doesn’t have much dark matter.” There’s no rigorously testable prior.

In MOND, what you see is what you get. The velocity dispersion has to follow from the observed stellar mass. This is straightforward for isolated galaxies: M* ∝ σ⁴ – this is essentially the equivalent of the Tully-Fisher relation for pressure supported systems. If we can estimate the stellar mass from the observed luminosity, the predicted velocity dispersion follows.

Many dwarf satellites are not isolated in the MONDian sense: they are subject to the external field effect (EFE) from their giant hosts. The over-under for whether the EFE applies is the point when the internal acceleration from all the stars of the dwarf on each other is equal to the external acceleration from orbiting the giant host. The amplitude of the discrepancy in MOND depends on how low the total acceleration is relative to the critical scale a₀. The external field in effect adds some acceleration that wouldn’t otherwise be there, making the discrepancy less than it would be for an isolated object. This means that two otherwise identical dwarfs may be predicted to have different velocity dispersions if they are or are not subject to the EFE. This is a unique prediction of MOND that has no analog in ΛCDM.

It is straightforward to derive the equation to predict velocity dispersions in the extreme limits of isolated (a_ex ≪ a_in < a₀) or EFE dominated (a_in ≪ a_ex < a₀) objects. In reality, there are many objects for which a_in ≈ a_ex, and no simple formula applies. In practice, we apply the formula that more nearly applies, and pray that this approximation is good enough.

There are many other assumptions and approximations that must be made in any theory: that an object is spherical, isotropic, and in dynamical equilibrium. All of these must fail at some level, but it is the last one that is the most serious concern. In the case of the EFE, one must also make the approximation that the object is in equilibrium at the current level of the external field. That is never true, as both the amplitude and the vector of the external field vary as a dwarf orbits its host. But it might be an adequate approximation if this variation is slow. In the case of a circular orbit, only the vector varies. In general the orbits are not known, so we make the instantaneous approximation and once again pray that it is good enough. There is a fairly narrow window between where the EFE becomes important and where we slip into the regime of tidal disruption, but let’s plow ahead and see how far we can get, bearing in mind that the EFE is a dynamical variable of which we only have a snapshot.

To predict the velocity dispersion in the isolated case, all we need to know is the luminosity and a stellar mass-to-light ratio. Assuming the dwarfs of Andromeda to be old stellar populations, I adopted a V-band mass-to-light ratio of 2 give or take a factor of 2. That usually dominates the uncertainty, though the error in the distance can sometimes impact the luminosity at a level that impacts the prediction.
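As a minimal sketch of the isolated case, the snippet below uses the standard deep-MOND result for a spherical, isotropic system, σ⁴ = (4/81) G M a₀. The value a₀ = 1.2 × 10⁻¹⁰ m/s² is the commonly quoted one and is an assumption here, as are the function names and the illustrative luminosity of 2 × 10⁵ L☉, which is not a measurement of any particular dwarf.

```python
G_MSUN = 1.32712e20   # G times one solar mass, in m^3 s^-2
A0 = 1.2e-10          # MOND acceleration scale in m s^-2 (assumed value)

def sigma_isolated_kms(l_v_lsun, ml_ratio=2.0):
    """Predicted line-of-sight dispersion (km/s) for an isolated dwarf.

    l_v_lsun : V-band luminosity in solar units
    ml_ratio : stellar mass-to-light ratio in solar units
    """
    stellar_mass = ml_ratio * l_v_lsun                    # solar masses
    sigma4 = (4.0 / 81.0) * A0 * G_MSUN * stellar_mass    # m^4 s^-4
    return sigma4 ** 0.25 / 1.0e3

def sigma_isolated_range(l_v_lsun, ml_ratio=2.0, factor=2.0):
    """Low, central, and high predictions for M/L varied by the given factor."""
    ratios = (ml_ratio / factor, ml_ratio, ml_ratio * factor)
    return tuple(sigma_isolated_kms(l_v_lsun, ml) for ml in ratios)

# Hypothetical dwarf with L_V = 2e5 Lsun (illustrative number only).
low, mid, high = sigma_isolated_range(2.0e5)
print(f"sigma = {mid:.1f} km/s (range {low:.1f} to {high:.1f} km/s)")
```

Because σ depends on the fourth root of the mass, a factor of 2 uncertainty in the mass-to-light ratio moves the prediction by only about 19 percent. That is why a prediction can be sharp even when the stellar mass is not.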

To predict the velocity dispersion in the EFE case, we again need the stellar mass, but now also need to know the size of the stellar system and the intensity of the external field to which it is subject. The latter depends on the mass of the host galaxy and the distance from it to the dwarf. This latter quantity is somewhat fraught: it is straightforward to measure the projected distance on the sky, but we need the 3D distance – how far in front or behind each dwarf is as well as its projected distance from the host. This is often a considerable contributor to the error budget. Indeed, some dwarfs may be inferred to be in the EFE regime for the low end of the range of adopted stellar mass-to-light ratio, and the isolated regime for the high end.
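The ingredients listed above can be sketched as follows. The 3D separation is plain geometry (law of cosines). Everything else is an assumption of this sketch rather than a quotation of the published formulas: the external field is taken as V²/D for a flat host rotation curve, the internal field as 3σ²/r, and the EFE-limit dispersion as a Newtonian estimate with G boosted by a₀/g_ex. The input numbers describe a hypothetical dwarf, not a real one.

```python
import math

G_MSUN = 1.32712e20   # G times one solar mass, in m^3 s^-2
A0 = 1.2e-10          # MOND acceleration scale in m s^-2 (assumed value)
KPC = 3.0857e19       # metres per kiloparsec

def separation_3d_kpc(d_host_kpc, d_dwarf_kpc, angle_deg):
    """Host-dwarf separation from the two distances and their angle on the sky."""
    theta = math.radians(angle_deg)
    sq = (d_host_kpc ** 2 + d_dwarf_kpc ** 2
          - 2.0 * d_host_kpc * d_dwarf_kpc * math.cos(theta))
    return math.sqrt(max(sq, 0.0))

def external_field(v_host_kms, sep_kpc):
    """g_ex = V^2 / D in m s^-2, assuming a flat host rotation curve."""
    return (v_host_kms * 1.0e3) ** 2 / (sep_kpc * KPC)

def internal_field(sigma_iso_kms, r_half_kpc):
    """Rough internal acceleration, g_in ~ 3 sigma^2 / r (assumed estimator)."""
    return 3.0 * (sigma_iso_kms * 1.0e3) ** 2 / (r_half_kpc * KPC)

def sigma_efe_kms(mass_msun, r_half_kpc, g_ex):
    """EFE-limit dispersion (km/s): Newtonian form with G -> G * a0 / g_ex."""
    boosted_gm = (A0 / g_ex) * G_MSUN * mass_msun
    return math.sqrt(boosted_gm / (3.0 * r_half_kpc * KPC)) / 1.0e3

def regime(g_in, g_ex):
    """Pick whichever limiting formula more nearly applies."""
    return "isolated" if g_in > g_ex else "EFE"

# Hypothetical dwarf: 4e5 Msun of stars, 0.5 kpc half-light radius,
# 20 kpc behind a host at 750 kpc and 5 degrees away on the sky.
sep = separation_3d_kpc(750.0, 770.0, 5.0)
g_ex = external_field(230.0, sep)      # 230 km/s is an assumed host speed
g_in = internal_field(4.2, 0.5)        # 4.2 km/s is an assumed isolated sigma
print(f"D = {sep:.0f} kpc, g_ex/a0 = {g_ex / A0:.2f}, g_in/a0 = {g_in / A0:.3f}")
print(f"regime: {regime(g_in, g_ex)}, sigma_efe ~ {sigma_efe_kms(4.0e5, 0.5, g_ex):.1f} km/s")
```

Since the line-of-sight distance enters through D, an error of a few tens of kpc along the line of sight can change g_ex substantially. That is how the distance uncertainty ends up as a large part of the error budget.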

In this fashion, we predicted velocity dispersions for the dwarfs of Andromeda. We in this case were Milgrom and myself. I had never collaborated with him before, and prefer to remain independent. But I also wanted to be sure I got the details described above right. Though it wasn’t much work to make the predictions once the preliminaries were established, it was time-consuming to collect and vet the data. As we were writing the paper, velocity dispersion measurements started to appear. People like Michelle Collins, Erik Tollerud, and Nicolas Martin were making follow-up observations, and publishing velocity dispersions for the objects we were making predictions for. That was great, but they were too good – they were observing and publishing faster than we could write!

Nevertheless, we managed to make and publish a priori predictions for 10 dwarfs before any observational measurements were published. We also made blind predictions for the other known dwarfs of Andromeda, and checked the predicted velocity dispersions against all measurements that we could find in the literature. Many of these predictions were quickly tested by on-going programs (i.e., people were out to measure velocity dispersions, whether we predicted them or not). Enough data rolled in that we were soon able to write a follow-up paper testing our predictions.

Nailed it. Good data were soon available to test the predictions for 8 of the 10* a priori cases. All 8 were consistent with our predictions. I was particularly struck by the case of And XXVIII, which I had called out as perhaps the best test. It was isolated, so the messiness of the EFE didn’t apply, and the uncertainties were low. Moreover, the predicted velocity dispersion was low – a good deal lower than broadly expected in ΛCDM: 4.3 km/s, with an uncertainty just under 1 km/s. Two independent observations were subsequently reported. One found 4.9 ± 1.6 km/s, the other 6.6 ± 2.1 km/s, both in good agreement within the uncertainties.
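What "good agreement within the uncertainties" means for And XXVIII can be made concrete with a few lines. This is a simple sketch that treats the errors as Gaussian and independent and adds them in quadrature. The prediction uncertainty is rounded up to 1.0 km/s here, since the value is only given as "just under 1 km/s".

```python
import math

def tension_sigma(pred, pred_err, obs, obs_err):
    """Difference between prediction and observation in units of the combined error."""
    return abs(obs - pred) / math.hypot(pred_err, obs_err)

prediction, prediction_err = 4.3, 1.0       # km/s; error rounded up (assumption)
measurements = [(4.9, 1.6), (6.6, 2.1)]     # km/s, the two independent observations

for obs, obs_err in measurements:
    t = tension_sigma(prediction, prediction_err, obs, obs_err)
    print(f"{obs} +/- {obs_err} km/s differs from the prediction by {t:.2f} sigma")
```

Both measurements land within one combined standard deviation of the prediction.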

We made further predictions in the second paper as people had continued to discover new dwarfs. These also came true. Here is a summary plot for all of the dwarfs of Andromeda:

The velocity dispersions of the dwarf satellites of Andromeda. Each numbered box corresponds to one dwarf (x=1 is for And I and so on). Measured velocity dispersions have a number next to them that is the number of stars on which the measurement is based. MOND predictions are circles: green if isolated, open if the EFE applies. Points appear within each box in the order they appeared in the literature, from left to right. The vast majority of Andromeda’s dwarfs are consistent with MOND (large green circles). Two cases are ambiguous (large yellow circles), having velocity dispersions based on only a few stars. Only And V appears to be problematic (large red circle).

MOND works well for And I, And II, And III, And VI, And VII, And IX, And X, And XI, And XII, And XIII, And XIV, And XV, And XVI, And XVII, And XVIII, And XIX, And XX, And XXI, And XXII, And XXIII, And XXIV, And XXV, And XXVIII, And XXIX, And XXXI, And XXXII, and And XXXIII. There is one problematic case: And V. I don’t know what is going on there, but note that systematic errors frequently happen in astronomy. It’d be strange if there weren’t at least one goofy case.

Nevertheless, the failure of And V could be construed as a falsification of MOND. It ought to work in every single case. But recall the discussion of assumptions and uncertainties above. Is falsification really the story these data tell?

We do have experience with various systematic errors. For example, we predicted that the isolated dwarf spheroidal Cetus should have a velocity dispersion in MOND of 8.2 km/s. There was already a published measurement of 17 ± 2 km/s, so we reported that MOND was wrong in this case by over 3σ. Or at least we started to do so. Right before we submitted that paper, a new measurement appeared: 8.3 ± 1 km/s. This is an example of how the data can sometimes change by rather more than the formal error bars suggest is possible. In this case, I suspect the original observations lacked the spectral resolution to resolve the velocity dispersion. At any rate, the new measurement (8.3 km/s) was somewhat more consistent with our prediction (8.2 km/s).

The same predictions cannot even be made in ΛCDM. The velocity data can always be fit once they are in hand. But there is no agreed method to predict the velocity dispersion of a dwarf from its observed luminosity. As discussed above, this should not even be possible: there is too much scatter in the halo mass-stellar mass relation at these low masses.

An unsung predictive success of MOND absent from the graph above is And IV. When And IV was discovered in the general direction of Andromeda, it was assumed to be a new dwarf satellite – hence the name. Milgrom looked at the velocities reported for this object, and said it had to be a background galaxy. No way it could be a dwarf satellite – at least not in MOND. I see no reason why it couldn’t have been in ΛCDM. It is absent from the graph above, because it was subsequently confirmed to be much farther away (7.2 Mpc vs. 750 kpc for Andromeda).

The box for And XXVII is empty because this system is manifestly out of equilibrium. It is more of a stellar stream than a dwarf, appearing as a smear in the PAndAS image rather than as a self-contained dwarf. I do not recall what the story with the other missing object (And VIII) is.

While writing the follow-up paper, I also noticed that there were a number of Andromeda dwarfs that were photometrically indistinguishable: basically the same in terms of size and stellar mass. But some were isolated while others were subject to the EFE. MOND predicts that the EFE cases should have lower velocity dispersion than the isolated equivalents.

The velocity dispersions of the dwarfs of Andromeda, highlighting photometrically matched pairs – dwarfs that should be indistinguishable, but aren’t because of the EFE.

And XXVIII (isolated) has a higher velocity dispersion than its near-twin And XVII (EFE). The same effect might be acting in And XVIII (isolated) and And XXV (EFE). This is clear if we accept the higher velocity dispersion measurement for And XVIII, but an independent measurement begs to differ. The former has more stars, so is probably more reliable, but we should be cautious. The effect is not clear in And XVI (isolated) and And XXI (EFE), but the difference in the prediction is small and the uncertainties are large.

An aggressive person might argue that the pairs of dwarfs constitute a positive detection of the EFE. I don’t think the data for the matched pairs warrant that, at least not yet. On the other hand, the appropriate use of the EFE was essential to all the predictions, not just the matched pairs.

The positive detection of the EFE is important, as it is a unique prediction of MOND. I see no way to tune ΛCDM galaxy simulations to mimic this effect. Of course, there was a very recent time when it seemed impossible for them to mimic the isolated predictions of MOND. They claim to have come a long way in that regard.

But that’s what we’re stuck with: tuning ΛCDM to make it look like MOND. This is why a priori predictions are important. There is ample flexibility to explain just about anything with dark matter. What we can’t seem to do is predict the same things that MOND successfully predicts… predictions that are both quantitative and very specific. We’re not arguing that dwarfs in general live in ~15 or 30 km/s halos, as we must in ΛCDM. In MOND we can say this dwarf will have this velocity dispersion and that dwarf will have that velocity dispersion. We can distinguish between 4.9 and 7.3 km/s. And we can do it over and over and over. I see no way to do the equivalent in ΛCDM, just as I see no way to explain the acoustic power spectrum of the CMB in MOND.

This is not to say there are no problematic cases for MOND. Read, Walker, & Steger have recently highlighted the matched pair of Draco and Carina as an issue. And they are – though here I already have reason to suspect Draco is out of equilibrium, which makes it challenging to analyze. Whether it is actually out of equilibrium or not is a separate question.

I am not thrilled that we are obliged to invoke non-equilibrium effects in both theories. But there is a difference. Brada & Milgrom provided a quantitative criterion to indicate when this was an issue before I ran into the problem. In ΛCDM, the low velocity dispersions of objects like And XIX, XXI, XXV and Crater 2 came as a complete surprise despite having been predicted by MOND. Tidal disruption was only invoked after the fact – and in an ad hoc fashion. There is no way to know in advance which dwarfs are affected, as there is no criterion equivalent to that of Brada & Milgrom. We just say “gee, that’s a low velocity dispersion. Must have been disrupted.” That might be true, but it gives no explanation for why MOND predicted it in the first place – which is to say, it isn’t really an explanation at all.

What I still do not understand is why MOND gets any predictions right if ΛCDM is the universe we live in, let alone so many. Shouldn’t happen. Makes no sense.

If this doesn’t confuse you, you are not thinking clearly.

*The other two dwarfs were also measured, but with only 4 stars in one and 6 in the other. These are too few for a meaningful velocity dispersion measurement.


Dwarf Satellite Galaxies. II. Non-equilibrium effects in ultrafaint dwarfs

I have been wanting to write about dwarf satellites for a while, but there is so much to tell that I didn’t think it would fit in one post. I was correct. Indeed, it was worse than I thought, because my own experience with low surface brightness (LSB) galaxies in the field is a necessary part of the context for my perspective on the dwarf satellites of the Local Group. These are very different beasts – satellites are pressure supported, gas poor objects in orbit around giant hosts, while field LSB galaxies are rotating, gas rich galaxies that are among the most isolated known. However, so far as their dynamics are concerned, they are linked by their low surface density.

Where we left off with the dwarf satellites, circa 2000, Ursa Minor and Draco remained problematic for MOND, but the formal significance of these problems was not great. Fornax, which had seemed more problematic, was actually a predictive success: MOND returned a low mass-to-light ratio for Fornax because it was full of young stars. The other known satellites, Carina, Leo I, Leo II, Sculptor, and Sextans, were all consistent with MOND.

The Sloan Digital Sky Survey resulted in an explosion in the number of satellite galaxies discovered around the Milky Way. These were both fainter and lower surface brightness than the classical dwarfs named above. Indeed, they were often invisible as objects in their own right, being recognized instead as groupings of individual stars that shared the same position in space and – critically – velocity. They weren’t just in the same place, they were orbiting the Milky Way together. To give short shrift to a long story, these came to be known as ultrafaint dwarfs.

Ultrafaint dwarf satellites have fewer than 100,000 stars. That’s tiny for a stellar system. Sometimes they have only a few hundred. Most of those stars are too faint to see directly. Their existence is inferred from a handful of red giants that are actually observed. Where there are a few red giants orbiting together, there must be a source population of fainter stars. This is a good argument, and it is likely true in most cases. But the statistics we usually rely on become dodgy for such small numbers of stars: some of the ultrafaints that have been reported in the literature are probably false positives. I have no strong opinion on how many that might be, but I’d be really surprised if it were zero.

Nevertheless, assuming the ultrafaint dwarfs are self-bound galaxies, we can ask the same questions as before. I was encouraged to do this by Joe Wolf, a clever grad student at UC Irvine. He had a new mass estimator for pressure supported dwarfs that we decided to apply to this problem. We used the Baryonic Tully-Fisher Relation (BTFR) as a reference, and looked at it every which-way. Most of the text is about conventional effects in the dark matter picture, and I encourage everyone to read the full paper. Here I’m gonna skip to the part about MOND, because that part seems to have been overlooked in more recent commentary on the subject.

For starters, we found that the classical dwarfs fall along the extrapolation of the BTFR, but the ultrafaint dwarfs deviate from it.

Fig. 1 from McGaugh & Wolf (2010, annotated). The BTFR defined by rotating galaxies (gray points) extrapolates well to the scale of the dwarf satellites of the Local Group (blue points are the classical dwarf satellites of the Milky Way; red points are satellites of Andromeda) but not to the ultrafaint dwarfs (green points). Two of the classical dwarfs also fall off of the BTFR: Draco and Ursa Minor.

The deviation is not subtle, at least not in terms of mass. The ultrafaints had characteristic circular velocities typical of systems 100 times their mass! But the BTFR is steep. In terms of velocity, the deviation is the difference between the 8 km/s typically observed, and the ~3 km/s needed to put them on the line. There are a large number of systematic errors that might arise, and all act to inflate the characteristic velocity. See the discussion in the paper if you’re curious about such effects; for our purposes here we will assume that the data cannot simply be dismissed as the result of systematic errors, though one should bear in mind that they probably play a role at some level.
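The arithmetic behind "100 times the mass, but only a factor of a few in velocity" is just the steepness of the relation. Here is a small sketch, assuming a BTFR slope of exactly 4 (the MOND value, M ∝ V⁴) and hypothetical function names:

```python
def velocity_ratio_for_mass_ratio(mass_ratio, slope=4.0):
    """Velocity ratio implied by a mass ratio on a relation M ~ V**slope."""
    return mass_ratio ** (1.0 / slope)

def mass_ratio_for_velocity_ratio(velocity_ratio, slope=4.0):
    """Mass ratio implied by a velocity ratio on a relation M ~ V**slope."""
    return velocity_ratio ** slope

observed_kms = 8.0   # typical observed value quoted above
on_line_kms = observed_kms / velocity_ratio_for_mass_ratio(100.0)
print(f"velocity on the BTFR for the same mass: {on_line_kms:.1f} km/s")
print(f"mass ratio implied by 8 vs 3 km/s: {mass_ratio_for_velocity_ratio(8.0 / 3.0):.0f}")
```

A factor of 100 in mass is a factor of about 3.2 in velocity, so the 8 km/s systems would need to sit near 2.5 km/s to be on the line. Taking the rounded ~3 km/s instead corresponds to a factor of about 50 in mass, which shows how sensitive the mass discrepancy is to small shifts in velocity.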

Taken at face value, the ultrafaint dwarfs are a huge problem for MOND. An isolated system should fall exactly on the BTFR. These are not isolated systems, being very close to the Milky Way, so the external field effect (EFE) can cause deviations from the BTFR. However, these are predicted to make the characteristic internal velocities lower than the isolated case. This may in fact be relevant for the red points that deviate a bit in the plot above, but we’ll return to that at some future point. The ultrafaints all deviate to velocities that are too high, the opposite of what the EFE predicts.

The ultrafaints falsify MOND! When I saw this, all my original confirmation bias came flooding back. I had pursued this stupid theory to ever lower surface brightness and luminosity. Finally, I had found where it broke. I felt like Darth Vader in the original Star Wars:

I have you now!

The first draft of my paper with Joe included a resounding renunciation of MOND. No way could it escape this!


I had this nagging feeling I was missing something. Darth should have looked over his shoulder. Should I?

Surely I had missed nothing. Many people are unaware of the EFE, just as we had been unaware that Fornax contained young stars. But not me! I knew all that. Surely this was it.

Nevertheless, the nagging feeling persisted. One part of it was sociological: if I said MOND was dead, it would be well and truly buried. But did it deserve to be? The scientific part of the nagging feeling was that maybe there had been some paper that addressed this, maybe a decade before… perhaps I’d better double check.

Indeed, Brada & Milgrom (2000) had run numerical simulations of dwarf satellites orbiting around giant hosts. MOND is a nonlinear dynamical theory; not everything can be approximated analytically. When a dwarf satellite is close to its giant host, the external acceleration of the dwarf falling towards its host can exceed the internal acceleration of the stars in the dwarf orbiting each other – hence the EFE. But the EFE is not a static thing; it varies as the dwarf orbits about, becoming stronger on closer approach. At some point, this variation becomes too fast for the dwarf to remain in equilibrium. This is important, because the assumption of dynamical equilibrium underpins all these arguments. Without it, it is hard to know what to expect short of numerically simulating each individual dwarf. There is no reason to expect them to remain on the equilibrium BTFR.

Brada & Milgrom suggested a measure to gauge the extent to which a dwarf might be out of equilibrium. It boils down to a matter of timescales. If the stars inside the dwarf have time to adjust to the changing external field, a quasi-static EFE approximation might suffice. So the figure of merit becomes the ratio of internal orbits per external orbit. If the stars inside a dwarf are swarming around many times for every time it completes an orbit around the host, then they have time to adjust. If the orbit of the dwarf around the host is as quick as the internal motions of the stars within the dwarf, not so much. At some point, a satellite becomes a collection of associated stars orbiting the host rather than a self-bound object in its own right.
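The timescale argument can be illustrated with a deliberately simplified ratio. This is not the Brada & Milgrom expression. It assumes circular orbits both inside the dwarf and around the host, and uses the velocity dispersion as a stand-in for the internal orbital speed. The function name and both example dwarfs are hypothetical.

```python
def internal_orbits_per_host_orbit(r_half_pc, sigma_kms,
                                   host_distance_kpc, host_speed_kms):
    """Approximate number of internal orbits completed per orbit around the host.

    Both periods are 2*pi*radius/speed, so the 2*pi and the unit conversion
    (pc per km/s) cancel in the ratio.
    """
    t_internal = r_half_pc / sigma_kms                        # pc / (km/s)
    t_external = host_distance_kpc * 1.0e3 / host_speed_kms   # pc / (km/s)
    return t_external / t_internal

# Compact dwarf far from its host: stars have plenty of time to adjust.
compact = internal_orbits_per_host_orbit(30.0, 5.0, 40.0, 200.0)
# Diffuse dwarf close to its host: internal and external timescales are comparable.
diffuse = internal_orbits_per_host_orbit(300.0, 3.0, 30.0, 200.0)
print(f"compact: ~{compact:.0f} internal orbits per host orbit")
print(f"diffuse: ~{diffuse:.1f} internal orbits per host orbit")
```

The first case manages a few tens of internal orbits per trip around the host, so a quasi-static treatment of the external field is plausible. The second barely completes one, which is the situation where equilibrium reasoning should not be trusted.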

Deviations from the BTFR (left) and the isophotal shape of dwarfs (right) as a function of the number of internal orbits a star at the half-light radius makes for every orbit a dwarf makes around its giant host (Fig. 7 of McGaugh & Wolf 2010).

Brada & Milgrom provide the formula to compute the ratio of orbits, shown in the figure above. The smaller the ratio, the less chance an object has to adjust, and the more subject it is to departures from equilibrium. Remarkably, the amplitude of deviation from the BTFR – the problem I could not understand initially – correlates with the ratio of orbits. The more susceptible a dwarf is to disequilibrium effects, the farther it deviates from the BTFR.

This completely inverted the MOND interpretation. Instead of falsifying MOND, the data now appeared to corroborate the non-equilibrium prediction of Brada & Milgrom. The stronger the external influence, the more a dwarf deviated from the equilibrium expectation. In conventional terms, it appeared that the ultrafaints were subject to tidal stirring: their internal velocities were being pumped up by external influences. Indeed, the originally problematic cases, Draco and Ursa Minor, fall among the ultrafaint dwarfs in these terms. They can’t be in equilibrium in MOND.

If the ultrafaints are out of equilibrium, they might show some independent evidence of this. Stars should leak out, distorting the shape of the dwarf and forming tidal streams. Can we see this?

A definite maybe:

The shapes of some ultrafaint dwarfs. These objects are so diffuse that they are invisible on the sky; their shape is illustrated by contours or heavily smoothed grayscale pseudo-images.

The dwarfs that are more subject to external influence tend to be more elliptical in shape. A pressure supported system in equilibrium need not be perfectly round, but one departing from equilibrium will tend to get stretched out. And indeed, many of the ultrafaints look Messed Up.

I am not convinced that all this requires MOND. But it certainly doesn’t falsify it. Tidal disruption can happen in the dark matter context, but it happens differently. The stars are buried deep inside protective cocoons of dark matter, and do not feel tidal effects much until most of the dark matter is stripped away. There is no reason to expect the MOND measure of external influence to apply (indeed, it should not), much less that it would correlate with indications of tidal disruption as seen above.

This seems to have been missed by more recent papers on the subject. Indeed, Fattahi et al. (2018) have reconstructed very much the chain of thought I describe above. The last sentence of their abstract states “In many cases, the resulting velocity dispersions are inconsistent with the predictions from Modified Newtonian Dynamics, a result that poses a possibly insurmountable challenge to that scenario.” This is exactly what I thought. (I have you now.) I was wrong.

Fattahi et al. are wrong for the same reasons I was wrong. They are applying equilibrium reasoning to a non-equilibrium situation. Ironically, the main point of their paper is that many systems can’t be explained with dark matter, unless they are tidally stripped – i.e., the result of a non-equilibrium process. Oh, come on. If you invoke it in one dynamical theory, you might want to consider it in the other.

To quote the last sentence of our abstract from 2010, “We identify a test to distinguish between the ΛCDM and MOND based on the orbits of the dwarf satellites of the Milky Way and how stars are lost from them.” In ΛCDM, the sub-halos that contain dwarf satellites are expected to be on very eccentric orbits, with all the damage from tidal interactions with the host accruing during pericenter passage. In MOND, substantial damage may accrue along lower eccentricity orbits, leading to the expectation of more continuous disruption.

Gaia is measuring proper motions for stars all over the sky. Some of these stars are in the dwarf satellites. This has made it possible to estimate orbits for the dwarfs, e.g., work by Amina Helmi (et al!) and Josh Simon. So far, the results are definitely mixed. There are more dwarfs on low eccentricity orbits than I had expected in ΛCDM, but there are still plenty that are on high eccentricity orbits, especially among the ultrafaints. Which dwarfs have been tidally affected by interactions with their hosts is far from clear.

In short, reality is messy. It is going to take a long time to sort these matters out. These are early days.

Dwarf Satellite Galaxies and Low Surface Brightness Galaxies in the Field. I.

The Milky Way and its nearest giant neighbor Andromeda (M31) are surrounded by a swarm of dwarf satellite galaxies. Aside from relatively large beasties like the Large Magellanic Cloud or M32, the majority of these are the so-called dwarf spheroidals. There are several dozen examples known around each giant host, like the Fornax dwarf pictured above.

Dwarf Spheroidal (dSph) galaxies are ellipsoidal blobs devoid of gas that typically contain a million stars, give or take an order of magnitude. Unlike globular clusters, that may have a similar star count, dSphs are diffuse, with characteristic sizes of hundreds of parsecs (vs. a few pc for globulars). This makes them among the lowest surface brightness systems known.

This subject has a long history, and has become a major industry in recent years. In addition to the “classical” dwarfs that have been known for decades, there have also been many comparatively recent discoveries, often of what have come to be called “ultrafaint” dwarfs. These are basically dSphs with luminosities less than 100,000 suns, sometimes comprising as few as a few hundred stars. New discoveries are being made still, and there is reason to hope that the LSST will discover many more. Summed up, the known dwarf satellites are proverbial drops in the bucket compared to their giant hosts, which contain hundreds of billions of stars. Dwarfs could rain in for a Hubble time and not perturb the mass budget of the Milky Way.

Nevertheless, tiny dwarf spheroidals are excellent tests of theories like CDM and MOND. Going back to the beginning, in the early ’80s, Milgrom was already engaged in a discussion about the predictions of his then-new theory (before it was even published) with colleagues at the IAS, where he had developed the idea during a sabbatical visit. They were understandably skeptical, preferring – as many still do – to believe that some unseen mass was the more conservative hypothesis. Dwarf spheroidals came up even then, as their very low surface brightness meant low acceleration in MOND. This in turn meant large mass discrepancies. If you could measure their dynamics, they would have large mass-to-light ratios. Larger than could be explained by stars conventionally, and larger than the discrepancies already observed in bright galaxies like Andromeda.

This prediction of Milgrom’s – there from the very beginning – is important because of how things change (or don’t). At that time, Scott Tremaine summed up the contrasting expectation of the conventional dark matter picture:

“There is no reason to expect that dwarfs will have more dark matter than bright galaxies.” *

This was certainly the picture I had in my head when I first became interested in low surface brightness (LSB) galaxies in the mid-80s. At that time I was ignorant of MOND; my interest was piqued by the argument of Disney that there could be a lot of as-yet undiscovered LSB galaxies out there, combined with my first observing experiences with the then-newfangled CCD cameras which seemed to have a proclivity for making clear otherwise hard-to-see LSB features. At the time, I was interested in finding LSB galaxies. My interest in what made them rotate came later.

The first indication, to my knowledge, that dSph galaxies might have large mass discrepancies was provided by Marc Aaronson in 1983. This tentative discovery was hugely important, but the velocity dispersion of Draco (one of the “classical” dwarfs) was based on only 3 stars, so was hardly definitive. Nevertheless, by the end of the ’90s, it was clear that large mass discrepancies were a defining characteristic of dSphs. Their conventionally computed M/L went up systematically as their luminosity declined. This was not what we had expected in the dark matter picture, but was, at least qualitatively, in agreement with MOND.

My own interests had focused more on LSB galaxies in the field than on dwarf satellites like Draco. Greg Bothun and Jim Schombert had identified enough of these to construct a long list of LSB galaxies that served as targets for my Ph.D. thesis. Unlike the pressure-supported ellipsoidal blobs of stars that are the dSphs, the field LSBs we studied were gas rich, rotationally supported disks – mostly late type galaxies (Sd, Sm, & Irregulars). Regardless of composition, gas or stars, low surface density means that MOND predicts low acceleration. This need not be true conventionally, as the dark matter can do whatever the heck it wants. Though I was blissfully unaware of it at the time, we had constructed the perfect sample for testing MOND.

Having studied the properties of our sample of LSB galaxies, I developed strong ideas about their formation and evolution. Everything we had learned – their blue colors, large gas fractions, and low star formation rates – suggested that they evolved slowly compared to higher surface brightness galaxies. Star formation gradually sputtered along, having a hard time gathering enough material to make stars in their low density interstellar media. Perhaps they even formed late, an idea I took a shining to in the early ’90s. This made two predictions: field LSB galaxies should be less strongly clustered than bright galaxies, and should spin slower at a given mass.

The first prediction follows because the collapse time of dark matter halos correlates with their larger scale environment. Dense things collapse first and tend to live in dense environments. If LSBs were low surface density because they collapsed late, it followed that they should live in less dense environments.

I didn’t know how to test this prediction. Fortunately, fellow postdoc and office mate in Cambridge at the time, Houjun Mo, did. It came true. The LSB galaxies I had been studying were clustered like other galaxies, but not as strongly. This was exactly what I expected, and I thought sure we were on to something. All that remained was to confirm the second prediction.

At the time, we did not have a clear idea of what dark matter halos should be like. NFW halos were still in the future. So it seemed reasonable that late forming halos should have lower densities (lower concentrations in the modern terminology). More importantly, the sum of dark and luminous density was certainly less. Dynamics follow from the distribution of mass as Velocity² ∝ Mass/Radius. For a given mass, low surface brightness galaxies had a larger radius, by construction. Even if the dark matter didn’t play along, the reduction in the concentration of the luminous mass should lower the rotation velocity.
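To put a number on that expectation, here is a minimal sketch in Python. It assumes only the simple Newtonian scaling above (velocity squared proportional to mass over radius) with the same constant of proportionality for every galaxy. The function name and the factor of four are illustrative choices, not values from the data.

```python
import math

def velocity_ratio(radius_ratio, mass_ratio=1.0):
    """Ratio of rotation speeds V2/V1 implied by V^2 proportional to M/R.

    radius_ratio is R2/R1 and mass_ratio is M2/M1. The constant of
    proportionality is assumed to be the same for both galaxies.
    """
    return math.sqrt(mass_ratio / radius_ratio)

# Hypothetical case: an LSB galaxy with the same mass as an HSB galaxy
# but four times the radius should rotate at half the speed.
ratio = velocity_ratio(radius_ratio=4.0)
print(f"V_LSB / V_HSB = {ratio:.2f}")
```

On this argument, a more diffuse galaxy of the same mass should fall visibly below the Tully-Fisher relation defined by high surface brightness galaxies.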

Indeed, the standard explanation of the Tully-Fisher relation was just this. Aaronson, Huchra, & Mould had argued that galaxies obeyed the Tully-Fisher relation because they all had essentially the same surface brightness (Freeman’s law) thereby taking variation in the radius out of the equation: galaxies of the same mass all had the same radius. (If you are a young astronomer who has never heard of Freeman’s law, you’re welcome.) With our LSB galaxies, we had a sample that, by definition, violated Freeman’s law. They had large radii for a given mass. Consequently, they should have lower rotation velocities.

Up to that point, I had not taken much interest in rotation curves. In contrast, colleagues at the University of Groningen were all about rotation curves. Working with Thijs van der Hulst, Erwin de Blok, and Martin Zwaan, we set out to quantify where LSB galaxies fell in relation to the Tully-Fisher relation. I confidently predicted that they would shift off of it – an expectation shared by many at the time. They did not.

The Tully-Fisher relation: disk mass vs. flat rotation speed (circa 1996). Galaxies are binned by surface brightness with the highest surface brightness galaxies marked red and the lowest blue. The lines show the expected shift following the argument of Aaronson et al. Contrary to this expectation, galaxies of all surface brightnesses follow the same Tully-Fisher relation.

I was flummoxed. My prediction was wrong. That of Aaronson et al. was wrong. Poking about the literature, everyone who had made a clear prediction in the conventional context was wrong. It made no sense.

I spent months banging my head against the wall. One quick and easy solution was to blame the dark matter. Maybe the rotation velocity was set entirely by the dark matter, and the distribution of luminous mass didn’t come into it. Surely that’s what the flat rotation velocity was telling us? All about the dark matter halo?

Problem is, we measure the velocity where the luminous mass still matters. In galaxies like the Milky Way, it matters quite a lot. It does not work to imagine that the flat rotation velocity is set by some property of the dark matter halo alone. What matters to what we measure is the combination of luminous and dark mass. The luminous mass is important in high surface brightness galaxies, and progressively less so in lower surface brightness galaxies. That should leave some kind of mark on the Tully-Fisher relation, but it doesn’t.

Residuals from the Tully-Fisher relation as a function of size at a given mass. Compact galaxies are to the left, diffuse ones to the right. The red dashed line is what Newton predicts: more compact galaxies should rotate faster at a given mass. Fundamental physics? Tully-Fisher don’t care. Tully-Fisher don’t give a sh*t.

I worked long and hard to understand this in terms of dark matter. Every time I thought I had found the solution, I realized that it was a tautology. Somewhere along the line, I had made an assumption that guaranteed that I got the answer I wanted. It was a hopeless fine-tuning problem. The only way to satisfy the data was to have the dark matter contribution scale up as that of the luminous mass scaled down. The more stretched out the light, the more compact the dark – in exact balance to maintain zero shift in Tully-Fisher.

This made no sense at all. Over twenty years on, I have yet to hear a satisfactory conventional explanation. Most workers seem to assert, in effect, that “dark matter does it” and move along. Perhaps they are wise to do so.

Working on the thing can drive you mad.

As I was struggling with this issue, I happened to hear a talk by Milgrom. I almost didn’t go. “Modified gravity” was in the title, and I remember thinking, “why waste my time listening to that nonsense?” Nevertheless, against my better judgement, I went. Not knowing that anyone in the audience worked on either LSB galaxies or Tully-Fisher, Milgrom proceeded to derive the MOND prediction:

“The asymptotic circular velocity is determined only by the total mass of the galaxy: Vf⁴ = a₀GM.”

In a few lines, he derived rather trivially what I had been struggling to understand for months. The lack of surface brightness dependence in Tully-Fisher was entirely natural in MOND. It falls right out of the modified force law, and had been explicitly predicted over a decade before I struggled with the problem.
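The relation is easy to evaluate. The sketch below assumes a commonly quoted value for the acceleration scale, a₀ ≈ 1.2 × 10⁻¹⁰ m/s², which is not given in the text above; the function name and the example masses are likewise illustrative.

```python
A0 = 1.2e-10      # m/s^2, MOND acceleration scale (assumed, commonly quoted value)
G = 6.674e-11     # m^3 kg^-1 s^-2, Newton's constant
M_SUN = 1.989e30  # kg, solar mass

def mond_flat_velocity_kms(mass_msun):
    """Asymptotic circular speed in km/s from Vf^4 = a0 * G * M.

    Only the total mass enters; there is no radius or surface
    brightness anywhere in the formula.
    """
    v4 = A0 * G * mass_msun * M_SUN  # (m/s)^4
    return v4 ** 0.25 / 1e3

for mass in (1e8, 1e10, 1e12):  # illustrative masses in solar units
    v = mond_flat_velocity_kms(mass)
    print(f"M = {mass:.0e} Msun -> Vf = {v:.0f} km/s")
```

With these constants, a galaxy of ten billion solar masses comes out at a little over 110 km/s, whether its stars are packed tightly or spread thin.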

I scraped my jaw off the floor, determined to examine this crazy theory more closely. By the time I got back to my office, cognitive dissonance had already started to set in. Couldn’t be true. I had more pressing projects to complete, so I didn’t think about it again for many moons.

When I did, I decided I should start by reading the original MOND papers. I was delighted to find a long list of predictions, many of them specifically to do with surface brightness. We had just collected fresh data on LSB galaxies, which provided a new window on the low acceleration regime. I had the data to finally falsify this stupid theory.

Or so I thought. As I went through the list of predictions, my assumption that MOND had to be wrong was challenged by each item. It was barely an afternoon’s work: check, check, check. Everything I had struggled for months to understand in terms of dark matter tumbled straight out of MOND.

I was faced with a choice. I knew this would be an unpopular result. I could walk away and simply pretend I had never run across it. That’s certainly how it had been up until then: I had been blissfully unaware of MOND and its perniciously successful predictions. No need to admit otherwise.

Had I realized just how unpopular it would prove to be, maybe that would have been the wiser course. But even contemplating such a course felt criminal. I was put in mind of Paul Gerhardt’s admonition for intellectual honesty:

“When a man lies, he murders some part of the world.”

Ignoring what I had learned seemed tantamount to just that. So many predictions coming true couldn’t be an accident. There was a deep clue here; ignoring it wasn’t going to bring us closer to the truth. Actively denying it would be an act of wanton vandalism against the scientific method.

Still, I tried. I looked long and hard for reasons not to report what I had found. Surely there must be some reason this could not be so?

Indeed, the literature provided many papers that claimed to falsify MOND. To my shock, few withstood critical examination. Commonly a straw man representing MOND was falsified, not MOND itself. At a deeper level, it was implicitly assumed that any problem for MOND was an automatic victory for dark matter. This did not obviously follow, so I started re-doing the analyses for both dark matter and MOND. More often than not, I found either that the problems for MOND were greatly exaggerated, or that the genuinely problematic cases were a problem for both theories. Dark matter has more flexibility to explain outliers, but outliers happen in astronomy. All too often the temptation was to refuse to see the forest for a few trees.

The first MOND analysis of the classical dwarf spheroidals provides a good example. Completed only a few years before I encountered the problem, these were low surface brightness systems that were deep in the MOND regime. These were gas poor, pressure supported dSph galaxies, unlike my gas rich, rotating LSB galaxies, but the critical feature was low surface brightness. This was the most directly comparable result. Better yet, the study had been made by two brilliant scientists (Ortwin Gerhard & David Spergel) whom I admire enormously. Surely this work would explain how my result was a mere curiosity.

Indeed, reading their abstract, it was clear that MOND did not work for the dwarf spheroidals. Whew: LSB systems where it doesn’t work. All I had to do was figure out why, so I read the paper.

As I read beyond the abstract, the answer became less and less clear. The results were all over the map. Two dwarfs (Sculptor and Carina) seemed unobjectionable in MOND. Two dwarfs (Draco and Ursa Minor) had mass-to-light ratios that were too high for stars, even in MOND. That is, there still appeared to be a need for dark matter even after MOND had been applied. On the flip side, Fornax had a mass-to-light ratio that was too low for the old stellar populations assumed to dominate dwarf spheroidals. Results all over the map are par for the course in astronomy, especially for a pioneering attempt like this. What were the uncertainties?

Milgrom wrote a rebuttal. By then, there were measured velocity dispersions for two more dwarfs. Of these seven dwarfs, he found that

“within just the quoted errors on the velocity dispersions and the luminosities, the MOND M/L values for all seven dwarfs are perfectly consistent with stellar values, with no need for dark matter.”

Well, he would say that, wouldn’t he? I determined to repeat the analysis and error propagation.

Mass-to-light ratios determined with MOND for eight dwarf spheroidals (named, as published in McGaugh & de Blok 1998). The various symbols refer to different determinations. Mine are the solid circles. The dashed lines show the plausible range for stellar populations.

The net result: they were both right. M/L was still too high for Draco and Ursa Minor, and still too low for Fornax. But this was only significant at the 2σ level, if that – hardly enough to condemn a theory. Carina, Leo I, Leo II, Sculptor, and Sextans all had fairly reasonable mass-to-light ratios. The voting was different now. Instead of going 2 for 5 as Gerhard & Spergel found, MOND was now 5 for 8. One could choose to obsess about the outliers, or one could choose to see a more positive pattern. Either a positive or a negative spin could be put on this result. But it was clearly more positive than the first attempt had indicated.

The mass estimator in MOND scales as the fourth power of velocity (or velocity dispersion in the case of isolated dSphs), so the too-high M*/L of Draco and Ursa Minor didn’t disturb me too much. A small overestimation of the velocity dispersion would lead to a large overestimation of the mass-to-light ratio. Just about every systematic uncertainty one can think of pushes in this direction, so it would be surprising if such an overestimate didn’t happen once in a while.
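The fourth-power sensitivity is worth seeing in numbers. This is a minimal sketch that assumes only that the inferred mass (and hence M*/L at fixed luminosity) scales as the velocity dispersion to the fourth power; the function name and the sample error sizes are illustrative.

```python
def ml_overestimate(sigma_overestimate):
    """Fractional overestimate of M/L for a given fractional
    overestimate of the velocity dispersion, assuming M scales as sigma^4."""
    return (1.0 + sigma_overestimate) ** 4 - 1.0

for err in (0.05, 0.10, 0.20):  # illustrative fractional errors in sigma
    bias = ml_overestimate(err)
    print(f"sigma too high by {err:.0%} -> M/L too high by {bias:.0%}")
```

A dispersion that is 10% too high inflates the mass-to-light ratio by about 46%, and a 20% error roughly doubles it.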

Given this, I was more concerned about the low M*/L of Fornax. That was weird.

Up until that point (1998), we had been assuming that the stars in dSphs were all old, like those in globular clusters. That corresponds to a high M*/L, maybe 3 in solar units in the V-band. Shortly after this time, people started to look closely at the stars in the classical dwarfs with the Hubble. Lo and behold, the stars in Fornax were surprisingly young. That means a low M*/L, 1 or less. In retrospect, MOND was trying to tell us that: it returned a low M*/L for Fornax because the stars there are young. So what was taken to be a failing of the theory was actually a predictive success.


And Gee. This is a long post. There is a lot more to tell, but enough for now.

*I have a long memory, but it is not perfect. I doubt I have the exact wording right, but this does accurately capture the sentiment from the early ’80s when I was an undergraduate at MIT and Scott Tremaine was on the faculty there.

The next cosmic frontier: 21cm absorption at high redshift

There are two basic approaches to cosmology: start at redshift zero and work outwards in space, or start at the beginning of time and work forward. The latter approach is generally favored by theorists, as much of the physics of the early universe follows a “clean” thermal progression, cooling adiabatically as it expands. The former approach is more typical of observers who start with what we know locally and work outwards in the great tradition of Hubble, Sandage, Tully, and the entire community of extragalactic observers that established the paradigm of the expanding universe and measured its scale. This work had established our current concordance cosmology, ΛCDM, by the mid-90s.*

Both approaches have taught us an enormous amount. Working forward in time, we understand the nucleosynthesis of the light elements in the first few minutes, followed after a few hundred thousand years by the epoch of recombination when the universe transitioned from an ionized plasma to a neutral gas, bequeathing us the cosmic microwave background (CMB) at the phenomenally high redshift of z=1090. Working outwards in redshift, large surveys like Sloan have provided a detailed map of the “local” cosmos, and narrower but much deeper surveys provide a good picture out to z = 1 (when the universe was half its current size, and roughly half its current age) and beyond, with the most distant objects now known above redshift 7, and maybe even at z > 11. JWST will provide a good view of the earliest (z ~ 10?) galaxies when it launches.

This is wonderful progress, but there is a gap from 10 < z < 1000. Not only is it hard to observe objects so distant that z > 10, but at some point they shouldn’t exist. It takes time to form stars and galaxies and the supermassive black holes that fuel quasars, especially when starting from the smooth initial condition seen in the CMB. So how do we probe redshifts z > 10?

It turns out that the universe provides a way. As photons from the CMB traverse the neutral intergalactic medium, they are subject to being absorbed by hydrogen atoms – particularly by the 21cm spin-flip transition. Long anticipated, this signal has recently been detected by the EDGES experiment. I find it amazing that the atomic physics of the early universe allows for this window of observation, and that clever scientists have figured out a way to detect this subtle signal.

So what is going on? First, a mental picture. In the image below, an observer at the left looks out to progressively higher redshift towards the right. The history of the universe unfolds from right to left.

An observer’s view of the history of the universe. Nearby, at low redshift, we see mostly empty space sprinkled with galaxies. At some high redshift (z ~ 20?), the first stars formed, flooding the previously dark universe with UV photons that reionize the gas of the intergalactic medium. The backdrop of the CMB provides the ultimate limit to electromagnetic observations as it marks the boundary (at z = 1090) between a mostly transparent and completely opaque universe.

Pritchard & Loeb give a thorough and lucid account of the expected sequence of events. As the early universe expands, it cools. Initially, the thermal photon bath that we now observe as the CMB has enough energy to keep atoms ionized. The mean free path that a photon can travel before interacting with a charged particle in this early plasma is very short: the early universe is opaque like the interior of a thick cloud. At z = 1090, the temperature drops to the point that photons can no longer break protons and electrons apart. This epoch of recombination marks the transition from an opaque plasma to a transparent universe of neutral hydrogen and helium gas. The path length of photons becomes very long; those that we see as the CMB have traversed the length of the cosmos mostly unperturbed.

Immediately after recombination follows the dark ages. Sources of light have yet to appear. There is just neutral gas expanding into the future. This gas is mostly but not completely transparent. As CMB photons propagate through it, they are subject to absorption by the spin-flip transition of hydrogen, a subtle but, in principle, detectable effect: one should see redshifted absorption across the dark ages.

After some time – perhaps a few hundred million years? – the gas has had enough time to clump up enough to start to form the first structures. This first population of stars ends the dark ages and ushers in cosmic dawn. The photons they release into the vast intergalactic medium (IGM) of neutral gas interact with it and heat it up, ultimately reionizing the entire universe. After this time the IGM is again a plasma, but one so thin (thanks to the expansion of the universe) that it remains transparent. Galaxies assemble and begin the long evolution characterized by the billions of years lived by the stars they contain.

This progression leads to the expectation of 21cm absorption twice: once during the dark ages, and again at cosmic dawn. There are three temperatures we need to keep track of to see how this happens: the radiation temperature Tγ, the kinetic temperature of the gas, Tk, and the spin temperature, TS. The radiation temperature is that of the CMB, and scales as (1+z). The gas temperature is what you normally think of as a temperature, and scales approximately as (1+z)². The spin temperature describes the occupation of the quantum levels involved in the 21cm hyperfine transition. If that makes no sense to you, don’t worry: all that matters is that absorption can occur when the spin temperature is less than the radiation temperature. In general, it is bounded by Tk < TS < Tγ.

The radiation temperature and gas temperature both cool as the universe expands. Initially, the gas remains coupled to the radiation, and these temperatures remain identical until decoupling around z ~ 200. After this, the gas cools faster than the radiation. The radiation temperature is extraordinarily well measured by CMB observations, and is simply Tγ = (2.725 K)(1+z). The gas temperature is more complicated, requiring the numerical solution of the Saha equation for a hydrogen-helium gas. Clever people have written codes to do this, like the widely-used RECFAST. In this way, one can build a table of how both temperatures depend on redshift in any cosmology one cares to specify.
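The two scalings can be tabulated with a toy model. The sketch below is not RECFAST or any real recombination code: it assumes the gas decouples from the radiation instantaneously at z = 200 and cools exactly as (1+z)² afterwards, so a full calculation will differ in detail. The function names are illustrative.

```python
T_CMB_0 = 2.725  # K, present-day CMB temperature (from the text)
Z_DEC = 200.0    # approximate decoupling redshift (from the text)

def t_radiation(z):
    """CMB temperature in K at redshift z; scales as (1+z)."""
    return T_CMB_0 * (1.0 + z)

def t_gas(z, z_dec=Z_DEC):
    """Toy gas temperature in K at redshift z.

    Assumes the gas tracks the radiation until z_dec and then cools
    adiabatically as (1+z)^2. Real decoupling is gradual, so this is
    only a rough guide.
    """
    if z >= z_dec:
        return t_radiation(z)
    return t_radiation(z_dec) * ((1.0 + z) / (1.0 + z_dec)) ** 2

for z in (1000, 200, 100, 50, 17):
    print(f"z = {z:4d}: T_gamma = {t_radiation(z):7.1f} K, T_k = {t_gas(z):7.1f} K")
```

The point of the toy is only the widening gap: after decoupling the gas is always colder than the radiation, which is what leaves room for the spin temperature to sit below Tγ and produce absorption.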

This may sound complicated if it is the first time you’ve encountered it, but the physics is wonderfully simple. It’s just the thermal physics of the expanding universe, and the atomic physics of a simple gas composed of hydrogen and helium in known amounts. Different cosmologies specify different expansion histories, but these have only a modest (and calculable) effect on the gas temperature.

Wonderfully, the atomic physics of the 21cm transition is such that it couples to both the radiation and gas temperatures in a way that matters in the early universe. It didn’t have to be that way – most transitions don’t. Perhaps this is fodder for people who worry that the physics of our universe is fine-tuned.

There are two ways in which the spin temperature couples to that of the gas. During the dark ages, the coupling is governed simply by atomic collisions. By cosmic dawn collisions have become rare, but the appearance of the first stars provides UV radiation that drives the Wouthuysen-Field effect. Consequently, we expect to see two absorption troughs: one around z ~ 20 at cosmic dawn, and another at still higher redshift (z ~ 100) during the dark ages.

Observation of this signal has the potential to revolutionize cosmology like detailed observations of the CMB did. The CMB is a snapshot of the universe during the narrow window of recombination at z = 1090. In principle, one can make the same sort of observation with the 21cm line, but at each and every redshift where absorption occurs: z = 16, 17, 18, 19 during cosmic dawn and again at z = 50, 100, 150 during the dark ages, with whatever frequency resolution you can muster. It will be like having the CMB over and over and over again, each redshift providing a snapshot of the universe at a different slice in time.

The information density available from the 21cm signal is in principle quite large. Before we can make use of any of this information, we have to detect it first. Therein lies the rub. This is an incredibly weak signal – we have to be able to detect that the CMB is a little dimmer than it would have been – and we have to do it in the face of much stronger foreground signals from the interstellar medium of our Galaxy and from man-made radio interference here on Earth. Fortunately, though much brighter than the signal we seek, these foregrounds have a different frequency dependence, so it should be possible to sort out, in principle.

Saying a thing can be done and doing it are two different things. This is already a long post, so I will refrain from raving about the technical challenges. Let’s just say it’s Real Hard.

Many experimentalists take that as a challenge, and there are a good number of groups working hard to detect the cosmic 21cm signal. EDGES appears to have done it, reporting the detection of the signal at cosmic dawn in February. Here some weasel words are necessary, as the foreground subtraction is a huge challenge, and we always hope to see independent confirmation of a new signal like this. Those words of caution noted, I have to add that I’ve had the chance to read up on their methods, and I’m really impressed. Unlike the BICEP claim to detect primordial gravitational waves that proved to be bogus after being rushed to press release before refereeing, the EDGES team have done all manner of conceivable cross-checks on their instrumentation and analysis. Nor did they rush to publish, despite the importance of the result. In short, I get exactly the opposite vibe from BICEP, whose foreground subtraction was obviously wrong as soon as I laid eyes on the science paper. If EDGES proves to be wrong, it isn’t for want of doing things right. In the meantime, I think we’re obliged to take their result seriously, and not just hope it goes away (which seems to be the first reaction to the impossible).

Here is what EDGES saw at cosmic dawn:

Fig. 2 from the EDGES detection paper. The dip, detected repeatedly in different instrumental configurations, shows a decrease in brightness temperature at radio frequencies, as expected from the 21cm absorbing some of the radiation from the CMB.

The unbelievable aspect of the EDGES observation is that it is too strong. Feeble as this signal is (a telescope brightness decrement of half a degree Kelvin), after subtracting foregrounds a thousand times stronger, it is twice as much as is possible in ΛCDM.

I made a quick evaluation of this, and saw that the observed signal could be achieved if the baryon fraction of the universe was high – basically, if cold dark matter did not exist. I have now had the time to make a more careful calculation, and publish some further predictions. The basic result from before stands: the absorption should be stronger without dark matter than with it.

The reason for this is simple. A universe full of dark matter decelerates rapidly at early times, before the acceleration of the cosmological constant kicks in. Without dark matter, the expansion more nearly coasts. Consequently, the universe is relatively larger from 10 < z < 1000, and the CMB photons have to traverse a larger path length to get here. They have to go about twice as far through the same density of hydrogen absorbers. It’s like putting on a second pair of sunglasses.

Quantitatively, the predicted absorption, both with dark matter and without, looks like:

The predicted 21cm absorption with dark matter (red broken line) and without (blue line). Also shown (in grey) is the signal observed by EDGES.


The predicted absorption is consistent with the EDGES observation, within the errors, if there is no dark matter. More importantly, ΛCDM is not consistent with the data, at greater than 95% confidence. At cosmic dawn, I show the maximum possible signal. It could be weaker, depending on the spectra of the UV radiation emitted by the first stars. But it can’t be stronger. Taken at face value, the EDGES result is impossible in ΛCDM. If the observation is corroborated by independent experiments, ΛCDM as we know it will be falsified.

There have already been many papers trying to avoid this obvious conclusion. If we insist on retaining ΛCDM, the only way to modulate the strength of the signal is to alter the ratio of the radiation temperature to the gas temperature. Either we make the radiation “hotter,” or we make the gas cooler. If we allow ourselves this freedom, we can fit any arbitrary signal strength. This is ad hoc in the way that gives ad hoc a bad name.

We do not have this freedom – not really. The radiation temperature is measured in the CMB with great accuracy. Altering this would mess up the genuine success of ΛCDM in fitting the CMB. One could postulate an additional source, something that appears after recombination but before cosmic dawn to emit enough radio power throughout the cosmos to add to the radio brightness that is being absorbed. There is zero reason to expect such sources (what part of ‘cosmic dawn’ was ambiguous?) and no good way to make them at the right time. If they are primordial (as people love to imagine but are loath to provide viable models for) then they’re also present at recombination: anything powerful enough to have the necessary effect will likely screw up the CMB.

Instead of magically increasing the radiation temperature, we might decrease the gas temperature. This seems no more plausible. The evolution of the gas temperature is a straightforward numerical calculation that has been checked by several independent codes. It has to be right at the time of recombination, or again, we mess up the CMB. The suggestions that I have heard seem mostly to invoke interactions between the gas and dark matter that offload some of the thermal energy of the gas into the invisible sink of the dark matter. Given how shy dark matter has been about interacting with normal matter in the laboratory, it seems pretty rich to imagine that it is eager to do so at high redshift. Even advocates of this scenario recognize its many difficulties.

For those who are interested, I cite a number of the scientific papers that attempt these explanations in my new paper. They all seem like earnest attempts to come to terms with what is apparently impossible. Many of these ideas also strike me as a form of magical thinking that stems from ΛCDM groupthink. After all, ΛCDM is so well established, any unexpected signal must be a sign of exciting new physics (on top of the new physics of dark matter and dark energy) rather than an underlying problem with ΛCDM itself.

The more natural interpretation is that the expansion history of the universe deviates from that predicted by ΛCDM. Simply taking away the dark matter gives a result consistent with the data. Though it did not occur to me to make this specific prediction a priori for an experiment that did not yet exist, all the necessary calculations had been done 15 years ago.

Using the same model, I make a genuine a priori prediction for the dark ages. For the specific NoCDM model I built in 2004, the 21cm absorption in the dark ages should again be about twice as strong as expected in ΛCDM. This seems fairly generic, but I know the model is not complete, so I wouldn’t be upset if it were not bang on.

I would be upset if ΛCDM were not bang on. The only thing that drives the signal in the dark ages is atomic scattering. We understand this really well. ΛCDM is now so well constrained by Planck that, if right, the 21cm absorption during the dark ages must follow the red line in the inset in the figure. The amount of uncertainty is not much greater than the thickness of the line. If ΛCDM fails this test, it would be a clear falsification, and a sign that we need to try something completely different.

Unfortunately, detecting the 21cm absorption signal during the dark ages is even harder than it is at cosmic dawn. At these redshifts (z ~ 100), the 21cm line (1420 MHz on your radio dial) is shifted beyond the ionospheric cutoff of the Earth’s atmosphere at 30 MHz. Frequencies this low cannot be observed from the ground. Worse, we have made the Earth itself a bright foreground contaminant of radio frequency interference.
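The frequencies involved follow directly from the redshift. The sketch below uses the standard rest frequency of the hydrogen line (1420.4 MHz) and the 30 MHz ionospheric cutoff quoted above; the function name and the sample redshifts are illustrative.

```python
NU_21CM_MHZ = 1420.4          # rest frequency of the 21cm line
IONOSPHERE_CUTOFF_MHZ = 30.0  # approximate cutoff quoted in the text

def observed_frequency_mhz(z):
    """Frequency in MHz at which the 21cm line from redshift z arrives today."""
    return NU_21CM_MHZ / (1.0 + z)

for z in (17, 50, 100, 150):  # cosmic dawn and dark-ages redshifts
    nu = observed_frequency_mhz(z)
    where = "ground" if nu > IONOSPHERE_CUTOFF_MHZ else "space only"
    print(f"z = {z:3d}: {nu:6.1f} MHz ({where})")
```

Cosmic dawn lands near 80 MHz, comfortably above the cutoff, while anything beyond z of about 46 falls below 30 MHz and is hidden from the ground.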

Undeterred, there are multiple proposals to measure this signal by placing an antenna in space – in particular, on the far side of the moon, so that the moon shades the instrument from terrestrial radio interference. This is a great idea. The mere detection of the 21cm signal from the dark ages would be an accomplishment on par with the original detection of the CMB. It appears that it might also provide a decisive new way of testing our cosmological model.

There are further tests involving the shape of the 21cm signal, its power spectrum (analogous to the power spectrum of the CMB), how structure grows in the early ages of the universe, and how massive the neutrino is. But that’s enough for now.


Most likely beer. Or a cosmo. That’d be appropriate. I make a good pomegranate cosmo.

*Note that a variety of astronomical observations had established the concordance cosmology before Type Ia supernovae detected cosmic acceleration and well-resolved observations of the CMB found a flat cosmic geometry.

The dwarf galaxy NGC1052-DF2

A recently discovered dwarf galaxy designated NGC1052-DF2 has been in the news lately. Apparently a satellite of the giant elliptical NGC 1052, DF2 (as I’ll call it from here on out) is remarkable for having a surprisingly low velocity dispersion for a galaxy of its type. These results were reported in Nature last week by van Dokkum et al., and have caused a bit of a stir.

It is common for giant galaxies to have some dwarf satellite galaxies. As can be seen from the image published by van Dokkum et al., there are a number of galaxies in the neighborhood of NGC 1052. Whether these are associated physically into a group of galaxies or are chance projections on the sky depends on the distance to each galaxy.

Image of field containing DF2 from van Dokkum et al.

NGC 1052 is listed by the NASA/IPAC Extragalactic Database (NED) as having a recession velocity of 1510 km/s and a distance of 20.6 Mpc. The next nearest big beastie is NGC 1042, at 1371 km/s. The difference of 139 km/s is not much different from 115 km/s, which is the velocity at which Andromeda is heading towards the Milky Way, so one could imagine that this is a group similar to the Local Group. Except that NED says the distance to NGC 1042 is 7.8 Mpc, so apparently it is a foreground object seen in projection.

Van Dokkum et al. assume DF2 and NGC 1052 are both about 20 Mpc distant. They offer two independent estimates of the distance, one consistent with the distance to NGC 1052 and the other more consistent with the distance to NGC 1042. Rather than wring our hands over this, I will trust their judgement and simply note, as they do, that the nearer distance would change many of their conclusions. The redshift is 1803 km/s, larger than either of the giants. It could still be a satellite of NGC 1052, as ~300 km/s is not unreasonable for an orbital velocity.

So why the big fuss? Unlike most galaxies in the universe, DF2 appears not to require dark matter. This is inferred from the measured velocity dispersion of ten globular clusters, which is 8.4 km/s. That’s fast to you and me, but rather sluggish on the scale of galaxies. Spread over a few kiloparsecs, that adds up to a dynamical mass about equal to what we expect for the stars, leaving little room for the otherwise ubiquitous dark matter.

This is important. If the universe is composed of dark matter, it should on occasion be possible to segregate the dark from the light. Tidal interactions between galaxies can in principle do this, so a galaxy devoid of dark matter would be good evidence that this happened. It would also be evidence against a modified gravity interpretation of the missing mass problem, because the force law is always on: you can’t strip it from the luminous matter the way you can dark matter. So ironically, the occasional galaxy lacking dark matter would constitute evidence that dark matter does indeed exist!

DF2 appears to be such a case. But how weird is it? Morphologically, it resembles the dwarf spheroidal satellite galaxies of the Local Group. I have a handy compilation of those (from Lelli et al.), so we can compute the mass-to-light ratio for all of these beasties in the same fashion, shown in the figure below. It is customary to refer quantities to the radius that contains half of the total light, which is 2.2 kpc for DF2.

The dynamical mass-to-light ratio for Local Group dwarf spheroidal galaxies measured within their half-light radii, as a function of luminosity (left) and average surface brightness within the half-light radius (right). DF2 is the blue cross with low M/L. The other blue cross is Crater 2, a satellite of the Milky Way discovered after the compilation of Local Group dwarfs was made. The dotted line shows M/L = 2, which is a good guess for the stellar mass-to-light ratio. That DF2 sits on this line implies that stars are the only mass that’s there.
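To see roughly where DF2 lands in a plot like this, here is a minimal Python sketch of a half-light mass estimate. It assumes the commonly used estimator M ≈ 4σ²R/G of Wolf et al. (2010), with R the projected half-light radius; that is one reasonable choice, not necessarily the exact calculation behind the figure. The total luminosity of 1.1 × 10^8 solar luminosities is also an assumption, since it is not quoted in this post.

```python
# Newton's constant in units of kpc (km/s)^2 / Msun.
G_KPC = 4.30091e-6


def half_light_mass(sigma_kms, r_half_kpc):
    """Dynamical mass within the half-light radius, in Msun.

    Assumes the estimator M = 4 sigma^2 R / G with R the projected
    half-light radius (an assumption, see the text above).
    """
    return 4.0 * sigma_kms**2 * r_half_kpc / G_KPC


if __name__ == "__main__":
    sigma = 8.4            # km/s, dispersion of the globular clusters
    r_half = 2.2           # kpc, half-light radius of DF2
    lum_total = 1.1e8      # Lsun, ASSUMED total luminosity (not from this post)

    mass = half_light_mass(sigma, r_half)
    # Half the light is enclosed within the half-light radius by definition.
    m_over_l = mass / (0.5 * lum_total)
    print(f"M(<r_half) = {mass:.2e} Msun")
    print(f"M/L within r_half = {m_over_l:.1f}")
```

Under these assumptions the enclosed mass comes out a bit over 10^8 solar masses and the mass-to-light ratio is of order 2 to 3, which is why there is so little room left for dark matter.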

Perhaps the most obvious respect in which DF2 is a bit unusual relative to the dwarfs of the Local Group is that it is big and bright. Most nearby dwarfs have half light radii well below 1 kpc. After DF2, the next most luminous dwarf is Fornax, which is a factor of 5 lower in luminosity.

DF2 is called an ultradiffuse galaxy (UDG), which is apparently newspeak for low surface brightness (LSB) galaxy. I’ve been working on LSB galaxies my entire career. While DF2 is indeed low surface brightness – the stars are spread thin – I wouldn’t call it ultra diffuse. It is actually one of the higher surface brightness objects of this type. Crater 2 and And XIX (the leftmost points in the right panel) are ultradiffuse.

Astronomers love vague terminology, and as a result often reinvent terms that already exist. Dwarf, LSB, UDG, have all been used interchangeably and with considerable slop. I was sufficiently put out by this that I tried to define some categories in the mid-90s. This didn’t catch on, but by my definition, DF2 is VLSB – very LSB, but only by a little – it is much closer to regular LSB than to extremely LSB (ELSB). Crater 2 and And XIX, now they’re ELSB, being more diffuse than DF2 by 2 orders of magnitude.

Surface brightness categories from McGaugh (1996).

Whatever you call it, DF2 is low surface brightness, and LSB galaxies are always dark matter dominated. Always, at least among disk galaxies: here is the analogous figure for galaxies that rotate:

Dynamical mass-to-light ratios for rotationally supported disk galaxies, analogous to the plot above for pressure supported dwarfs. The lower the surface brightness, the higher the mass discrepancy. The correlation with luminosity is secondary, as a result of the correlation between luminosity and surface brightness. From McGaugh (2014).

Pressure supported dwarfs generally evince large mass discrepancies as well. So in this regard, DF2 is indeed very unusual. So what gives?

Perhaps DF2 formed that way, without dark matter. This is anathema to everything we know about galaxy formation in ΛCDM cosmology. Dark halos have to form first, with baryons following.

Perhaps DF2 suffered one or more tidal interactions with NGC 1052. Sub-halos in simulations are often seen to be on highly radial orbits; perhaps DF2 has had its dark matter halo stripped away by repeated close passages. Since the stars reside deep in the center of the subhalo, they’re the last thing to be stripped away. So perhaps we’ve caught this one at that special time when the dark matter has been removed but the stars still remain.

This is improbable, but ought to happen once in a while. The bigger problem I see is that one cannot simply remove the dark matter halo like yanking a tablecloth and leaving the plates. The stars must respond to the change in the gravitational potential; they too must diffuse away. That might be a good way to make the galaxy diffuse, ultimately perhaps even ultradiffuse, but the observed motions are then not representative of an equilibrium situation. This is critical to the mass estimate, which must perforce assume an equilibrium in which the gravitational potential well of the galaxy is balanced against the kinetic motion of its contents. Yank away the dark matter halo, and the assumption underlying the mass estimate gets yanked with it. While such a situation may arise, it makes it very difficult to interpret the velocities: all bets are off. This is doubly true in MOND, in which dwarfs are even more susceptible to disruption.


Then there are the data themselves. Blaming the data should be avoided, but it does happen once in a while that some observation is misleading. In this case, I am made queasy by the fact that the velocity dispersion is estimated from only ten tracers. I’ve seen plenty of cases where the velocity dispersion changes in important ways when more data are obtained, even starting from more than 10 tracers. Andromeda II comes to mind as an example. Indeed, several people have pointed out that if we did the same exercise with Fornax, using its globular clusters as the velocity tracers, we’d get a similar answer to what we find in DF2. But we also have measurements of many hundreds of stars in Fornax, so we know that answer is wrong. Perhaps the same thing is happening with DF2? The fact that DF2 is an outlier from everything else we know empirically suggests caution.

Throwing caution and fact-checking to the wind, many people have been predictably eager to cite DF2 as a falsification of MOND. Van Dokkum et al. point out that the velocity dispersion predicted for this object by MOND is 20 km/s, more than a factor of two above their measured value. They make the MOND prediction for the case of an isolated object. DF2 is not isolated, so one must consider the external field effect (EFE).

The criterion by which to judge isolation in MOND is whether the acceleration due to the mutual self-gravity of the stars is less than the acceleration from an external source, in this case the host NGC 1052. Following the method outlined by McGaugh & Milgrom, and based on the stellar mass (adopting M/L=2 as both we and van Dokkum assume), I estimate the internal acceleration of DF2 to be g_in = 0.15 a₀. Here a₀ is the critical acceleration scale in MOND, 1.2 × 10⁻¹⁰ m/s². Using this number and treating DF2 as isolated, I get the same 20 km/s that van Dokkum et al. estimate.
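The isolated-case number can be reproduced in a few lines. The sketch below assumes the standard deep-MOND estimator for an isolated pressure supported system, σ⁴ = (4/81) a₀GM, and a stellar mass built from M/L = 2 and an assumed luminosity of 1.1 × 10^8 solar luminosities. The luminosity is not quoted in this post, so treat it as a placeholder.

```python
# Newton's constant in kpc (km/s)^2 / Msun.
G_KPC = 4.30091e-6
# MOND acceleration scale: 1.2e-10 m/s^2 converted to (km/s)^2 / kpc.
KM_PER_KPC = 3.0857e16
A0 = 1.2e-10 * 1e-3 * KM_PER_KPC   # about 3700 (km/s)^2 / kpc


def sigma_isolated(mass_msun):
    """Deep-MOND velocity dispersion (km/s) of an isolated system.

    Assumes sigma^4 = (4/81) * a0 * G * M.
    """
    return (4.0 / 81.0 * A0 * G_KPC * mass_msun) ** 0.25


if __name__ == "__main__":
    lum = 1.1e8              # Lsun, ASSUMED luminosity (not from this post)
    mass_to_light = 2.0      # stellar M/L adopted in the text
    sigma = sigma_isolated(mass_to_light * lum)
    print(f"isolated MOND prediction: sigma = {sigma:.1f} km/s")
```

With these inputs the result is close to 20 km/s. Because σ scales as the fourth root of the mass, even a factor of two change in M/L only moves it by about 20%.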

Estimating the external field is more challenging. It depends on the mass of NGC 1052, and the separation between it and DF2. The projected separation at the assumed distance is 80 kpc. That is well within the range that the EFE is commonly observed to matter in the Local Group. It could be a bit further granted some distance along the line of sight, but if this becomes too large then the distance by association with NGC 1052 has to be questioned, and all bets are off. The mass of NGC 1052 is also rather uncertain, or at least I have heard wildly different values quoted in discussions about this object. Here I adopt 10¹¹ M☉ as estimated by SLUGGS. To get the acceleration, I estimate the asymptotic rotation velocity we’d expect in MOND, V⁴ = a₀GM. This gives 200 km/s, which is conservative relative to the ~300 km/s quoted by van Dokkum et al. At a distance of 80 kpc, the corresponding external acceleration g_ex = 0.14 a₀. This is very uncertain, but taken at face value is indistinguishable from the internal acceleration. Consequently, it cannot be ignored: the calculation published by van Dokkum et al. is not the correct prediction for MOND.
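The external field estimate described above can be sketched as follows. This uses only the numbers in the text (a host mass of 10^11 solar masses, a separation of 80 kpc, and the MOND asymptotic velocity relation) plus the assumption that the acceleration at separation D is simply V²/D. The function names are illustrative.

```python
# Newton's constant in kpc (km/s)^2 / Msun.
G_KPC = 4.30091e-6
# MOND acceleration scale: 1.2e-10 m/s^2 converted to (km/s)^2 / kpc.
KM_PER_KPC = 3.0857e16
A0 = 1.2e-10 * 1e-3 * KM_PER_KPC


def mond_flat_velocity(mass_msun):
    """Asymptotic rotation velocity (km/s) from V^4 = a0 * G * M."""
    return (A0 * G_KPC * mass_msun) ** 0.25


def external_acceleration(mass_msun, separation_kpc):
    """External field in units of a0, assuming g_ex = V^2 / D."""
    v = mond_flat_velocity(mass_msun)
    return v**2 / separation_kpc / A0


if __name__ == "__main__":
    host_mass = 1e11      # Msun, value adopted in the text
    separation = 80.0     # kpc, projected separation from the text
    print(f"V_flat = {mond_flat_velocity(host_mass):.0f} km/s")
    print(f"g_ex = {external_acceleration(host_mass, separation):.2f} a0")
```

This gives about 200 km/s and an external field of roughly 0.13 to 0.14 a₀, the same size as the internal field. Since the true separation can only be larger than the projected one, this is an upper estimate for the adopted host mass.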

The velocity dispersion estimator in MOND differs when g_ex < g_in and g_ex > g_in (see equations 2 and 3 of McGaugh & Milgrom). Strictly speaking, these apply in the limits where one or the other field dominates. When they are comparable, the math gets more involved (see equation 59 of Famaey & McGaugh). The input data are too uncertain to warrant an elaborate calculation for a blog, so I note simply that the amplitude of the mass discrepancy in MOND depends on how deep in the MOND regime a system is. That is, how far below the critical acceleration scale it is. The lower the acceleration, the larger the discrepancy. This is why LSB galaxies appear to be dark matter dominated; their low surface densities result in low accelerations.

For DF2, the absolute magnitude of the acceleration is approximately doubled by the presence of the external field. It is not as deep in the MOND regime as assumed in the isolated case, so the mass discrepancy is smaller, decreasing the MOND-predicted velocity dispersion by roughly the square root of 2. For a factor of 2 range in the stellar mass-to-light ratio (as in McGaugh & Milgrom), this crude MOND prediction becomes

σ = 14 ± 4 km/s.
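One crude way to reproduce the central value is sketched below. It assumes that deep in the MOND regime the mass discrepancy scales inversely with the total acceleration, so that σ² scales the same way. That is a back-of-the-envelope reading of the argument above, not the full calculation (which would need the interpolation between the two regimes), and the function name is made up.

```python
import math


def efe_scaled_dispersion(sigma_isolated, g_in, g_ex):
    """Scale the isolated MOND dispersion down for an external field.

    ASSUMPTION: the mass discrepancy, and hence sigma^2, scales as
    1 / (total acceleration), with the total taken as g_in + g_ex.
    Accelerations can be in any common unit (here units of a0).
    """
    return sigma_isolated * math.sqrt(g_in / (g_in + g_ex))


if __name__ == "__main__":
    sigma_iso = 20.0   # km/s, isolated prediction from the text
    g_in = 0.15        # internal acceleration in units of a0
    g_ex = 0.14        # external acceleration in units of a0
    sigma = efe_scaled_dispersion(sigma_iso, g_in, g_ex)
    print(f"crude EFE-corrected dispersion: {sigma:.1f} km/s")
    print(f"isolated value over sqrt(2):    {sigma_iso / math.sqrt(2):.1f} km/s")
```

Both lines come out near 14 km/s, since the two accelerations are nearly equal and the total is close to double the internal value.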

Like any erstwhile theorist, I reserve the right to modify this prediction granted more elaborate calculations, or new input data, especially given the uncertainties in the distance and mass of the host. We should also consider the possibility of tidal disruption, which can happen in MOND more readily than with dark matter. Indeed, at one point I came very close to declaring MOND dead because the velocity dispersions of the ultrafaint dwarf galaxies were off, only realizing late in the day that MOND actually predicts that these things should be getting tidally disrupted (as is also expected, albeit somewhat differently, in ΛCDM), so that the velocity dispersions might not reflect the equilibrium expectation.

In DF2, the external field almost certainly matters. Barring wild errors of the sort discussed or unforeseen, I find it hard to envision the MONDian velocity dispersion falling outside the range 10 – 18 km/s. This is not as high as the 20 km/s predicted by van Dokkum et al. for an isolated object, nor as small as they measure for DF2 (8.4 km/s). They quote a 90% confidence upper limit of 10 km/s, which is marginally consistent with the lower end of the prediction (corresponding to M/L = 1). So we cannot exclude MOND based on these data.

That said, the agreement is marginal. Still, 90% is not very high confidence by scientific standards. Based on experience with such data, this likely overstates how well we know the velocity dispersion of DF2. Put another way, I am 90% confident that when better data are obtained, the measured velocity dispersion will increase above the 10 km/s threshold.

More generally, experience has taught me three things:

  1. In matters of particle physics, do not bet against the Standard Model.
  2. In matters cosmological, do not bet against ΛCDM.
  3. In matters of galaxy dynamics, do not bet against MOND.

The astute reader will realize that these three assertions are mutually exclusive. The dark matter of ΛCDM is a bet that there are new particles beyond the Standard Model. MOND is a bet that what we call dark matter is really the manifestation of physics beyond General Relativity, on which cosmology is based. Which is all to say, there is still some interesting physics to be discovered.

Degenerating problemshift: a wedged paradigm in great tightness

Reading Merritt’s paper on the philosophy of cosmology, I was struck by a particular quote from Lakatos:

A research programme is said to be progressing as long as its theoretical growth anticipates its empirical growth, that is as long as it keeps predicting novel facts with some success (“progressive problemshift”); it is stagnating if its theoretical growth lags behind its empirical growth, that is as long as it gives only post-hoc explanations either of chance discoveries or of facts anticipated by, and discovered in, a rival programme (“degenerating problemshift”) (Lakatos, 1971, pp. 104–105).

The recent history of modern cosmology is rife with post-hoc explanations of unanticipated facts. The cusp-core problem and the missing satellites problem are prominent examples. These are explained after the fact by invoking feedback, a vague catch-all that many people agree solves these problems even though none of them agree on how it actually works.

Cartoon of the feedback explanation for the difference between the galaxy luminosity function (blue line) and the halo mass function (red line). From Silk & Mamon (2012).

There are plenty of other problems. To name just a few: satellite planes (unanticipated correlations in phase space), the emptiness of voids, and the early formation of structure (see section 4 of Famaey & McGaugh for a longer list and section 6 of Silk & Mamon for a positive spin on our list). Each problem is dealt with in a piecemeal fashion, often by invoking solutions that contradict each other while buggering the principle of parsimony.

It goes like this. A new observation is made that does not align with the concordance cosmology. Hands are wrung. Debate is had. Serious concern is expressed. A solution is put forward. Sometimes it is reasonable, sometimes it is not. In either case it is rapidly accepted so long as it saves the paradigm and prevents the need for serious thought. (“Oh, feedback does that.”) The observation is no longer considered a problem through familiarity and exhaustion of patience with the debate, regardless of how [un]satisfactory the proffered solution is. The details of the solution are generally forgotten (if ever learned). When the next problem appears the process repeats, with the new solution often contradicting the now-forgotten solution to the previous problem.

This has been going on for so long that many junior scientists now seem to think this is how science is supposed to work. It is all they’ve experienced. And despite our claims to be interested in fundamental issues, most of us are impatient with re-examining issues that were thought to be settled. All it takes is one bold assertion that everything is OK, and the problem is perceived to be solved whether it actually is or not.

“Is there any more?”

That is the process we apply to little problems. The Big Problems remain the post hoc elements of dark matter and dark energy. These are things we made up to explain unanticipated phenomena. That we need to invoke them immediately casts the paradigm into what Lakatos called degenerating problemshift. Once we’re there, it is hard to see how to get out, given our propensity to overindulge in the honey that is the infinity of free parameters in dark matter models.

Note that there is another aspect to what Lakatos said about facts anticipated by, and discovered in, a rival programme. Two examples spring immediately to mind: the Baryonic Tully-Fisher Relation and the Radial Acceleration Relation. These are predictions of MOND that were unanticipated in the conventional dark matter picture. Perhaps we can come up with post hoc explanations for them, but that is exactly what Lakatos would describe as degenerating problemshift. The rival programme beat us to it.

In my experience, this is a good description of what is going on. The field of dark matter has stagnated. Experimenters look harder and harder for the same thing, repeating the same experiments in hope of a different result. Theorists turn knobs on elaborate models, gifting themselves new free parameters every time they get stuck.

On the flip side, MOND keeps predicting novel facts with some success, so it remains in the stage of progressive problemshift. Unfortunately, MOND remains incomplete as a theory, and doesn’t address many basic issues in cosmology. This is a different kind of unsatisfactory.

In the meantime, I’m still waiting to hear a satisfactory answer to the question I’ve been posing for over two decades now. Why does MOND get any predictions right? It has had many a priori predictions come true. Why does this happen? It shouldn’t. Ever.

Neutrinos got mass!

In 1984, I heard Hans Bethe give a talk in which he suggested the dark matter might be neutrinos. This sounded outlandish – from what I had just been taught about the Standard Model, neutrinos were massless. Worse, I had been given the clear impression that it would screw everything up if they did have mass. This was the pervasive attitude, even though the solar neutrino problem was known at the time. This did not compute, so many of us were inclined to ignore it. But, I thought, in the unlikely event it turned out that neutrinos did have mass, surely that would be the answer to the dark matter problem.

Flash forward a few decades, and sure enough, neutrinos do have mass. Oscillations between flavors of neutrinos have been observed in both solar and atmospheric neutrinos. This implies non-zero mass eigenstates. We don’t yet know the absolute value of the neutrino mass, but the oscillations do constrain the separation between mass states (Δm²_{ν,21} = 7.53×10⁻⁵ eV² for solar neutrinos, and Δm²_{ν,31} = 2.44×10⁻³ eV² for atmospheric neutrinos).
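The splittings do set a floor on the masses. Here is a minimal Python sketch, assuming the normal mass ordering and allowing the lightest state to be massless; both are assumptions for illustration, since the oscillation data alone fix neither.

```python
import math

# Squared-mass splittings quoted in the text, in eV^2.
DM2_21_EV2 = 7.53e-5   # solar
DM2_31_EV2 = 2.44e-3   # atmospheric


def mass_eigenstates_ev(m_lightest_ev=0.0):
    """Masses (eV) of the three states for a given lightest mass.

    ASSUMES normal ordering, m1 < m2 < m3, so that
    m2^2 = m1^2 + dm2_21 and m3^2 = m1^2 + dm2_31.
    """
    m1 = m_lightest_ev
    m2 = math.sqrt(m1**2 + DM2_21_EV2)
    m3 = math.sqrt(m1**2 + DM2_31_EV2)
    return m1, m2, m3


if __name__ == "__main__":
    masses = mass_eigenstates_ev(0.0)
    print("minimum masses (eV):", ", ".join(f"{m:.4f}" for m in masses))
    print(f"minimum sum: {sum(masses):.3f} eV")
```

Under these assumptions the heaviest state is at least about 0.05 eV and the sum is at least about 0.06 eV, so the masses cannot be arbitrarily small.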

Though the absolute values of the neutrino mass eigenstates are not yet known, there are upper limits. These don’t allow enough mass to explain the cosmological missing mass problem. The relic density of neutrinos is

Ω_ν h² = ∑m_ν/(93.5 eV)

In order to make up the dark matter density (Ω ≈ 1/4), we need ∑m_ν ≈ 12 eV. The experimental upper limit on the electron neutrino mass is m_ν < 2 eV. There are three neutrino mass eigenstates, and the difference in mass between them is tiny, so ∑m_ν < 6 eV. Neutrinos could conceivably add up to more mass than baryons, but they cannot add up to be the dark matter.
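The arithmetic behind these statements is short enough to sketch. The Hubble parameter h = 0.7 used below is an assumption (the post does not state the value used), so the required mass comes out near, rather than exactly at, the quoted 12 eV.

```python
# Conversion constant from the relic density formula in the text, in eV.
RELIC_EV = 93.5
# Dimensionless Hubble parameter. ASSUMED value, not given in the text.
H_ASSUMED = 0.7


def omega_nu(sum_masses_ev, h=H_ASSUMED):
    """Neutrino density parameter for a given sum of masses (eV)."""
    return sum_masses_ev / RELIC_EV / h**2


def sum_masses_needed_ev(omega_target, h=H_ASSUMED):
    """Sum of neutrino masses (eV) needed to reach omega_target."""
    return omega_target * h**2 * RELIC_EV


if __name__ == "__main__":
    print(f"sum needed for Omega = 0.25: {sum_masses_needed_ev(0.25):.1f} eV")
    print(f"Omega_nu at the 6 eV limit:  {omega_nu(6.0):.2f}")
```

For h = 0.7 the required sum is between 11 and 12 eV, while the 6 eV ceiling corresponds to a density parameter of about 0.13: more than the baryons, but only about half of what the dark matter requires.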

In recent years, I have started to hear the assertion that we have already detected dark matter, with neutrinos given as the example. They are particles with mass that only interact with us through the weak nuclear force and gravity. In this respect, they are like WIMPs.

Here the equivalence ends. Neutrinos are Standard Model particles that have been known for decades. WIMPs are hypothetical particles that reside in a hypothetical supersymmetric sector beyond the Standard Model. Conflating the two to imply that WIMPs are just as natural as neutrinos is a false equivalency.

That said, massive neutrinos might be one of the few ways in which hierarchical cosmogony, as we currently understand it, is falsifiable. Whatever the dark matter is, we need it to be dynamically cold. This property is necessary for it to clump into dark matter halos that seed galaxy formation. Too much hot (relativistic) dark matter (neutrinos) suppresses structure formation. A nascent dark matter halo is nary a speed bump to a neutrino moving near the speed of light: if those fast neutrinos carry too much mass, they erase structure before it can form.

One of the great successes of ΛCDM is its explanation of structure formation: the growth of large scale structure from the small fluctuations in the density field at early times. This is usually quantified by the power spectrum – in the CMB at z > 1000 and from the spatial distribution of galaxies at z = 0. This all works well provided the dominant dark mass is dynamically cold, and there isn’t too much hot dark matter fighting it.

The power spectrum from the CMB (low frequency/large scales) and the galaxy distribution (high frequency/“small” scales). Adapted from Whittle.

How much is too much? The power spectrum puts strong limits on the amount of hot dark matter that is tolerable. The upper limit is ∑m_ν < 0.12 eV. This is an order of magnitude stronger than direct experimental constraints.

Usually, it is assumed that the experimental limit will eventually come down to the structure formation limit. That does seem likely, but it is also conceivable that the neutrino mass has some intermediate value, say m_ν ≈ 1 eV. Such a result, were it to be obtained experimentally, would falsify the current CDM cosmogony.

Such a result seems unlikely, of course. Shooting for a narrow window such as the gap between the current cosmological and experimental limits is like drawing to an inside straight. It can happen, but it is unwise to bet the farm on it.

It should be noted that a circa 1 eV neutrino would have some desirable properties in a MONDian universe. MOND can form large scale structure, much like CDM, but it does so faster. This is good for clearing out the voids and getting structure in place early, but it tends to overproduce structure by z = 0. An admixture of neutrinos might help with that. A neutrino with an appreciable mass would also help with the residual mass discrepancy MOND suffers in clusters of galaxies.

If experiments measure a neutrino mass in excess of the cosmological limit, it would be powerful motivation to consider MOND-like theories as a driver of structure formation. If instead the neutrino does prove to be tiny, ΛCDM will have survived another test. That wouldn’t falsify MOND (or really have any bearing on it), but it would remove one potential “out” for the galaxy cluster problem.

Tiny though they be, neutrinos got mass! And it matters!