Dwarf Satellite Galaxies. III. The dwarfs of Andromeda

Dwarf Satellite Galaxies. III. The dwarfs of Andromeda

Like the Milky Way, our nearest giant neighbor, Andromeda (aka M31), has several dozen dwarf satellite galaxies. A few of these were known and had measured velocity dispersions at the time of my work with Joe Wolf, as discussed previously. Also like the Milky Way, the number of known objects has grown rapidly in recent years – thanks in this case largely to the PAndAS survey.

PAndAS imaged the area around M31 and M33, finding many individual red giant stars. These trace out the debris from interactions and mergers as small dwarfs are disrupted and consumed by their giant host. They also pointed up the existence of previously unknown dwarf satellites.

M31fromPANDAS_ McC2012_EPJ_19_01003
The PAndAS survey field. Dwarf satellites are circled.

As the PAndAS survey started reporting the discovery of new dwarf satellites around Andromeda, it occurred to me that this provided the opportunity to make genuine a priori predictions. These are the gold standard of the scientific method. We could use the observed luminosity and size of the newly discovered dwarfs to predict their velocity dispersions.

I tried to do this for both ΛCDM and MOND. I will not discuss the ΛCDM case much, because it can’t really be done. But it is worth understanding why this is.

In ΛCDM, the velocity dispersion is determined by the dark matter halo. This has only a tenuous connection to the observed stars, so just knowing how big and bright a dwarf is doesn’t provide much predictive power about the halo. This can be seen from this figure by Tollerud et al (2011):

Virial mass of the dark matter halo as a function of galaxy luminosity. Dwarfs satellites reside in the wide colored band of low luminosities.

This graph is obtained by relating the number density of galaxies (an observed quantity) to that of the dark matter halos in which they reside (a theoretical construct). It is highly non-linear, deviating strongly from the one-to-one line we expected early on. There is no reason to expect this particular relation; it is imposed on us by the fact that the observed luminosity function of galaxies is rather flat while the predicted halo mass function is steep. Nowadays, this is usually called the missing satellite problem, but this is a misnomer as it pervades the field.

Addressing the missing satellites problem would be another long post, so lets just accept that the relation between mass and light has to follow something like that illustrated above. If a dwarf galaxy has a luminosity of a million suns, one can read off the graph that it should live in a dark halo with a mass of about 1010 M. One could use this to predict the velocity dispersion, but not very precisely, because there’s a big range corresponding to that luminosity (the bands in the figure). It could be as much as 1011 M or as little as 109 M. This corresponds to a wide range of velocity dispersions. This wide range is unavoidable because of the difference in the luminosity function and halo mass function. Small variations in one lead to big variations in the other, and some scatter in dark halo properties is unavoidable.

Consequently, we only have a vague range of expected velocity dispersions in ΛCDM. In practice, we never make this prediction. Instead, we compare the observed velocity dispersion to the luminosity and say “gee, this galaxy has a lot of dark matter” or “hey, this one doesn’t have much dark matter.” There’s no rigorously testable prior.

In MOND, what you see is what you get. The velocity dispersion has to follow from the observed stellar mass. This is straightforward for isolated galaxies: M* ∝ σ4 – this is essentially the equivalent of the Tully-Fisher relation for pressure supported systems. If we can estimate the stellar mass from the observed luminosity, the predicted velocity dispersion follows.

Many dwarf satellites are not isolated in the MONDian sense: they are subject to the external field effect (EFE) from their giant hosts. The over-under for whether the EFE applies is the point when the internal acceleration from all the stars of the dwarf on each other is equal to the external acceleration from orbiting the giant host. The amplitude of the discrepancy in MOND depends on how low the total acceleration is relative to the critical scale a0. The external field in effect adds some acceleration that wouldn’t otherwise be there, making the discrepancy less than it would be for an isolated object. This means that two otherwise identical dwarfs may be predicted to have different velocity dispersions is they are or are not subject to the EFE. This is a unique prediction of MOND that has no analog in ΛCDM.

It is straightforward to derive the equation to predict velocity dispersions in the extreme limits of isolated (aex ≪ ain < a0) or EFE dominated (ain ≪ aex < a0) objects. In reality, there are many objects for which ain ≈ aex, and no simple formula applies. In practice, we apply the formula that more nearly applies, and pray that this approximation is good enough.

There are many other assumptions and approximations that must be made in any theory: that an object is spherical, isotropic, and in dynamical equilibrium. All of these must fail at some level, but it is the last one that is the most serious concern. In the case of the EFE, one must also make the approximation that the object is in equilibrium at the current level of the external field. That is never true, as both the amplitude and the vector of the external field vary as a dwarf orbits its host. But it might be an adequate approximation if this variation is slow. In the case of a circular orbit, only the vector varies. In general the orbits are not known, so we make the instantaneous approximation and once again pray that it is good enough. There is a fairly narrow window between where the EFE becomes important and where we slip into the regime of tidal disruption, but lets plow ahead and see how far we can get, bearing in mind that the EFE is a dynamical variable of which we only have a snapshot.

To predict the velocity dispersion in the isolated case, all we need to know is the luminosity and a stellar mass-to-light ratio. Assuming the dwarfs of Andromeda to be old stellar populations, I adopted a V-band mass-to-light ratio of 2 give or take a factor of 2. That usually dominates the uncertainty, though the error in the distance can sometimes impact the luminosity at a level that impacts the prediction.

To predict the velocity dispersion in the EFE case, we again need the stellar mass, but now also need to know the size of the stellar system and the intensity of the external field to which it is subject. The latter depends on the mass of the host galaxy and the distance from it to the dwarf. This latter quantity is somewhat fraught: it is straightforward to measure the projected distance on the sky, but we need the 3D distance – how far in front or behind each dwarf is as well as its projected distance from the host. This is often a considerable contributor to the error budget. Indeed, some dwarfs may be inferred to be in the EFE regime for the low end of the range of adopted stellar mass-to-light ratio, and the isolated regime for the high end.

In this fashion, we predicted velocity dispersions for the dwarfs of Andromeda. We in this case were Milgrom and myself. I had never collaborated with him before, and prefer to remain independent. But I also wanted to be sure I got the details described above right. Though it wasn’t much work to make the predictions once the preliminaries were established, it was time consuming to collect and vet the data. As we were writing the paper, velocity dispersion measurements started to appear. People like Michelle Collins, Erik Tollerud, and Nicolas Martin were making follow-up observations, and publishing velocity dispersion for the objects we were making predictions for. That was great, but they were too good – they were observing and publishing faster than we could write!

Nevertheless, we managed to make and publish a priori predictions for 10 dwarfs before any observational measurements were published. We also made blind predictions for the other known dwarfs of Andromeda, and checked the predicted velocity dispersions against all measurements that we could find in the literature. Many of these predictions were quickly tested by on-going programs (i.e., people were out to measure velocity dispersions, whether we predicted them or not). Enough data rolled in that we were soon able to write a follow-up paper testing our predictions.

Nailed it. Good data were soon available to test the predictions for 8 of the 10* a priori cases. All 8 were consistent with our predictions. I was particularly struck by the case of And XXVIII, which I had called out as perhaps the best test. It was isolated, so the messiness of the EFE didn’t apply, and the uncertainties were low. Moreover, the predicted velocity dispersion was low – a good deal lower than broadly expected in ΛCDM: 4.3 km/s, with an uncertainty just under 1 km/s. Two independent observations were subsequently reported. One found 4.9 ± 1.6 km/s, the other 6.6 ± 2.1 km/s, both in good agreement within the uncertainties.

We made further predictions in the second paper as people had continued to discover new dwarfs. These also came true. Here is a summary plot for all of the dwarfs of Andromeda:

The velocity dispersions of the dwarf satellites of Andromeda. Each numbered box corresponds to one dwarf (x=1 is for And I and so on). Measured velocity dispersions have a number next to them that is the number of stars on which the measurement is based. MOND predictions are circles: green if isolated, open if the EFE applies. Points appear within each box in the order they appeared in the literature, from left to right. The vast majority of Andromeda’s dwarfs are consistent with MOND (large green circles). Two cases are ambiguous (large yellow circles), having velocity dispersions based only a few stars. Only And V appears to be problematic (large red circle).

MOND works well for And I, And II, And III, And VI, And VII, And IX, And X, And XI, And XII, And XIII, And XIV, And XV, And XVI, And XVII, And XVIII, And XIX, And XX, And XXI, And XXII, And XXIII, And XXIV, And XXV, And XXVIII, And XXIX, And XXXI, And XXXII, and And XXXIII. There is one problematic case: And V. I don’t know what is going on there, but note that systematic errors frequently happen in astronomy. It’d be strange if there weren’t at least one goofy case.

Nevertheless, the failure of And V could be construed as a falsification of MOND. It ought to work in every single case. But recall the discussion of assumptions and uncertainties above. Is falsification really the story these data tell?

We do have experience with various systematic errors. For example, we predicted that the isolated dwarfs spheroidal Cetus should have a velocity dispersion in MOND of 8.2 km/s. There was already a published measurement of 17 ± 2 km/s, so we reported that MOND was wrong in this case by over 3σ. Or at least we started to do so. Right before we submitted that paper, a new measurement appeared: 8.3 ± 1 km/s. This is an example of how the data can sometimes change by rather more than the formal error bars suggest is possible. In this case, I suspect the original observations lacked the spectral resolution to resolve the velocity dispersion. At any rate, the new measurement (8.3 km/s) was somewhat more consistent with our prediction (8.2 km/s).

The same predictions cannot even be made in ΛCDM. The velocity data can always be fit once they are in hand. But there is no agreed method to predict the velocity dispersion of a dwarf from its observed luminosity. As discussed above, this should not even be possible: there is too much scatter in the halo mass-stellar mass relation at these low masses.

An unsung predictive success of MOND absent from the graph above is And IV. When And IV was discovered in the general direction of Andromeda, it was assumed to be a new dwarf satellite – hence the name. Milgrom looked at the velocities reported for this object, and said it had to be a background galaxy. No way it could be a dwarf satellite – at least not in MOND. I see no reason why it couldn’t have been in ΛCDM. It is absent from the graph above, because it was subsequently confirmed to be much farther away (7.2 Mpc vs. 750 kpc for Andromeda).

The box for And XVII is empty because this system is manifestly out of equilibrium. It is more of a stellar stream than a dwarf, appearing as a smear in the PAndAS image rather than as a self-contained dwarf. I do not recall what the story with the other missing object (And VIII) is.

While writing the follow-up paper, I also noticed that there were a number of Andromeda dwarfs that were photometrically indistinguishable: basically the same in terms of size and stellar mass. But some were isolated while others were subject to the EFE. MOND predicts that the EFE cases should have lower velocity dispersion than the isolated equivalents.

The velocity dispersions of the dwarfs of Andromeda, highlighting photometrically matched pairs – dwarfs that should be indistinguishable, but aren’t because of the EFE.

And XXVIII (isolated) has a higher velocity dispersion than its near-twin And XVII (EFE). The same effect might be acting in And XVIII (isolated) and And XXV (EFE). This is clear if we accept the higher velocity dispersion measurement for And XVIII, but an independent measurements begs to differ. The former has more stars, so is probably more reliable, but we should be cautious. The effect is not clear in And XVI (isolated) and And XXI (EFE), but the difference in the prediction is small and the uncertainties are large.

An aggressive person might argue that the pairs of dwarfs is a positive detection of the EFE. I don’t think the data for the matched pairs warrant that, at least not yet. On the other hand, the appropriate use of the EFE was essential to all the predictions, not just the matched pairs.

The positive detection of the EFE is important, as it is a unique prediction of MOND. I see no way to tune ΛCDM galaxy simulations to mimic this effect. Of course, there was a  very recent time when it seemed impossible for them to mimic the isolated predictions of MOND. They claim to have come a long way in that regard.

But that’s what we’re stuck with: tuning ΛCDM to make it look like MOND. This is why a priori predictions are important. There is ample flexibility to explain just about anything with dark matter. What we can’t seem to do is predict the same things that MOND successfully predicts… predictions that are both quantitative and very specific. We’re not arguing that dwarfs in general live in ~15 or 30 km/s halos, as we must in ΛCDM. In MOND we can say this dwarf will have this velocity dispersion and that dwarf will have that velocity dispersion. We can distinguish between 4.9 and 7.3 km/s. And we can do it over and over and over. I see no way to do the equivalent in ΛCDM, just as I see no way to explain the acoustic power spectrum of the CMB in MOND.

This is not to say there are no problematic cases for MOND. Read, Walker, & Steger have recently highlighted the matched pair of Draco and Carina as an issue. And they are – though here I already have reason to suspect Draco is out of equilibrium, which makes it challenging to analyze. Whether it is actually out of equilibrium or not is a separate question.

I am not thrilled that we are obliged to invoke non-equilibrium effects in both theories. But there is a difference. Brada & Milgrom provided a quantitative criterion to indicate when this was an issue before I ran into the problem. In ΛCDM, the low velocity dispersions of objects like And XIX, XXI, XXV and Crater 2 came as a complete surprise despite having been predicted by MOND. Tidal disruption was only invoked after the fact – and in an ad hoc fashion. There is no way to know in advance which dwarfs are affected, as there is no criterion equivalent to that of Brada. We just say “gee, that’s a low velocity dispersion. Must have been disrupted.” That might be true, but it gives no explanation for why MOND predicted it in the first place – which is to say, it isn’t really an explanation at all.

I still do not understand is why MOND gets any predictions right if ΛCDM is the universe we live in, let alone so many. Shouldn’t happen. Makes no sense.

If this doesn’t confuse you, you are not thinking clearly.

*The other two dwarfs were also measured, but with only 4 stars in one and 6 in the other. These are too few for a meaningful velocity dispersion measurement.


Dwarf Satellite Galaxies and Low Surface Brightness Galaxies in the Field. I.

Dwarf Satellite Galaxies and Low Surface Brightness Galaxies in the Field. I.

The Milky Way and its nearest giant neighbor Andromeda (M31) are surrounded by a swarm of dwarf satellite galaxies. Aside from relatively large beasties like the Large Magellanic Cloud or M32, the majority of these are the so-called dwarf spheroidals. There are several dozen examples known around each giant host, like the Fornax dwarf pictured above.

Dwarf Spheroidal (dSph) galaxies are ellipsoidal blobs devoid of gas that typically contain a million stars, give or take an order of magnitude. Unlike globular clusters, that may have a similar star count, dSphs are diffuse, with characteristic sizes of hundreds of parsecs (vs. a few pc for globulars). This makes them among the lowest surface brightness systems known.

This subject has a long history, and has become a major industry in recent years. In addition to the “classical” dwarfs that have been known for decades, there have also been many comparatively recent discoveries, often of what have come to be called “ultrafaint” dwarfs. These are basically dSphs with luminosities less than 100,000 suns, sometimes being comprised of as little as a few hundred stars. New discoveries are being made still, and there is reason to hope that the LSST will discover many more. Summed up, the known dwarf satellites are proverbial drops in the bucket compared to their giant hosts, which contain hundreds of billions of stars. Dwarfs could rain in for a Hubble time and not perturb the mass budget of the Milky Way.

Nevertheless, tiny dwarf Spheroidals are excellent tests of theories like CDM and MOND. Going back to the beginning, in the early ’80s, Milgrom was already engaged in a discussion about the predictions of his then-new theory (before it was even published) with colleagues at the IAS, where he had developed the idea during a sabbatical visit. They were understandably skeptical, preferring – as many still do – to believe that some unseen mass was the more conservative hypothesis. Dwarf spheroidals came up even then, as their very low surface brightness meant low acceleration in MOND. This in turn meant large mass discrepancies. If you could measure their dynamics, they would have large mass-to-light ratios. Larger than could be explained by stars conventionally, and larger than the discrepancies already observed in bright galaxies like Andromeda.

This prediction of Milgrom’s – there from the very beginning – is important because of how things change (or don’t). At that time, Scott Tremaine summed up the contrasting expectation of the conventional dark matter picture:

“There is no reason to expect that dwarfs will have more dark matter than bright galaxies.” *

This was certainly the picture I had in my head when I first became interested in low surface brightness (LSB) galaxies in the mid-80s. At that time I was ignorant of MOND; my interest was piqued by the argument of Disney that there could be a lot of as-yet undiscovered LSB galaxies out there, combined with my first observing experiences with the then-newfangled CCD cameras which seemed to have a proclivity for making clear otherwise hard-to-see LSB features. At the time, I was interested in finding LSB galaxies. My interest in what made them rotate came  later.

The first indication, to my knowledge, that dSph galaxies might have large mass discrepancies was provided by Marc Aaronson in 1983. This tentative discovery was hugely important, but the velocity dispersion of Draco (one of the “classical” dwarfs) was based on only 3 stars, so was hardly definitive. Nevertheless, by the end of the ’90s, it was clear that large mass discrepancies were a defining characteristic of dSphs. Their conventionally computed M/L went up systematically as their luminosity declined. This was not what we had expected in the dark matter picture, but was, at least qualitatively, in agreement with MOND.

My own interests had focused more on LSB galaxies in the field than on dwarf satellites like Draco. Greg Bothun and Jim Schombert had identified enough of these to construct a long list of LSB galaxies that served as targets my for Ph.D. thesis. Unlike the pressure-supported ellipsoidal blobs of stars that are the dSphs, the field LSBs we studied were gas rich, rotationally supported disks – mostly late type galaxies (Sd, Sm, & Irregulars). Regardless of composition, gas or stars, low surface density means that MOND predicts low acceleration. This need not be true conventionally, as the dark matter can do whatever the heck it wants. Though I was blissfully unaware of it at the time, we had constructed the perfect sample for testing MOND.

Having studied the properties of our sample of LSB galaxies, I developed strong ideas about their formation and evolution. Everything we had learned – their blue colors, large gas fractions, and low star formation rates – suggested that they evolved slowly compared to higher surface brightness galaxies. Star formation gradually sputtered along, having a hard time gathering enough material to make stars in their low density interstellar media. Perhaps they even formed late, an idea I took a shining to in the early ’90s. This made two predictions: field LSB galaxies should be less strongly clustered than bright galaxies, and should spin slower at a given mass.

The first prediction follows because the collapse time of dark matter halos correlates with their larger scale environment. Dense things collapse first and tend to live in dense environments. If LSBs were low surface density because they collapsed late, it followed that they should live in less dense environments.

I didn’t know how to test this prediction. Fortunately, fellow postdoc and office mate in Cambridge at the time, Houjun Mo, did. It came true. The LSB galaxies I had been studying were clustered like other galaxies, but not as strongly. This was exactly what I expected, and I thought sure we were on to something. All that remained was to confirm the second prediction.

At the time, we did not have a clear idea of what dark matter halos should be like. NFW halos were still in the future. So it seemed reasonable that late forming halos should have lower densities (lower concentrations in the modern terminology). More importantly, the sum of dark and luminous density was certainly less. Dynamics follow from the distribution of mass as Velocity2 ∝ Mass/Radius. For a given mass, low surface brightness galaxies had a larger radius, by construction. Even if the dark matter didn’t play along, the reduction in the concentration of the luminous mass should lower the rotation velocity.

Indeed, the standard explanation of the Tully-Fisher relation was just this. Aaronson, Huchra, & Mould had argued that galaxies obeyed the Tully-Fisher relation because they all had essentially the same surface brightness (Freeman’s law) thereby taking variation in the radius out of the equation: galaxies of the same mass all had the same radius. (If you are a young astronomer who has never heard of Freeman’s law, you’re welcome.) With our LSB galaxies, we had a sample that, by definition, violated Freeman’s law. They had large radii for a given mass. Consequently, they should have lower rotation velocities.

Up to that point, I had not taken much interest in rotation curves. In contrast, colleagues at the University of Groningen were all about rotation curves. Working with Thijs van der Hulst, Erwin de Blok, and Martin Zwaan, we set out to quantify where LSB galaxies fell in relation to the Tully-Fisher relation. I confidently predicted that they would shift off of it – an expectation shared by many at the time. They did not.

The Tully-Fisher relation: disk mass vs. flat rotation speed (circa 1996). Galaxies are binned by surface brightness with the highest surface brightness galaxies marked red and the lowest blue. The lines show the expected shift following the argument of Aaronson et al. Contrary to this expectation, galaxies of all surface brightnesses follow the same Tully-Fisher relation.

I was flummoxed. My prediction was wrong. That of Aaronson et al. was wrong. Poking about the literature, everyone who had made a clear prediction in the conventional context was wrong. It made no sense.

I spent months banging my head against the wall. One quick and easy solution was to blame the dark matter. Maybe the rotation velocity was set entirely by the dark matter, and the distribution of luminous mass didn’t come into it. Surely that’s what the flat rotation velocity was telling us? All about the dark matter halo?

Problem is, we measure the velocity where the luminous mass still matters. In galaxies like the Milky Way, it matters quite a lot. It does not work to imagine that the flat rotation velocity is set by some property of the dark matter halo alone. What matters to what we measure is the combination of luminous and dark mass. The luminous mass is important in high surface brightness galaxies, and progressively less so in lower surface brightness galaxies. That should leave some kind of mark on the Tully-Fisher relation, but it doesn’t.

Residuals from the Tully-Fisher relation as a function of size at a given mass. Compact galaxies are to the left, diffuse ones to the right. The red dashed line is what Newton predicts: more compact galaxies should rotate faster at a given mass. Fundamental physics? Tully-Fisher don’t care. Tully-Fisher don’t give a sh*t.

I worked long and hard to understand this in terms of dark matter. Every time I thought I had found the solution, I realized that it was a tautology. Somewhere along the line, I had made an assumption that guaranteed that I got the answer I wanted. It was a hopeless fine-tuning problem. The only way to satisfy the data was to have the dark matter contribution scale up as that of the luminous mass scaled down. The more stretched out the light, the more compact the dark – in exact balance to maintain zero shift in Tully-Fisher.

This made no sense at all. Over twenty years on, I have yet to hear a satisfactory conventional explanation. Most workers seem to assert, in effect, that “dark matter does it” and move along. Perhaps they are wise to do so.

Working on the thing can drive you mad.

As I was struggling with this issue, I happened to hear a talk by Milgrom. I almost didn’t go. “Modified gravity” was in the title, and I remember thinking, “why waste my time listening to that nonsense?” Nevertheless, against my better judgement, I went. Not knowing that anyone in the audience worked on either LSB galaxies or Tully-Fisher, Milgrom proceeded to derive the MOND prediction:

“The asymptotic circular velocity is determined only by the total mass of the galaxy: Vf4 = a0GM.”

In a few lines, he derived rather trivially what I had been struggling to understand for months. The lack of surface brightness dependence in Tully-Fisher was entirely natural in MOND. It falls right out of the modified force law, and had been explicitly predicted over a decade before I struggled with the problem.

I scraped my jaw off the floor, determined to examine this crazy theory more closely. By the time I got back to my office, cognitive dissonance had already started to set it. Couldn’t be true. I had more pressing projects to complete, so I didn’t think about it again for many moons.

When I did, I decided I should start by reading the original MOND papers. I was delighted to find a long list of predictions, many of them specifically to do with surface brightness. We had just collected fresh data on LSB galaxies, which provided a new window on the low acceleration regime. I had the data to finally falsify this stupid theory.

Or so I thought. As I went through the list of predictions, my assumption that MOND had to be wrong was challenged by each item. It was barely an afternoon’s work: check, check, check. Everything I had struggled for months to understand in terms of dark matter tumbled straight out of MOND.

I was faced with a choice. I knew this would be an unpopular result. I could walk away and simply pretend I had never run across it. That’s certainly how it had been up until then: I had been blissfully unaware of MOND and its perniciously successful predictions. No need to admit otherwise.

Had I realized just how unpopular it would prove to be, maybe that would have been the wiser course. But even contemplating such a course felt criminal. I was put in mind of Paul Gerhardt’s admonition for intellectual honesty:

“When a man lies, he murders some part of the world.”

Ignoring what I had learned seemed tantamount to just that. So many predictions coming true couldn’t be an accident. There was a deep clue here; ignoring it wasn’t going to bring us closer to the truth. Actively denying it would be an act of wanton vandalism against the scientific method.

Still, I tried. I looked long and hard for reasons not to report what I had found. Surely there must be some reason this could not be so?

Indeed, the literature provided many papers that claimed to falsify MOND. To my shock, few withstood critical examination. Commonly a straw man representing MOND was falsified, not MOND itself. At a deeper level, it was implicitly assumed that any problem for MOND was an automatic victory for dark matter. This did not obviously follow, so I started re-doing the analyses for both dark matter and MOND. More often than not, I found either that the problems for MOND were greatly exaggerated, or that the genuinely problematic cases were a problem for both theories. Dark matter has more flexibility to explain outliers, but outliers happen in astronomy. All too often the temptation was to refuse to see the forest for a few trees.

The first MOND analysis of the classical dwarf spheroidals provides a good example. Completed only a few years before I encountered the problem, these were low surface brightness systems that were deep in the MOND regime. These were gas poor, pressure supported dSph galaxies, unlike my gas rich, rotating LSB galaxies, but the critical feature was low surface brightness. This was the most directly comparable result. Better yet, the study had been made by two brilliant scientists (Ortwin Gerhard & David Spergel) whom I admire enormously. Surely this work would explain how my result was a mere curiosity.

Indeed, reading their abstract, it was clear that MOND did not work for the dwarf spheroidals. Whew: LSB systems where it doesn’t work. All I had to do was figure out why, so I read the paper.

As I read beyond the abstract, the answer became less and less clear. The results were all over the map. Two dwarfs (Sculptor and Carina) seemed unobjectionable in MOND. Two dwarfs (Draco and Ursa Minor) had mass-to-light ratios that were too high for stars, even in MOND. That is, there still appeared to be a need for dark matter even after MOND had been applied. One the flip side, Fornax had a mass-to-light ratio that was too low for the old stellar populations assumed to dominate dwarf spheroidals. Results all over the map are par for the course in astronomy, especially for a pioneering attempt like this. What were the uncertainties?

Milgrom wrote a rebuttal. By then, there were measured velocity dispersions for two more dwarfs. Of these seven dwarfs, he found that

“within just the quoted errors on the velocity dispersions and the luminosities, the MOND M/L values for all seven dwarfs are perfectly consistent with stellar values, with no need for dark matter.”

Well, he would say that, wouldn’t he? I determined to repeat the analysis and error propagation.

Mass-to-light ratios determined with MOND for eight dwarf spheroidals (named, as published in McGaugh & de Blok 1998). The various symbols refer to different determinations. Mine are the solid circles. The dashed lines show the plausible range for stellar populations.

The net result: they were both right. M/L was still too high for Draco and Ursa Minor, and still too low for Fornax. But this was only significant at the 2σ level, if that – hardly enough to condemn a theory. Carina, Leo I, Leo II, Sculptor, and Sextans all had fairly reasonable mass-to-light ratios. The voting is different now. Instead of going 2 for 5 as Gerhard & Spergel found, MOND was now 5 for 8. One could choose to obsess about the outliers, or one could choose to see a more positive pattern.  Either a positive or a negative spin could be put on this result. But it was clearly more positive than the first attempt had indicated.

The mass estimator in MOND scales as the fourth power of velocity (or velocity dispersion in the case of isolated dSphs), so the too-high M*/L of Draco and Ursa Minor didn’t disturb me too much. A small overestimation of the velocity dispersion would lead to a large overestimation of the mass-to-light ratio. Just about every systematic uncertainty one can think of pushes in this direction, so it would be surprising if such an overestimate didn’t happen once in a while.

Given this, I was more concerned about the low M*/L of Fornax. That was weird.

Up until that point (1998), we had been assuming that the stars in dSphs were all old, like those in globular clusters. That corresponds to a high M*/L, maybe 3 in solar units in the V-band. Shortly after this time, people started to look closely at the stars in the classical dwarfs with the Hubble. Low and behold, the stars in Fornax were surprisingly young. That means a low M*/L, 1 or less. In retrospect, MOND was trying to tell us that: it returned a low M*/L for Fornax because the stars there are young. So what was taken to be a failing of the theory was actually a predictive success.


And Gee. This is a long post. There is a lot more to tell, but enough for now.

*I have a long memory, but it is not perfect. I doubt I have the exact wording right, but this does accurately capture the sentiment from the early ’80s when I was an undergraduate at MIT and Scott Tremaine was on the faculty there.

A brief history of the acceleration discrepancy

A brief history of the acceleration discrepancy

As soon as I wrote it, I realized that the title is much more general than anything that can be fit in a blog post. Bekenstein argued long ago that the missing mass problem should instead be called the acceleration discrepancy, because that’s what it is – a discrepancy that occurs in conventional dynamics at a particular acceleration scale. So in that sense, it is the entire history of dark matter. For that, I recommend the excellent book The Dark Matter Problem: A Historical Perspective by Bob Sanders.

Here I mean more specifically my own attempts to empirically constrain the relation between the mass discrepancy and acceleration. Milgrom introduced MOND in 1983, no doubt after a long period of development and refereeing. He anticipated essentially all of what I’m going to describe. But not everyone is eager to accept MOND as a new fundamental theory, and often suffer from a very human tendency to confuse fact and theory. So I have gone out of my way to demonstrate what is empirically true in the data – facts – irrespective of theoretical interpretation (MOND or otherwise).

What is empirically true, and now observationally established beyond a reasonable doubt, is that the mass discrepancy in rotating galaxies correlates with centripetal acceleration. The lower the acceleration, the more dark matter one appears to need. Or, as Bekenstein might have put it, the amplitude of the acceleration discrepancy grows as the acceleration itself declines.

Bob Sanders made the first empirical demonstration that I am aware of that the mass discrepancy correlates with acceleration. In a wide ranging and still relevant 1990 review, he showed that the amplitude of the mass discrepancy correlated with the acceleration at the last measured point of a rotation curve. It did not correlate with radius.

The acceleration discrepancy from Sanders (1990).

I was completely unaware of this when I became interested in the problem a few years later. I wound up reinventing the very same term – the mass discrepancy, which I defined as the ratio of dynamically measured mass to that visible in baryons: D = Mtot/Mbar. When there is no dark matter, Mtot = Mbar and D = 1.

My first demonstration of this effect was presented at a conference at Rutgers in 1998. This considered the mass discrepancy at every radius and every acceleration within all the galaxies that were available to me at that time. Though messy, as is often the case in extragalactic astronomy, the correlation was clear. Indeed, this was part of a broader review of galaxy formation; the title, abstract, and much of the substance remains relevant today.

The mass discrepancy – the ratio of dynamically measured mass to that visible in luminous stars and gas – as a function of centripetal acceleration. Each point is a measurement along a rotation curve; two dozen galaxies are plotted together. A constant mass-to-light ratio is assumed for all galaxies.

I spent much of the following five years collecting more data, refining the analysis, and sweating the details of uncertainties and systematic instrumental effects. In 2004, I published an extended and improved version, now with over 5 dozen galaxies.

One panel from Fig. 5 of McGaugh (2004). The mass discrepancy is plotted against the acceleration predicted by the baryons (in units of km2 s2 kpc-1).

Here I’ve used a population synthesis model to estimate the mass-to-light ratio of the stars. This is the only unknown; everything else is measured. Note that the vast majority galaxies land on top of each other. There are a few that do not, as you can perceive in the parallel sets of points offset from the main body. But that happens in only a few cases, as expected – no population model is perfect. Indeed, this one was surprisingly good, as the vast majority of the individual galaxies are indistinguishable in the pile that defines the main relation.

I explored the how the estimation of the stellar mass-to-light ratio affected this mass discrepancy-acceleration relation in great detail in the 2004 paper. The details differ with the choice of estimator, but the bottom line was that the relation persisted for any plausible choice. The relation exists. It is an empirical fact.

At this juncture, further improvement was no longer limited by rotation curve data, which is what we had been working to expand through the early ’00s. Now it was the stellar mass. The measurement of stellar mass was based on optical measurements of the luminosity distribution of stars in galaxies. These are perfectly fine data, but it is hard to map the starlight that we measured to the stellar mass that we need for this relation. The population synthesis models were good, but they weren’t good enough to avoid the occasional outlier, as can be seen in the figure above.

One thing the models all agreed on (before they didn’t, then they did again) was that the near-infrared would provide a more robust way of mapping stellar mass than the optical bands we had been using up till then. This was the clear way forward, and perhaps the only hope for improving the data further. Fortunately, technology was keeping pace. Around this time, I became involved in helping the effort to develop the NEWFIRM near-infrared camera for the national observatories, and NASA had just launched the Spitzer space telescope. These were the right tools in the right place at the right time. Ultimately, the high accuracy of the deep images obtained from the dark of space by Spitzer at 3.6 microns were to prove most valuable.

Jim Schombert and I spent much of the following decade observing in the near-infrared. Many other observers were doing this as well, filling the Spitzer archive with useful data while we concentrated on our own list of low surface brightness galaxies. This paragraph cannot suffice to convey the long term effort and enormity of this program. But by the mid-teens, we had accumulated data for hundreds of galaxies, including all those for which we also had rotation curves and HI observations. The latter had been obtained over the course of decades by an entire independent community of radio observers, and represent an integrated effort that dwarfs our own.

On top of the observational effort, Jim had been busy building updated stellar population models. We have a sophisticated understanding of how stars work, but things can get complicated when you put billions of them together. Nevertheless, Jim’s work – and that of a number of independent workers – indicated that the relation between Spitzer’s 3.6 micron luminosity measurements and stellar mass should be remarkably simple – basically just a constant conversion factor for nearly all star forming galaxies like those in our sample.

Things came together when Federico Lelli joined Case Western as a postdoc in 2014. He had completed his Ph.D. in the rich tradition of radio astronomy, and was the perfect person to move the project forward. After a couple more years of effort, curating the rotation curve data and building mass models from the Spitzer data, we were in the position to build the relation for over a dozen dozen galaxies. With all the hard work done, making the plot was a matter of running a pre-prepared computer script.

Federico ran his script. The plot appeared on his screen. In a stunned voice, he called me into his office. We had expected an improvement with the Spitzer data – hence the decade of work – but we had also expected there to be a few outliers. There weren’t. Any.

All. the. galaxies. fell. right. on. top. of. each. other.

The radial acceleration relation. The centripetal acceleration measured from rotation curves is plotted against that predicted by the observed baryons. 2693 points from 153 distinct galaxies are plotted together (bluescale); individual galaxies do not distinguish themselves in this plot. Indeed, the width of the scatter (inset) is entirely explicable by observational uncertainties and the expected scatter in stellar mass-to-light ratios. From McGaugh et al. (2016).

This plot differs from those above because we had decided to plot the measured acceleration against that predicted by the observed baryons so that the two axes would be independent. The discrepancy, defined as the ratio, depended on both. D is essentially the ratio of the y-axis to the x-axis of this last plot, dividing out the unity slope where D = 1.

This was one of the most satisfactory moments of my long career, in which I have been fortunate to have had many satisfactory moments. It is right up there with the eureka moment I had that finally broke the long-standing loggerhead about the role of selection effects in Freeman’s Law. (Young astronomers – never heard of Freeman’s Law? You’re welcome.) Or the epiphany that, gee, maybe what we’re calling dark matter could be a proxy for something deeper. It was also gratifying that it was quickly recognized as such, with many of the colleagues I first presented it to saying it was the highlight of the conference where it was first unveiled.

Regardless of the ultimate interpretation of the radial acceleration relation, it clearly exists in the data for rotating galaxies. The discrepancy appears at a characteristic acceleration scale, g = 1.2 x 10-10 m/s/s. That number is in the data. Why? is a deeply profound question.

It isn’t just that the acceleration scale is somehow fundamental. The amplitude of the discrepancy depends systematically on the acceleration. Above the critical scale, all is well: no need for dark matter. Below it, the amplitude of the discrepancy – the amount of dark matter we infer – increases systematically. The lower the acceleration, the more dark matter one infers.

The relation for rotating galaxies has no detectable scatter – it is a near-perfect relation. Whether this persists, and holds for other systems, is the interesting outstanding question. It appears, for example, that dwarf spheroidal galaxies may follow a slightly different relation. However, the emphasis here is on slighlty. Very few of these data pass the same quality criteria that the SPARC data plotted above do. It’s like comparing mud pies with diamonds.

Whether the scatter in the radial acceleration relation is zero or merely very tiny is important. That’s the difference between a new fundamental force law (like MOND) and a merely spectacular galaxy scaling relation. For this reason, it seems to be controversial. It shouldn’t be: I was surprised at how tight the relation was myself. But I don’t get to report that there is lots of scatter when there isn’t. To do so would be profoundly unscientific, regardless of the wants of the crowd.

Of course, science is hard. If you don’t do everything right, from the measurements to the mass models to the stellar populations, you’ll find some scatter where perhaps there isn’t any. There are so many creative ways to screw up that I’m sure people will continue to find them. Myself, I prefer to look forward: I see no need to continuously re-establish what has been repeatedly demonstrated in the history briefly outlined above.

Degenerating problemshift: a wedged paradigm in great tightness

Degenerating problemshift: a wedged paradigm in great tightness

Reading Merritt’s paper on the philosophy of cosmology, I was struck by a particular quote from Lakatos:

A research programme is said to be progressing as long as its theoretical growth anticipates its empirical growth, that is as long as it keeps predicting novel facts with some success (“progressive problemshift”); it is stagnating if its theoretical growth lags behind its empirical growth, that is as long as it gives only post-hoc explanations either of chance discoveries or of facts anticipated by, and discovered in, a rival programme (“degenerating problemshift”) (Lakatos, 1971, pp. 104–105).

The recent history of modern cosmology is rife with post-hoc explanations of unanticipated facts. The cusp-core problem and the missing satellites problem are prominent examples. These are explained after the fact by invoking feedback, a vague catch-all that many people agree solves these problems even though none of them agree on how it actually works.

Cartoon of the feedback explanation for the difference between the galaxy luminosity function (blue line) and the halo mass function (red line). From Silk & Mamon (2012).

There are plenty of other problems. To name just a few: satellite planes (unanticipated correlations in phase space), the emptiness of voids, and the early formation of structure  (see section 4 of Famaey & McGaugh for a longer list and section 6 of Silk & Mamon for a positive spin on our list). Each problem is dealt with in a piecemeal fashion, often by invoking solutions that contradict each other while buggering the principle of parsimony.

It goes like this. A new observation is made that does not align with the concordance cosmology. Hands are wrung. Debate is had. Serious concern is expressed. A solution is put forward. Sometimes it is reasonable, sometimes it is not. In either case it is rapidly accepted so long as it saves the paradigm and prevents the need for serious thought. (“Oh, feedback does that.”) The observation is no longer considered a problem through familiarity and exhaustion of patience with the debate, regardless of how [un]satisfactory the proffered solution is. The details of the solution are generally forgotten (if ever learned). When the next problem appears the process repeats, with the new solution often contradicting the now-forgotten solution to the previous problem.

This has been going on for so long that many junior scientists now seem to think this is how science is suppose to work. It is all they’ve experienced. And despite our claims to be interested in fundamental issues, most of us are impatient with re-examining issues that were thought to be settled. All it takes is one bold assertion that everything is OK, and the problem is perceived to be solved whether it actually is or not.

“Is there any more?”

That is the process we apply to little problems. The Big Problems remain the post hoc elements of dark matter and dark energy. These are things we made up to explain unanticipated phenomena. That we need to invoke them immediately casts the paradigm into what Lakatos called degenerating problemshift. Once we’re there, it is hard to see how to get out, given our propensity to overindulge in the honey that is the infinity of free parameters in dark matter models.

Note that there is another aspect to what Lakatos said about facts anticipated by, and discovered in, a rival programme. Two examples spring immediately to mind: the Baryonic Tully-Fisher Relation and the Radial Acceleration Relation. These are predictions of MOND that were unanticipated in the conventional dark matter picture. Perhaps we can come up with post hoc explanations for them, but that is exactly what Lakatos would describe as degenerating problemshift. The rival programme beat us to it.

In my experience, this is a good description of what is going on. The field of dark matter has stagnated. Experimenters look harder and harder for the same thing, repeating the same experiments in hope of a different result. Theorists turn knobs on elaborate models, gifting themselves new free parameters every time they get stuck.

On the flip side, MOND keeps predicting novel facts with some success, so it remains in the stage of progressive problemshift. Unfortunately, MOND remains incomplete as a theory, and doesn’t address many basic issues in cosmology. This is a different kind of unsatisfactory.

In the mean time, I’m still waiting to hear a satisfactory answer to the question I’ve been posing for over two decades now. Why does MOND get any predictions right? It has had many a priori predictions come true. Why does this happen? It shouldn’t. Ever.

Cosmology and Convention (continued)

Cosmology and Convention (continued)
Note: this is a guest post by David Merritt, following on from his paper on the philosophy of science as applied to aspects of modern cosmology.

Stacy kindly invited me to write a guest post, expanding on some of the arguments in my paper. I’ll start out by saying that I certainly don’t think of my paper as a final word on anything. I see it more like an opening argument — and I say this, because it’s my impression that the issues which it raises have not gotten nearly the attention they deserve from the philosophers of science. It is that community that I was hoping to reach, and that fact dictated much about the content and style of the paper. Of course, I’m delighted if astrophysicists find something interesting there too.

My paper is about epistemology, and in particular, whether the standard cosmological model respects Popper’s criterion of falsifiability — which he argued (quite convincingly) is a necessary condition for a theory to be considered scientific. Now, falsifying a theory requires testing it, and testing it means (i) using the theory to make a prediction, then (ii) checking to see if the prediction is correct. In the case of dark matter, the cleanest way I could think of to do this was via so-called  “direct detection”, since the rotation curve of the Milky Way makes a pretty definite prediction about the density of dark matter at the Sun’s location. (Although as I argued, even this is not enough, since the theory says nothing at all about the likelihood that the DM particles will interact with normal matter even if they are present in a detector.)

What about the large-scale evidence for dark matter — things like the power spectrum of density fluctuations, baryon acoustic oscillations, the CMB spectrum etc.? In the spirit of falsification, we can ask what the standard model predicts for these things; and the answer is: it does not make any definite prediction. The reason is that — to predict quantities like these — one needs first to specify the values of a set of additional parameters: things like the mean densities of dark and normal matter; the numbers that determine the spectrum of initial density fluctuations; etc. There are roughly half a dozen such “free parameters”. Cosmologists never even try to use data like these to falsify their theory; their goal is to make the theory work, and they do this by picking the parameter values that optimize the fit between theory and data.

Philosophers of science are quite familiar with this sort of thing, and they have a rule: “You can’t use the data twice.” You can’t use data to adjust the parameters of a theory, and then turn around and claim that those same data support the theory.  But this is exactly what cosmologists do when they argue that the existence of a “concordance model” implies that the standard cosmological model is correct. What “concordance” actually shows is that the standard model can be made consistent: i.e. that one does not require different values for the same parameter. Consistency is good, but by itself it is a very weak argument in favor of a theory’s correctness. Furthermore, as Stacy has emphasized, the supposed “concordance” vanishes when you look at the values of the same parameters as they are determined in other, independent ways. The apparent tension in the Hubble constant is just the latest example of this; another, long-standing example is the very different value for the mean baryon density implied by the observed lithium abundance. There are other examples. True “convergence” in the sense understood by the philosophers — confirmation of the value of a single parameter in multiple, independent experiments — is essentially lacking in cosmology.

Now, even though those half-dozen parameters give cosmologists a great deal of freedom to adjust their model and to fit the data, the freedom is not complete. This is because — when adjusting parameters — they fix certain things: what Imre Lakatos called the “hard core” of a research program: the assumptions that a theorist is absolutely unwilling to abandon, come hell or high water. In our case, the “hard core” includes Einstein’s theory of gravity, but it also includes a number of less-obvious things; for instance, the assumption that the dark matter responds to gravity in the same way as any collisionless fluid of normal matter would respond. (The latter assumption is not made in many alternative theories.) Because of the inflexibility of the “hard core”, there are going to be certain parameter values that are also more-or-less fixed by the data. When a cosmologist says “The third peak in the CMB requires dark matter”, what she is really saying is: “Assuming the fixed hard core, I find that any reasonable fit to the data requires the parameter defining the dark-matter density to be significantly greater than zero.” That is a much weaker statement than “Dark matter must exist”. Statements like “We know that dark matter exists” put me in mind of the 18th century chemists who said things like “Based on my combustion experiments, I conclude that phlogiston exists and that it has a negative mass”. We know now that the behavior the chemists were ascribing to the release of phlogiston was actually due to oxidation. But the “hard core” of their theory (“Combustibles contain an inflammable principle which they release upon burning”) forbade them from considering different models. It took Lavoisier’s arguments to finally convince them of the existence of oxygen.

The fact that the current cosmological model has a fixed “hard core” also implies that — in principle — it can be falsified. But, at the risk of being called a cynic, I have little doubt that if a new, falsifying observation should appear, even a very compelling one, the community will respond as it has so often in the past: via a conventionalist stratagem. Pavel Kroupa has a wonderful graphic, reproduced below, that shows just how often predictions of the standard cosmological model have been falsified — a couple of dozen times, according to latest count; and these are only the major instances. Historians and philosophers of science have documented that theories that evolve in this way often end up on the scrap heap. To the extent that my paper is of interest to the astronomical community, I hope that it gets people to thinking about whether the current cosmological model is headed in that direction.

Fig. 14 from Kroupa (2012) quantifying setbacks to the Standard Model of Cosmology (SMoC).



LCDM has met the enemy, and it is itself

LCDM has met the enemy, and it is itself

David Merritt recently published the article “Cosmology and convention” in Studies in History and Philosophy of Science. This article is remarkable in many respects. For starters, it is rare that a practicing scientist reads a paper on the philosophy of science, much less publishes one in a philosophy journal.

I was initially loathe to start reading this article, frankly for fear of boredom: me reading about cosmology and the philosophy of science is like coals to Newcastle. I could not have been more wrong. It is a genuine page turner that should be read by everyone interested in cosmology.

I have struggled for a long time with whether dark matter constitutes a falsifiable scientific hypothesis. It straddles the border: specific dark matter candidates (e.g., WIMPs) are confirmable – a laboratory detection is both possible and plausible – but the concept of dark matter can never be excluded. If we fail to find WIMPs in the range of mass-cross section parameters space where we expected them, we can change the prediction. This moving of the goal post has already happened repeatedly.

The cross-section vs. mass parameter space for WIMPs. The original, “natural” weak interaction cross-section (10-39) was excluded long ago, as were early attempts to map out the theoretically expected parameter space (upper pink region). Later predictions drifted to progressively lower cross-sections. These evaded experimental limits at the time, and confident predictions were made that the dark matter would be found.  More recent data show otherwise: the gray region is excluded by PandaX (2016). [This plot was generated with the help of DMTools hosted at Brown.]
I do not find it encouraging that the goal posts keep moving. This raises the question, how far can we go? Arbitrarily low cross-sections can be extracted from theory if we work at it hard enough. How hard should we work? That is, what criteria do we set whereby we decide the WIMP hypothesis is mistaken?

There has to be some criterion by which we would consider the WIMP hypothesis to be falsified. Without such a criterion, it does not satisfy the strictest definition of a scientific hypothesis. If at some point we fail to find WIMPs and are dissatisfied with the theoretical fine-tuning required to keep them hidden, we are free to invent some other dark matter candidate. No WIMPs? Must be axions. Not axions? Would you believe light dark matter? [Worst. Name. Ever.] And so on, ad infinitum. The concept of dark matter is not falsifiable, even if specific dark matter candidates are subject to being made to seem very unlikely (e.g., brown dwarfs).

Faced with this situation, we can consult the philosophy science. Merritt discusses how many of the essential tenets of modern cosmology follow from what Popper would term “conventionalist stratagems” – ways to dodge serious consideration that a treasured theory is threatened. I find this a compelling terminology, as it formalizes an attitude I have witnessed among scientists, especially cosmologists, many times. It was put more colloquially by J.K. Galbraith:

“Faced with the choice between changing one’s mind and proving that there is no need to do so, almost everybody gets busy on the proof.”

Boiled down (Keuth 2005), the conventionalist strategems Popper identifies are

  1. ad hoc hypotheses
  2. modification of ostensive definitions
  3. doubting the reliability of the experimenter
  4. doubting the acumen of the theorist

These are stratagems to be avoided according to Popper. At the least they are pitfalls to be aware of, but as Merritt discusses, modern cosmology has marched down exactly this path, doing each of these in turn.

The ad hoc hypotheses of ΛCDM are of course Λ and CDM. Faced with the observation of a metric that cannot be reconciled with the prior expectation of a decelerating expansion rate, we re-invoke Einstein’s greatest blunder, Λ. We even generalize the notion and give it a fancy new name, dark energy, which has the convenient property that it can fit any observed set of monotonic distance-redshift pairs. Faced with an excess of gravitational attraction over what can be explained by normal matter, we invoke non-baryonic dark matter: some novel form of mass that has no place in the standard model of particle physics, has yet to show any hint of itself in the laboratory, and cannot be decisively excluded by experiment.

We didn’t accept these ad hoc add-ons easily or overnight. Persuasive astronomical evidence drove us there, but all these data really show is that something dire is wrong: General Relativity plus known standard model particles cannot explain the universe. Λ and CDM are more a first guess than a final answer. They’ve been around long enough that they have become familiar, almost beyond doubt. Nevertheless, they remain unproven ad hoc hypotheses.

The sentiment that is often asserted is that cosmology works so well that dark matter and dark energy must exist. But a more conservative statement would be that our present understanding of cosmology is correct if and only if these dark entities exist. The onus is on us to detect dark matter particles in the laboratory.

That’s just the first conventionalist stratagem. I could given many examples of violations of the other three, just from my own experience. That would make for a very long post indeed.

Instead, you should go read Merritt’s paper. There are too many things there to discuss, at least in a single post. You’re best going to the source. Be prepared for some cognitive dissonance.


Crater 2: the Bullet Cluster of LCDM

Crater 2: the Bullet Cluster of LCDM

Recently I have been complaining about the low standards to which science has sunk. It has become normal to be surprised by an observation, express doubt about the data, blame the observers, slowly let it sink in, bicker and argue for a while, construct an unsatisfactory model that sort-of, kind-of explains the surprising data but not really, call it natural, then pretend like that’s what we expected all along. This has been going on for so long that younger scientists might be forgiven if they think this is how science is suppose to work. It is not.

At the root of the scientific method is hypothesis testing through prediction and subsequent observation. Ideally, the prediction comes before the experiment. The highest standard is a prediction made before the fact in ignorance of the ultimate result. This is incontrovertibly superior to post-hoc fits and hand-waving explanations: it is how we’re suppose to avoid playing favorites.

I predicted the velocity dispersion of Crater 2 in advance of the observation, for both ΛCDM and MOND. The prediction for MOND is reasonably straightforward. That for ΛCDM is fraught. There is no agreed method by which to do this, and it may be that the real prediction is that this sort of thing is not possible to predict.

The reason it is difficult to predict the velocity dispersions of specific, individual dwarf satellite galaxies in ΛCDM is that the stellar mass-halo mass relation must be strongly non-linear to reconcile the steep mass function of dark matter sub-halos with their small observed numbers. This is closely related to the M*-Mhalo relation found by abundance matching. The consequence is that the luminosity of dwarf satellites can change a lot for tiny changes in halo mass.

Fig. 11 from Tollerud et al. (2011, ApJ, 726, 108). The width of the bands illustrates the minimal scatter expected between dark halo and measurable properties. A dwarf of a given luminosity could reside in dark halos differing be two decades in mass, with a corresponding effect on the velocity dispersion.

Long story short, the nominal expectation for ΛCDM is a lot of scatter. Photometrically identical dwarfs can live in halos with very different velocity dispersions. The trend between mass, luminosity, and velocity dispersion is so weak that it might barely be perceptible. The photometric data should not be predictive of the velocity dispersion.

It is hard to get even a ballpark answer that doesn’t make reference to other measurements. Empirically, there is some correlation between size and velocity dispersion. This “predicts” σ = 17 km/s. That is not a true theoretical prediction; it is just the application of data to anticipate other data.

Abundance matching relations provide a highly uncertain estimate. The first time I tried to do this, I got unphysical answers (σ = 0.1 km/s, which is less than the stars alone would cause without dark matter – about 0.5 km/s). The application of abundance matching requires extrapolation of fits to data at high mass to very low mass. Extrapolating the M*-Mhalo relation over many decades in mass is very sensitive to the low mass slope of the fitted relation, so it depends on which one you pick.


Since my first pick did not work, lets go with the value suggested to me by James Bullock: σ = 11 km/s. That is the mid-value (the blue lines in the figure above); the true value could easily scatter higher or lower. Very hard to predict with any precision. But given the luminosity and size of Crater 2, we expect numbers like 11 or 17 km/s.

The measured velocity dispersion is σ = 2.7 ± 0.3 km/s.

This is incredibly low. Shockingly so, considering the enormous size of the system (1 kpc half light radius). The NFW halos predicted by ΛCDM don’t do that.

To illustrate how far off this is, I have adopted this figure from Boylan-Kolchin et al. (2012).

Fig. 1 of MNRAS, 422, 1203 illustrating the “too big to fail” problem: observed dwarfs have lower velocity dispersions than sub-halos that must exist and should host similar or even more luminous dwarfs that apparently do not exist. I have had to extend the range of the original graph to lower velocities in order to include Crater 2.

Basically, NFW halos, including the sub-halos imagined to host dwarf satellite galaxies, have rotation curves that rise rapidly and stay high in proportion to the cube root of the halo mass. This property makes it very challenging to explain a low velocity at a large radius: exactly the properties observed in Crater 2.

Lets not fail to appreciate how extremely wrong this is. The original version of the graph above stopped at 5 km/s. It didn’t extend to lower values because they were absurd. There was no reason to imagine that this would be possible. Indeed, the point of their paper was that the observed dwarf velocity dispersions were already too low. To get to lower velocity, you need an absurdly low mass sub-halo – around 107 M. In contrast, the usual inference of masses for sub-halos containing dwarfs of similar luminosity is around 109 Mto 1010 M. So the low observed velocity dispersion – especially at such a large radius – seems nigh on impossible.

More generally, there is no way in ΛCDM to predict the velocity dispersions of particular individual dwarfs. There is too much intrinsic scatter in the highly non-linear relation between luminosity and halo mass. Given the photometry, all we can say is “somewhere in this ballpark.” Making an object-specific prediction is impossible.

Except that it is possible. I did it. In advance.

The predicted velocity dispersion is σ = 2.1 +0.9/-0.6 km/s.

I’m an equal opportunity scientist. In addition to ΛCDM, I also considered MOND. The successful prediction is that of MOND. (The quoted uncertainty reflects the uncertainty in the stellar mass-to-light ratio.) The difference is that MOND makes a specific prediction for every individual object. And it comes true. Again.

MOND is a funny theory. The amplitude of the mass discrepancy it induces depends on how low the acceleration of a system is. If Crater 2 were off by itself in the middle of intergalactic space, MOND would predict it should have a velocity dispersion of about 4 km/s.

But Crater 2 is not isolated. It is close enough to the Milky Way that there is an additional, external acceleration imposed by the Milky Way. The net result is that the acceleration isn’t quite as low as it would be were Crater 2 al by its lonesome. Consequently, the predicted velocity dispersion is a measly 2 km/s. As observed.

In MOND, this is called the External Field Effect (EFE). Theoretically, the EFE is rather disturbing, as it breaks the Strong Equivalence Principle. In particular, Local Position Invariance in gravitational experiments is violated: the velocity dispersion of a dwarf satellite depends on whether it is isolated from its host or not. Weak equivalence (the universality of free fall) and the Einstein Equivalence Principle (which excludes gravitational experiments) may still hold.

We identified several pairs of photometrically identical dwarfs around Andromeda. Some are subject to the EFE while others are not. We see the predicted effect of the EFE: isolated dwarfs have higher velocity dispersions than their twins afflicted by the EFE.

If it is just a matter of sub-halo mass, the current location of the dwarf should not matter. The velocity dispersion certainly should not depend on the bizarre MOND criterion for whether a dwarf is affected by the EFE or not. It isn’t a simple distance-dependency. It depends on the ratio of internal to external acceleration. A relatively dense dwarf might still behave as an isolated system close to its host, while a really diffuse one might be affected by the EFE even when very remote.

When Crater 2 was first discovered, I ground through the math and tweeted the prediction. I didn’t want to write a paper for just one object. However, I eventually did so because I realized that Crater 2 is important as an extreme example of a dwarf so diffuse that it is affected by the EFE despite being very remote (120 kpc from the Milky Way). This is not easy to reproduce any other way. Indeed, MOND with the EFE is the only way that I am aware of whereby it is possible to predict, in advance, the velocity dispersion of this particular dwarf.

If I put my ΛCDM hat back on, it gives me pause that any method can make this prediction. As discussed above, this shouldn’t be possible. There is too much intrinsic scatter in the halo mass-luminosity relation.

If we cook up an explanation for the radial acceleration relation, we still can’t make this prediction. The RAR fit we obtained empirically predicts 4 km/s. This is indistinguishable from MOND for isolated objects. But the RAR itself is just an empirical law – it provides no reason to expect deviations, nor how to predict them. MOND does both, does it right, and has done so before, repeatedly. In contrast, the acceleration of Crater 2 is below the minimum allowed in ΛCDM according to Navarro et al.

For these reasons I consider Crater 2 to be the bullet cluster of ΛCDM. Just as the bullet cluster seems like a straight-up contradiction to MOND, so too does Crater 2 for ΛCDM. It is something ΛCDM really can’t do. The difference is that you can just look at the bullet cluster. With Crater 2 you actually have to understand MOND as well as ΛCDM, and think it through.

So what can we do to save ΛCDM?

Whatever it takes, per usual.

One possibility is that Crater II may represent the “bright” tip of the extremely low surface brightness “stealth” fossils predicted by Bovill & Ricotti. Their predictions are encouraging for getting the size and surface brightness in the right ballpark. But I see no reason in this context to expect such a low velocity dispersion. They anticipate dispersions consistent with the ΛCDM discussion above, and correspondingly high mass-to-light ratios that are greater than observed for Crater 2 (M/L ≈ 104 rather than ~50).

plausible suggestion I heard was from James Bullock. While noting that reionization should preclude the existence of galaxies in halos below 5 km/s, as we need for Crater 2, he suggested that tidal stripping could reduce an initially larger sub-halo to this point. I am dubious about this, as my impression from the simulations of Penarrubia  was that the outer regions of the sub-halo were stripped first while leaving the inner regions (where the NFW cusp predicts high velocity dispersions) largely intact until near complete dissolution. In this context, it is important to bear in mind that the low velocity dispersion of Crater 2 is observed at large radii (1 kpc, not tens of pc). Still, I can imagine ways in which this might be made to work in this particular case, depending on its orbit. Tony Sohn has an HST program to measure the proper motion; this should constrain whether the object has ever passed close enough to the center of the Milky Way to have been tidally disrupted.

Josh Bland-Hawthorn pointed out to me that he made simulations that suggest a halo with a mass as low as 107 Mcould make stars before reionization and retain them. This contradicts much of the conventional wisdom outlined above because they find a much lower (and in my opinion, more realistic) feedback efficiency for supernova feedback than assumed in most other simulations. If this is correct (as it may well be!) then it might explain Crater 2, but it would wreck all the feedback-based explanations given for all sorts of other things in ΛCDM, like the missing satellite problem and the cusp-core problem. We can’t have it both ways.

Without super-efficient supernova feedback, the Local Group would be filled with a million billion ultrafaint dwarf galaxies!

I’m sure people will come up with other clever ideas. These will inevitably be ad hoc suggestions cooked up in response to a previously inconceivable situation. This will ring hollow to me until we explain why MOND can predict anything right at all.

In the case of Crater 2, it isn’t just a matter of retrospectively explaining the radial acceleration relation. One also has to explain why exceptions to the RAR occur following the very specific, bizarre, and unique EFE formulation of MOND. If I could do that, I would have done so a long time ago.

No matter what we come up with, the best we can hope to do is a post facto explanation of something that MOND predicted correctly in advance. Can that be satisfactory?