Research Highlights

Novel Methods in Machine Learning and Statistics for Challenges in Cosmology:

Towards an Optimal Estimation of Cosmological Parameters with the Wavelet Scattering Transform: Optimal extraction of the non-Gaussian information encoded in the Large-Scale Structure (LSS) of the universe lies at the forefront of modern precision cosmology. In this work, we propose achieving this task with the Wavelet Scattering Transform (WST), which subjects an input field to successive wavelet convolutions and modulus non-linearities, producing a set of WST coefficients that are sensitive to non-Gaussianity in spatial density distributions. In order to assess its applicability in the context of LSS surveys, we apply the WST to the 3D overdensity field obtained from the Quijote simulations. We find a large improvement in the marginalized errors on all cosmological parameters, ranging from 1.2 to 4 times tighter than the corresponding errors obtained from the regular 3D cold dark matter + baryon power spectrum, as well as a 50% improvement over the neutrino mass constraint given by the marked power spectrum. Through this first application to 3D cosmological fields, we demonstrate the great promise held by this statistic and set the stage for its future application to actual galaxy observations.
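To illustrate the mechanics behind the WST coefficients, here is a minimal 2D sketch in which simple oriented Gaussian band-pass filters stand in for the Morlet wavelets used in practice; the filter parameters and grid size are illustrative, not those of our analysis:

```python
import numpy as np

def morlet_like_filters(n, J, L):
    """Oriented Gaussian band-pass filters in Fourier space (a stand-in for
    the Morlet wavelets of the WST literature; parameters are illustrative)."""
    ky, kx = np.meshgrid(np.fft.fftfreq(n), np.fft.fftfreq(n), indexing="ij")
    filters = {}
    for j in range(J):
        k0 = 0.25 / 2**j                  # central frequency halves per scale
        sigma = 0.6 * k0                  # bandwidth tied to the scale
        for l in range(L):
            theta = np.pi * l / L
            k_par = kx * np.cos(theta) + ky * np.sin(theta)
            k_perp = -kx * np.sin(theta) + ky * np.cos(theta)
            filters[(j, l)] = np.exp(-((k_par - k0)**2 + k_perp**2) / (2 * sigma**2))
    return filters

def scattering_coefficients(field, J=3, L=4):
    """Zeroth-, first-, and second-order WST coefficients of a 2D field:
    each order applies a wavelet convolution, a modulus, and a spatial average."""
    filters = morlet_like_filters(field.shape[0], J, L)
    f_hat = np.fft.fft2(field)
    s0 = np.abs(field).mean()
    u1, s1 = {}, {}
    for key, psi in filters.items():
        u1[key] = np.abs(np.fft.ifft2(f_hat * psi))    # modulus of wavelet convolution
        s1[key] = u1[key].mean()                       # first-order coefficient
    s2 = {}
    for (j1, l1), u in u1.items():
        u_hat = np.fft.fft2(u)
        for (j2, l2), psi in filters.items():
            if j2 > j1:                                # only coarser second scales
                s2[(j1, l1, j2, l2)] = np.abs(np.fft.ifft2(u_hat * psi)).mean()
    return s0, s1, s2
```

The modulus after each convolution is what makes the coefficients sensitive to non-Gaussianity: for a Gaussian field the second-order coefficients carry no information beyond the power spectrum.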

Precise and Accurate Cosmology with CMBxLSS Power Spectra and Bispectra: With the advent of a new generation of cosmological experiments, it is of paramount importance to exploit the full potential of joint analyses of multiple cosmological probes. In this work with my PhD student Shu-Fan Chen and my postdoc Hayden Lee, we study the cosmological information content contained in the one-loop power spectra and tree bispectra of galaxies cross-correlated with CMB lensing. We use the FFTLog method to compute angular correlations in spherical harmonic space, applicable for wide angles that can be accessed by forthcoming galaxy surveys, going beyond the usual Limber approximation. We find that adding the bispectra and cross-correlations with CMB lensing offers a significant improvement in parameter constraints, including those on the total neutrino mass, Mν, and local non-Gaussianity amplitude, fNL.  In particular, our results suggest that the combination of the Vera Rubin Observatory's Legacy Survey of Space and Time (LSST) and CMB-S4 will be able to achieve σ(Mν)=42 meV from galaxy and CMB lensing correlations, and σ(Mν)=12 meV when further combined with the CMB temperature and polarization data, making it possible to distinguish between neutrino hierarchies without any prior on the optical depth. 

Efficient method for computing cosmological four-point correlations: Angular cosmological correlators are infamously difficult to compute. Together with my postdoc Hayden Lee, I introduced a method to compute, quickly and reliably, the angular galaxy trispectrum at tree level, with and without primordial non-Gaussianity, as well as the non-Gaussian covariance of the angular matter power spectrum, beyond the Limber approximation, by applying the FFTLog algorithm. In the era of high-precision cosmology and large data sets, it is imperative to build efficient algorithms for calculating and estimating cosmological observables.
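The core trick of the FFTLog algorithm can be sketched in a few lines: a spectrum sampled on a log-uniform grid is decomposed into complex power laws, each of which then admits a closed-form angular projection integral. A minimal sketch of the decomposition step (with a toy spectrum and an illustrative tilt parameter):

```python
import numpy as np

def fftlog_decompose(k, pk, nu=-0.3):
    """Decompose P(k), sampled on a log-uniform k grid, into complex power laws:
    P(k) = sum_m c_m k^nu (k/k[0])^(i eta_m).
    Each power-law term admits a closed-form projection integral, which is what
    makes the FFTLog approach to angular statistics fast. nu is a tilt chosen
    for convergence (illustrative value here)."""
    n = len(k)
    dlnk = np.log(k[1] / k[0])
    c_m = np.fft.fft(pk * k**(-nu)) / n             # Fourier coefficients in ln k
    eta_m = 2.0 * np.pi * np.fft.fftfreq(n) / dlnk  # frequencies conjugate to ln k
    return c_m, eta_m

def fftlog_reconstruct(k, c_m, eta_m, nu=-0.3):
    """Sum the complex power laws back up; exact at the sample points."""
    phases = (k[:, None] / k[0])**(1j * eta_m[None, :])
    return np.real(phases @ c_m) * k**nu
```

In the actual computation, `fftlog_reconstruct` is never needed: each power-law term is instead projected analytically, turning nested radial integrals into sums of known special functions.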

Flow-based likelihoods for Non-Gaussian inference: In this work with my PhD student Ana Diaz Rivero, we suggest using a data-driven likelihood that we call the flow-based likelihood (FBL) to deal with known (or suspected) non-Gaussianities (NG) in data sets. FBLs are the optimization targets of flow-based generative models, a class of models that can capture complex distributions by transforming a simple base distribution through layers of nonlinearities. We point out that this is more accurate than other methods previously used to deal with NG in data sets. We apply FBLs to mock weak lensing convergence power spectra and find that the FBL captures the NG signatures in the data extremely well, while other commonly used data-driven likelihoods, such as Gaussian mixture models and independent component analysis, fail to do so. Unlike other methods, the flexibility of the FBL makes it successful at tackling different types of NG simultaneously. Because of this, and consequently their likely applicability across data sets and domains, we encourage their use for inference whenever sufficient mock data are available for training.
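The change-of-variables identity underlying an FBL can be sketched with a single fixed invertible layer; this toy elementwise arcsinh flow with hand-picked parameters is illustrative, not a trained model:

```python
import numpy as np

def flow_forward(x, scale, shift):
    """One invertible elementwise layer: z = arcsinh(scale*x + shift).
    Returns z and log|det| of the Jacobian, summed over the last axis.
    (arcsinh is a convenient toy nonlinearity that can absorb heavy tails.)"""
    u = scale * x + shift
    z = np.arcsinh(u)
    log_det = np.sum(np.log(np.abs(scale) / np.sqrt(1.0 + u**2)), axis=-1)
    return z, log_det

def flow_log_likelihood(x, scale, shift):
    """The flow-based likelihood is the change-of-variables formula:
    log p(x) = log N(z; 0, I) + log|det dz/dx|."""
    z, log_det = flow_forward(x, scale, shift)
    log_base = -0.5 * np.sum(z**2 + np.log(2.0 * np.pi), axis=-1)
    return log_base + log_det
```

A real FBL stacks many such layers (e.g. coupling transforms) and fits their parameters by maximizing this log-likelihood over mock data; the key point is that the resulting density is exact and normalized by construction.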

Detecting Subhalos in Strong Gravitational Lens Images with Image Segmentation: In these two works with my group, we use machine learning to circumvent the need for lens and source modeling and develop a method to both locate subhalos in an image and determine their masses, using the technique of image segmentation. The network is trained on images with a single subhalo located near the Einstein ring. Training in this way allows the network to learn the gravitational lensing of light, and it is then able to accurately detect entire populations of substructure, even far from the Einstein ring. With good accuracy and a low false-positive rate, counting the number of pixels assigned to each subhalo class over multiple images allows for a measurement of the subhalo mass function (SMF). When measured over five mass bins from 10^8 to 10^10 Msun, the SMF slope is recovered with an error of 14.2 (16.3)% for 10 images, and this improves to 2.1 (2.6)% for 1000 images without (with HST-like) noise.
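The pixel-counting step that turns segmentation maps into an SMF slope can be sketched as follows; the pixels-per-subhalo conversion factor and mass bins are illustrative placeholders, not the values used in our papers:

```python
import numpy as np

def smf_slope(seg_maps, bin_centers, pixels_per_subhalo):
    """Turn per-pixel mass-bin labels into an SMF slope estimate.
    seg_maps: iterable of integer label maps, 0 = background, b = mass bin b.
    pixels_per_subhalo: assumed average number of pixels one subhalo occupies
    (a hypothetical calibration constant for this sketch)."""
    bin_centers = np.asarray(bin_centers)
    n_bins = len(bin_centers)
    counts = np.zeros(n_bins)
    for seg in seg_maps:
        for b in range(n_bins):
            counts[b] += np.count_nonzero(seg == b + 1)
    n_sub = counts / pixels_per_subhalo            # pixel totals -> subhalo counts
    good = n_sub > 0
    # the SMF is close to a power law dN/dM ~ M^alpha: fit the slope in log-log
    alpha = np.polyfit(np.log10(bin_centers[good]), np.log10(n_sub[good]), 1)[0]
    return alpha, n_sub
```

Summing pixels over many images is what beats down the per-image scatter, which is why the slope error shrinks so quickly with the number of images.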

First attempt to infer the presence of dark matter substructure in strong lens images with a binary classifier, without any intermediate lens modeling, using a Convolutional Neural Network (CNN): Together with my Ph.D. student Ana Diaz Rivero, we trained a CNN to classify images based on whether or not they have detectable substructure. Tens of thousands of new lenses are expected to become available in the near future. The fast approach to analyzing strong lens images proposed in this work is well suited to this era of large data sets.

A novel technique for Cosmic Microwave Background foreground subtraction: Together with former undergraduate students at Harvard, Sebastian Wagner-Carena and Max Hopkins, and current Ph.D. student Ana Diaz Rivero, we introduced a Bayesian hierarchical framework for source separation. We find improved performance of our algorithm when compared to state-of-the-art Internal Linear Combination (ILC)-type algorithms under various metrics: the root mean square error of the residual between the reconstructed CMB and the input CMB maps, the cross-power spectrum between the residual map and the foregrounds, and the difference between the power spectra of the input CMB and the reconstructed CMB. Our results open a new avenue for constructing CMB maps through Bayesian hierarchical analysis. This algorithm was built to tackle one of the principal challenges in precision measurements of the ISW signal, gravitational lensing, primordial non-Gaussianity, and constraints on isotropy, among others.
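For reference, the ILC baseline we compare against admits a compact closed form; this is a minimal sketch of that standard method (our Bayesian hierarchical framework itself is more involved), with toy frequency scalings:

```python
import numpy as np

def ilc_weights(cov, mixing=None):
    """Standard ILC: minimum-variance weights with unit response to the CMB,
    w = C^{-1} a / (a^T C^{-1} a), where a is the CMB mixing vector
    (all ones in thermodynamic temperature units)."""
    a = np.ones(cov.shape[0]) if mixing is None else mixing
    cinv_a = np.linalg.solve(cov, a)
    return cinv_a / (a @ cinv_a)

def ilc_clean(maps):
    """maps: (n_freq, n_pix). Estimate the covariance across pixels empirically,
    then combine the frequency maps with the ILC weights."""
    w = ilc_weights(np.cov(maps))
    return w @ maps
```

The unit-response constraint preserves the CMB exactly while the variance minimization suppresses any component whose frequency scaling differs from the CMB's, which is the behavior our metrics above benchmark against.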

Dark Matter, Light Relics, and Neutrinos:

  Some of my work to shed light on the dark sector includes:

Finding eV-scale Light Relics with Cosmological Observables: Relics with masses on the eV scale (meV-10 eV) become non-relativistic before today, and thus behave as matter instead of radiation. This leaves an imprint in the clustering of the large-scale structure of the universe, as light relics have important streaming motions. In recent work with my group, we studied how well current and upcoming cosmological surveys can probe light massive relics (LiMRs). We considered minimal extensions to the Standard Model (SM) by both fermionic and bosonic relic degrees of freedom. We found that a very large coverage of parameter space will be attainable by upcoming CMB and LSS experiments, opening the possibility of exploring uncharted territory for new physics beyond the SM.
  In this work, we present the first general search for LiMRs with CMB, weak-lensing, and full-shape galaxy data. We demonstrate that weak-lensing data are critical for breaking parameter degeneracies, while full-shape information provides a significant boost in constraining power. Our constraints are the tightest and most comprehensive to date for scalars, Weyl fermions, Dirac fermions, and vectors.

Accurately Weighting Neutrinos with Cosmological Surveys: In this work with my group, we found, through an MCMC likelihood analysis of future CMB and LSS data sets, that upcoming surveys will be able to distinguish between neutrino hierarchies at the 1-sigma level. We further found that neglecting the growth-induced scale-dependent bias of halos produced by neutrino mass (studied in Muñoz & Dvorkin, 2018) can induce up to a 1-sigma overestimation of the total neutrino mass. We showed how to absorb this effect via a redshift-dependent parametrization of the scale-independent bias. To facilitate future data analyses, we released RelicCLASS: a publicly available code to compute CMB and LSS observables in the presence of massive neutrinos or any light relic.

Line-of-sight halo contribution to the dark matter convergence power spectrum from strong gravitational lenses: In this work with my group, we studied a novel observable: the contribution of line-of-sight (LOS) halos to the convergence power spectrum. We showed that it is possible to define an effective convergence for multi-plane lensing systems with a dominant main lens coupled to lower-mass interlopers, and we tested our analytical results with mock lensing simulations obtained by ray tracing with the multi-lens-plane equation, finding excellent agreement. We find that the LOS halo contribution can be significantly larger than that from subhalos for many of the well-known systems in the literature (see here for an interactive version with different lens and source redshifts, as well as different dark matter (DM) masses in substructure). Since the halo mass function is better understood from first principles, the dominance of interlopers in galaxy-galaxy lenses can be seen as a significant advantage when translating this observable into a constraint on DM. Furthermore, it is crucial to take the LOS contribution into account before making any claim about DM; otherwise, we risk wrongly falsifying or reinforcing the standard LCDM scenario.

Model-agnostic probe of dark matter at small scales using 21-cm data: In this work with my group, we studied how upcoming 21-cm measurements during cosmic dawn provide a powerful handle on the small-scale structure of our universe. Using both the 21-cm global signal and its fluctuations, we performed a principal component (PC) analysis to obtain model-agnostic constraints on the matter power spectrum, showing that these measurements are mostly sensitive to wavenumbers k ~ 40-80 Mpc^-1, which are currently unobserved scales. We found that the 21-cm global signal allows us to measure 2 PCs with signal-to-noise ratios larger than five. The 21-cm fluctuations, on the other hand, allow for 3, 4, and 5 PCs to be measured under the assumption of pessimistic, moderate, and optimistic foregrounds, respectively. We projected several non-CDM models onto our PCs, finding that the 21-cm signal during cosmic dawn can improve the constraints on all of these models over other current cosmic probes, such as the Lyman-alpha forest.
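The principal-component step can be sketched as follows: the PCs of the band powers are the eigenvectors of the Fisher matrix, and the eigenvalues give each component's squared signal-to-noise. The toy Fisher matrix below is built from mock derivatives, not the 21-cm observables of the actual analysis:

```python
import numpy as np

def fisher_from_derivatives(deriv, noise_var):
    """F = D^T N^{-1} D for observables linear(ized) in the band powers.
    deriv: (n_obs, n_bands) derivatives of the observables; noise_var: (n_obs,)."""
    return deriv.T @ (deriv / noise_var[:, None])

def power_spectrum_pcs(fisher):
    """Principal components of the band powers: eigenvectors of the Fisher
    matrix, ordered so the best-measured mode comes first. The square roots
    of the eigenvalues are the S/N of each component."""
    vals, vecs = np.linalg.eigh(fisher)
    order = np.argsort(vals)[::-1]
    return np.sqrt(np.clip(vals[order], 0.0, None)), vecs[:, order]
```

Counting how many returned S/N values exceed a threshold (e.g. five) is exactly how one states, model-agnostically, how many independent features of the matter power spectrum a survey can measure.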

New Formalism for dark matter (DM) substructure statistics proposed to discern among different DM scenarios: in recent work with my group (published in PRD), I developed a general formalism to compute from first principles the projected mass density (convergence) power spectrum of the substructure in galactic halos under different populations of dark matter subhalos. We constructed a halo model-based formalism, computing the 1-subhalo and the 2-subhalo terms from first principles for the first time. We found that the asymptotic slope of the substructure power spectrum at large wavenumber reflects the internal density profile of the subhalos, and proposed this as a key observable to discern between different dark matter scenarios. 

  In subsequent work (published in PRD), we applied our formalism to N-body simulations and found 
  agreement with our predictions. Furthermore, we found that even at lower wavenumbers we can 
  gain important information about dark matter. Comparing the amplitude and slope of the power 
  spectrum on scales in the reach of current observations from lenses at different redshifts can help 
  us distinguish between cold dark matter and other dark matter scenarios.

New channel for DM Freeze-in: together with Katelin Schutz and Tongyan Lin, I identified an additional  production channel for DM produced through the freeze-in mechanism: the decay of photons that acquire an in-medium plasma mass (published in PRD). These plasmon decays are a dominant channel for DM production for sub-MeV DM masses, and including this channel leads to a significant reduction in the predicted signal strength for DM searches. The DM acquires a highly non-thermal phase space distribution, which impacts the cosmology at later times. This work was an Editors’ Suggestion in PRD.

The Cosmology of sub-MeV Dark Matter Freeze-In: In our previous work, we realized that dark matter could be made from decaying photons in the early universe. This mechanism sits at the nexus of several interesting properties: it is one of very few allowed ways of making dark matter lighter than electrons from thermal processes, it is the simplest allowed way to make dark matter with a small effective electric charge from a thermal process, and it is also testable in the laboratory with proposed experiments.
   In this work, we tested with cosmological observables whether or not this is the dark matter of our universe. If dark matter interacts with photons through its production mechanism, this also implies that dark matter and baryonic matter can interact at a small level. In this case, the photons can drag the dark matter (via the baryons), which makes our universe look less clumpy. We tested this idea using CMB data from the Planck satellite. Because the dark matter is made from decaying photons, it is born relativistic. By looking at the statistics of dark matter clumps, we can infer whether or not their growth was hindered. We looked for small galaxies with the DES survey, and we also used gravitational lensing and stellar streams. We tested a wide range of parameter space and have so far excluded part of it. In the near future, we expect to be able to test this idea over an even wider range of parameters.

Strongest constraints to date on dark matter-baryon scattering, implying that a baryon in the halo of a galaxy like the Milky Way has not scattered off DM particles in the history of the universe: The shrinking of the canonical-WIMP parameter space from null LHC and direct-detection searches, as well as possible difficulties for collisionless N-body simulations in reproducing observational data, provides motivation to consider stronger baryon-DM interactions.
  In work in collaboration with Kfir Blum and Marc Kamionkowski (published in PRD), I derived the 
  strongest constraints to date on elastic scattering between baryons and DM for a wide range of 
  velocity-dependent cross sections, using measurements of the CMB fluctuations by the Planck 
  satellite and Lyman-alpha flux power spectrum measurements from the Sloan Digital Sky Survey 
  (SDSS). These constraints imply, model-independently, that a baryon in the halo of a galaxy like 
  our Milky Way cannot scatter from DM particles in the history of the universe.
  Recently, the EDGES collaboration claimed a detection of neutral hydrogen in the early universe. Their measurement, taken at face value, disagrees with the standard prediction for the temperature of hydrogen gas: the gas appears colder than expected. Our work inspired a paper published in Nature as a potential explanation of the claimed observation, as well as a number of other papers.

Probing sub-GeV DM with cosmology (complementary to direct detection searches): In recent work with my group (published in PRD), I showed that cosmology is complementary to dark matter direct detection experiments for DM masses below a GeV. We analyzed CMB data from Planck and Lyman-alpha forest data from the SDSS in the context of the sub-GeV DM scenario. Our analysis is particularly interesting given that lighter DM masses have remained unexplored by current direct detection experiments. Our paper captured attention from the community, since our constraints rule out the possibility of DM-baryon scattering explaining the EDGES claimed detection of neutral hydrogen. Furthermore, these scenarios are now being considered as one of the main drivers of the dark matter science case for the proposed next-generation CMB experiment, CMB-S4.

Scale-dependent galaxy bias induced by light relics: With my postdoc Julian Muñoz, I computed the scale-dependent galaxy bias induced by light relics (not limited to neutrinos) of different masses, spins, and temperatures. We also made publicly available a code ("RelicFast") that efficiently computes the galaxy bias in under a second, allowing for this effect to be properly included in likelihood analyses of different cosmologies with light relics, at little computational cost.

I led the Neutrino Mass from Cosmology paper submitted to the US Decadal Survey, where I 
  argued that our understanding of the clustering of matter in the presence of massive neutrinos has 
  significantly improved over the past decade, yielding cosmological constraints that are tighter than 
  any laboratory experiment, and which will improve significantly over the next decade, resulting in a 
  guaranteed detection of the absolute neutrino mass scale.


Capturing non-Gaussianity from the large-scale structure with weighted skew-spectra: To date, most of the cosmological information from the large-scale structure of the universe has been extracted from 2-point clustering statistics. It is well known that there is a wealth of information in higher-order statistics, but extracting it is more challenging than from the power spectrum due to the significant computational cost. In work with my group, I used cross-spectra of the galaxy density field and weighted quadratic fields ("weighted skew-spectra") as an estimator for the galaxy bispectrum, and showed that the skew-spectra statistics can recover the predictions from the bispectrum (both the primordial one and that from gravitational evolution). Computationally, evaluation of the skew-spectra is equivalent to power spectrum estimation: it can be computed with O(N log N) operations, where N is the number of modes, as opposed to the O(N^2) operations typically required for the bispectrum, making this estimator significantly more efficient.
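The O(N log N) scaling follows because the quadratic field is built in real space and then cross-correlated with FFTs, exactly like a power spectrum. A minimal sketch for the simplest possible weight, where the quadratic field is just the squared overdensity (the actual estimator uses several derivative weightings, and the normalization conventions here are illustrative):

```python
import numpy as np

def skew_spectrum(delta, box_size, n_bins=12):
    """Simplest weighted skew-spectrum: cross-spectrum of delta^2 with delta,
    an integral over the bispectrum. delta: cubic 3D overdensity grid."""
    n = delta.shape[0]
    dk = 2.0 * np.pi / box_size
    d = delta - delta.mean()              # overdensity should be zero-mean
    q = d**2
    q = q - q.mean()                      # remove the k=0 piece of the quadratic field
    cross = (np.fft.fftn(q) * np.conj(np.fft.fftn(d))).real
    # radial binning in |k|
    freq = np.fft.fftfreq(n, d=1.0 / n) * dk
    kx, ky, kz = np.meshgrid(freq, freq, freq, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2 + kz**2)
    edges = np.linspace(0.0, kmag.max(), n_bins + 1)
    which = np.clip(np.digitize(kmag.ravel(), edges) - 1, 0, n_bins - 1)
    spec = np.zeros(n_bins)
    hits = np.zeros(n_bins)
    np.add.at(spec, which, cross.ravel())
    np.add.at(hits, which, 1.0)
    return edges, spec / np.maximum(hits, 1.0)
```

Only three FFTs and a radial binning are needed, which is why the cost matches a power spectrum estimate regardless of how many triangle configurations the underlying bispectrum has.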

Imprints of massive spinning particles in the large-scale structure of the universe: In work with my group (published in JCAP), I presented a theoretical template for the bispectrum generated by massive spinning particles in the early universe, valid for a general triangle configuration of momenta, when the approximate conformal symmetry of the inflationary background is broken.
  We investigated the prospects of measuring these signals with upcoming galaxy surveys, and our results suggest that two next-generation spectroscopic galaxy surveys, DESI and EUCLID, could be sensitive to the effects of massive particles with non-zero spin.

Gravitational waves (likelihood analysis of BICEP/Planck data): In 2015, I joined the joint analysis between the BICEP2, Keck Array, and Planck collaborations. I worked on the likelihood analysis of a multi-component model that included galactic foregrounds and a possible contribution from inflationary gravitational waves. The code that I wrote was made publicly available and has been extensively used by the community. We reported no statistically significant evidence for primordial gravitational waves and strong evidence for galactic dust (published in PRL). My code was used in subsequent BICEP/Keck collaboration papers.

Formalism for model-independent tests of slow-roll inflation: in a series of papers in collaboration with Wayne Hu (“Generalized slow roll approximation for large power spectrum features” (PRD), “CMB Constraints on Principal Components of the Inflation Potential” (PRD) and “Complete WMAP Constraints on Bandlimited Inflationary Features” (PRD)),  I developed a formalism, known as “Generalized Slow Roll", to test the hypotheses of slow-roll and single-field inflation in a general and model-independent way. 
  This framework was used to map constraints from the CMB onto constraints on the shape of the 
  inflationary potential beyond any specific model of inflation. It has been used and extended by 
  many groups. 
  The CMB-S4 experiment will use this formalism as its main way of probing features in the 
  inflationary potential. 

Formalism for the bispectrum of inflationary models with features: I extended, together with collaborators, the “Generalized Slow Roll” formalism to the bispectrum in a series of papers (Fast Computation (PRD) and Non-Gaussianity from step features (PRD)).
  The Planck collaboration used our formalism to look for inflationary features and, more generally, it 
  has been widely implemented to study different inflationary scenarios in the literature.

Fundamental physics from large-scale CMB E-modes: I proposed for the first time that CMB 
  polarization data from the Planck satellite has the statistical power to either confirm or rule out 
  models that attempt to explain large-scale temperature anomalies.
  I also showed that the large-scale CMB polarization signal from reionization can be a source of 
  confusion with inflationary features. 
  These consistency checks were carried out by the Planck collaboration and other groups.

Epoch of Reionization:

  Since the beginning of my career, I have maintained an interest in the epoch of reionization.

High-redshift ionization preferred by Planck data: The usual imposition of a step-like ionization history requires the optical depth to reionization to come mainly from low redshifts. Together with my graduate student Georges Obied and our collaborators, we relaxed this assumption and found that in the Planck 2015 data there is a preference for a component of high-redshift (z>10) ionization (early stars), in contradiction with claims made by the Planck collaboration. We found that marginalizing over inflationary freedom does not weaken the preference for z>10 ionization. These findings prompted the Planck collaboration to revise their standard way of analyzing the reionization history and opened up an ongoing debate in the community.

New CMB B-mode contribution from patchy reionization: I showed that existing calculations of the B-mode polarization power spectrum from reionization were incomplete by finding an additional source of B-modes. These B-modes have since been sought in simulations by many groups.

Statistical technique for extracting the patchy reionization signal from CMB measurements: I developed a new statistical technique for extracting the inhomogeneous reionization signal from measurements of the CMB polarization. In this method, a quadratic combination of the E-mode and B-mode polarization fields is used to reconstruct a map of fluctuations in the CMB optical depth. This statistical technique has been widely used by the community, and it is one of the main ways in which CMB-S4 is planning to extract the inhomogeneous reionization signal from measurements of the CMB E-mode and B-mode polarization fields.
  I showed that the cross-correlation of this optical depth estimator with the 21-cm field is sensitive to 
  the detailed physics of reionization, and can be measured with upcoming radio interferometers and 
  CMB experiments (published in ApJ).