MilesCranmer committed on
Commit
dc2bfcc
1 Parent(s): 03b278d

Add more paper examples

Files changed (1)
  1. docs/papers.yml +67 -30
docs/papers.yml CHANGED
@@ -1,8 +1,24 @@
1
  # This file stores papers which have used PySR, with
2
  # information to generate the "Research Showcase"
3
 
4
- # The order here is in terms of date. New papers should be added at the top.
5
  papers:
6
  - title: Machine Learning the Gravity Equation for International Trade
7
  authors:
8
  - Sergiy Verstyuk (1)
@@ -38,25 +54,6 @@ papers:
38
  abstract: We present a simple phenomenological formula which approximates the hyperbolic volume of a knot using only a single evaluation of its Jones polynomial at a root of unity. The average error is just 2.86% on the first 1.7 million knots, which represents a large improvement over previous formulas of this kind. To find the approximation formula, we use layer-wise relevance propagation to reverse engineer a black box neural network which achieves a similar average error for the same approximation task when trained on 10% of the total dataset. The particular roots of unity which appear in our analysis cannot be written as e^{2πi/(k+2)} with integer k; therefore, the relevant Jones polynomial evaluations are not given by unknot-normalized expectation values of Wilson loop operators in conventional SU(2) Chern-Simons theory with level k. Instead, they correspond to an analytic continuation of such expectation values to fractional level. We briefly review the continuation procedure and comment on the presence of certain Lefschetz thimbles, to which our approximation formula is sensitive, in the analytically continued Chern-Simons integration cycle.
39
  image: hyperbolic_volume.png
40
  date: 2021-06-07
41
-
42
- # Modeling the galaxy-halo connection with machine learning
43
- # Ana Maria Delgado, 1
44
- # Digvijay Wadekar, 2 3
45
- # Boryana Hadzhiyska,1
46
- # Sownak Bose,1 7
47
- # Lars Hernquist,1
48
- # Shirley Ho 2 4 5 6
49
- # 1Center for Astrophysics | Harvard & Smithsonian, 60 Garden Street, Cambridge, MA 02138, USA
50
- # 2Center for Cosmology and Particle Physics, Department of Physics, New York University, New York, NY 10003, USA
51
- # 3School of Natural Sciences, Institute for Advanced Study, Princeton, NJ 08540, USA
52
- # 4Center for Computational Astrophysics, Flatiron Institute, 162 5th Ave, New York, NY 10010, USA
53
- # 5Department of Astrophysical Sciences, Princeton University, Peyton Hall, Princeton NJ 08544-0010, USA
54
- # 6Department of Physics, Carnegie Mellon University, Pittsburgh, PA 15217, USA
55
- # 7Institute for Computational Cosmology, Department of Physics, Durham University, Durham DH1 3LE, UK
56
- # https://arxiv.org/abs/2111.02422v1
57
- # To extract information from the clustering of galaxies on non-linear scales, we need to model the connection between galaxies and halos accurately and in a flexible manner. Standard halo occupation distribution (HOD) models make the assumption that the galaxy occupation in a halo is a function of only its mass, however, in reality, the occupation can depend on various other parameters including halo concentration, assembly history, environment, spin, etc. Using the IllustrisTNG hydrodynamic simulation as our target, we show that machine learning tools can be used to capture this high-dimensional dependence and provide more accurate galaxy occupation models. Specifically, we use a random forest regressor to identify which secondary halo parameters best model the galaxy-halo connection and symbolic regression to augment the standard HOD model with simple equations capturing the dependence on those parameters, namely the local environmental overdensity and shear, at the location of a halo. This not only provides insights into the galaxy-formation relationship but, more importantly, improves the clustering statistics of the modeled galaxies significantly. Our approach demonstrates that machine learning tools can help us better understand and model the galaxy-halo connection, and are therefore useful for galaxy formation and cosmology studies from upcoming galaxy surveys.
58
- # hod_importances.png
59
- # 3 Nov 2021
60
  - title: Modeling the galaxy-halo connection with machine learning
61
  authors:
62
  - Ana Maria Delgado (1)
@@ -66,7 +63,7 @@ papers:
66
  - Lars Hernquist (1)
67
  - Shirley Ho (2,4,5,6)
68
  affiliations:
69
- 1: Center for Astrophysics | Harvard & Smithsonian
70
  2: New York University
71
  3: Institute for Advanced Study
72
  4: Flatiron Institute
@@ -77,12 +74,52 @@ papers:
77
  abstract: To extract information from the clustering of galaxies on non-linear scales, we need to model the connection between galaxies and halos accurately and in a flexible manner. Standard halo occupation distribution (HOD) models make the assumption that the galaxy occupation in a halo is a function of only its mass, however, in reality, the occupation can depend on various other parameters including halo concentration, assembly history, environment, spin, etc. Using the IllustrisTNG hydrodynamic simulation as our target, we show that machine learning tools can be used to capture this high-dimensional dependence and provide more accurate galaxy occupation models. Specifically, we use a random forest regressor to identify which secondary halo parameters best model the galaxy-halo connection and symbolic regression to augment the standard HOD model with simple equations capturing the dependence on those parameters, namely the local environmental overdensity and shear, at the location of a halo. This not only provides insights into the galaxy-formation relationship but, more importantly, improves the clustering statistics of the modeled galaxies significantly. Our approach demonstrates that machine learning tools can help us better understand and model the galaxy-halo connection, and are therefore useful for galaxy formation and cosmology studies from upcoming galaxy surveys.
78
  image: hod_importances.png
79
  date: 2021-11-03
80
-
81
-
82
-
83
-
84
-
85
-
86
- # To add:
87
- # https://arxiv.org/abs/2109.04484v1 - astrophysics paper, where they use PySR to discover a more accurate model for the properties of dark matter subhalos in an interpretable way.
88
- # https://arxiv.org/abs/2012.00111 - astrophysics paper, where they use PySR to model assembly bias, and recover a new interpretable model for doing so.
1
  # This file stores papers which have used PySR, with
2
  # information to generate the "Research Showcase"
3
 
 
4
  papers:
5
+ - title: Modeling assembly bias with machine learning and symbolic regression
6
+ authors:
7
+ - Digvijay Wadekar (1)
8
+ - Francisco Villaescusa-Navarro (2,3)
9
+ - Shirley Ho (2,3,4)
10
+ - Laurence Perreault-Levasseur (3,5,6)
11
+ affiliations:
12
+ 1: New York University
13
+ 2: Princeton University
14
+ 3: Flatiron Institute
15
+ 4: Carnegie Mellon University
16
+ 5: Université de Montréal
17
+ 6: Mila
18
+ link: https://arxiv.org/abs/2012.00111
19
+ abstract: "Upcoming 21cm surveys will map the spatial distribution of cosmic neutral hydrogen (HI) over unprecedented volumes. Mock catalogues are needed to fully exploit the potential of these surveys. Standard techniques employed to create these mock catalogs, like Halo Occupation Distribution (HOD), rely on assumptions such as the baryonic properties of dark matter halos only depend on their masses. In this work, we use the state-of-the-art magneto-hydrodynamic simulation IllustrisTNG to show that the HI content of halos exhibits a strong dependence on their local environment. We then use machine learning techniques to show that this effect can be 1) modeled by these algorithms and 2) parametrized in the form of novel analytic equations. We provide physical explanations for this environmental effect and show that ignoring it leads to underprediction of the real-space 21-cm power spectrum at k≳0.05 h/Mpc by ≳10%, which is larger than the expected precision from upcoming surveys on such large scales. Our methodology of combining numerical simulations with machine learning techniques is general, and opens a new direction at modeling and parametrizing the complex physics of assembly bias needed to generate accurate mocks for galaxy and line intensity mapping surveys."
20
+ image: hi_mass.png
21
+ date: 2020-11-30
22
  - title: Machine Learning the Gravity Equation for International Trade
23
  authors:
24
  - Sergiy Verstyuk (1)
 
54
  abstract: We present a simple phenomenological formula which approximates the hyperbolic volume of a knot using only a single evaluation of its Jones polynomial at a root of unity. The average error is just 2.86% on the first 1.7 million knots, which represents a large improvement over previous formulas of this kind. To find the approximation formula, we use layer-wise relevance propagation to reverse engineer a black box neural network which achieves a similar average error for the same approximation task when trained on 10% of the total dataset. The particular roots of unity which appear in our analysis cannot be written as e^{2πi/(k+2)} with integer k; therefore, the relevant Jones polynomial evaluations are not given by unknot-normalized expectation values of Wilson loop operators in conventional SU(2) Chern-Simons theory with level k. Instead, they correspond to an analytic continuation of such expectation values to fractional level. We briefly review the continuation procedure and comment on the presence of certain Lefschetz thimbles, to which our approximation formula is sensitive, in the analytically continued Chern-Simons integration cycle.
55
  image: hyperbolic_volume.png
56
  date: 2021-06-07
57
  - title: Modeling the galaxy-halo connection with machine learning
58
  authors:
59
  - Ana Maria Delgado (1)
 
63
  - Lars Hernquist (1)
64
  - Shirley Ho (2,4,5,6)
65
  affiliations:
66
+ 1: "Center for Astrophysics | Harvard & Smithsonian"
67
  2: New York University
68
  3: Institute for Advanced Study
69
  4: Flatiron Institute
 
74
  abstract: To extract information from the clustering of galaxies on non-linear scales, we need to model the connection between galaxies and halos accurately and in a flexible manner. Standard halo occupation distribution (HOD) models make the assumption that the galaxy occupation in a halo is a function of only its mass, however, in reality, the occupation can depend on various other parameters including halo concentration, assembly history, environment, spin, etc. Using the IllustrisTNG hydrodynamic simulation as our target, we show that machine learning tools can be used to capture this high-dimensional dependence and provide more accurate galaxy occupation models. Specifically, we use a random forest regressor to identify which secondary halo parameters best model the galaxy-halo connection and symbolic regression to augment the standard HOD model with simple equations capturing the dependence on those parameters, namely the local environmental overdensity and shear, at the location of a halo. This not only provides insights into the galaxy-formation relationship but, more importantly, improves the clustering statistics of the modeled galaxies significantly. Our approach demonstrates that machine learning tools can help us better understand and model the galaxy-halo connection, and are therefore useful for galaxy formation and cosmology studies from upcoming galaxy surveys.
75
  image: hod_importances.png
76
  date: 2021-11-03
77
+ - title: Finding universal relations in subhalo properties with artificial intelligence
78
+ authors:
79
+ - Helen Shao (1)
80
+ - Francisco Villaescusa-Navarro (1,2)
81
+ - Shy Genel (2,3)
82
+ - David N. Spergel (2,1)
83
+ - Daniel Angles-Alcazar (4,2)
84
+ - Lars Hernquist (5)
85
+ - Romeel Dave (6,7,8)
86
+ - Desika Narayanan (9,10)
87
+ - Gabriella Contardo (2)
88
+ - Mark Vogelsberger (11)
89
+ affiliations:
90
+ 1: Princeton University
91
+ 2: Flatiron Institute
92
+ 3: Columbia University
93
+ 4: University of Connecticut
94
+ 5: "Center for Astrophysics | Harvard & Smithsonian"
95
+ 6: University of Edinburgh
96
+ 7: University of the Western Cape
97
+ 8: South African Astronomical Observatory
98
+ 9: University of Florida
99
+ 10: University of Florida Informatics Institute
100
+ 11: MIT
101
+ link: https://arxiv.org/abs/2109.04484v1
102
+ abstract: "We use a generic formalism designed to search for relations in high-dimensional spaces to determine if the total mass of a subhalo can be predicted from other internal properties such as velocity dispersion, radius, or star-formation rate. We train neural networks using data from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project and show that the model can predict the total mass of a subhalo with high accuracy: more than 99% of the subhalos have a predicted mass within 0.2 dex of their true value. The networks exhibit surprising extrapolation properties, being able to accurately predict the total mass of any type of subhalo containing any kind of galaxy at any redshift from simulations with different cosmologies, astrophysics models, subgrid physics, volumes, and resolutions, indicating that the network may have found a universal relation. We then use different methods to find equations that approximate the relation found by the networks and derive new analytic expressions that predict the total mass of a subhalo from its radius, velocity dispersion, and maximum circular velocity. We show that in some regimes, the analytic expressions are more accurate than the neural networks. We interpret the relation found by the neural network and approximated by the analytic equation as being connected to the virial theorem."
103
+ date: 2021-09-09
104
+ image: illustris_example.png
105
+ - title: Rediscovering orbital mechanics with machine learning
106
+ authors:
107
+ - Pablo Lemos (1,2)
108
+ - Niall Jeffrey (3,2)
109
+ - Miles Cranmer (4)
110
+ - Shirley Ho (4,5,6,7)
111
+ - Peter Battaglia (8)
112
+ affiliations:
113
+ 1: University of Sussex
114
+ 2: University College London
115
+ 3: ENS
116
+ 4: Princeton University
117
+ 5: Flatiron Institute
118
+ 6: Carnegie Mellon University
119
+ 7: New York University
120
+ 8: DeepMind
121
+ link: https://arxiv.org/abs/2202.02306
122
+ abstract: "We present an approach for using machine learning to automatically discover the governing equations and hidden properties of real physical systems from observations. We train a \"graph neural network\" to simulate the dynamics of our solar system's Sun, planets, and large moons from 30 years of trajectory data. We then use symbolic regression to discover an analytical expression for the force law implicitly learned by the neural network, which our results showed is equivalent to Newton's law of gravitation. The key assumptions that were required were translational and rotational equivariance, and Newton's second and third laws of motion. Our approach correctly discovered the form of the symbolic force law. Furthermore, our approach did not require any assumptions about the masses of planets and moons or physical constants. They, too, were accurately inferred through our methods. Though, of course, the classical law of gravitation has been known since Isaac Newton, our result serves as a validation that our method can discover unknown laws and hidden properties from observed data. More broadly this work represents a key step toward realizing the potential of machine learning for accelerating scientific discovery."
123
+ image: rediscovering_gravity.png
124
+ date: 2022-02-04
125
+ link: https://arxiv.org/abs/2202.02306