MilesCranmer committed
Commit 6f11ae4 (1 parent: a4eb420)
Update docstring on README
README.md CHANGED
@@ -196,17 +196,24 @@ which is `hall_of_fame.csv` by default. It also prints the
 equations to stdout.

 ```python
-pysr(X=None, y=None, weights=None, procs=4, niterations=100, ncyclesperiteration=300, binary_operators=["plus", "mult"], unary_operators=["cos", "exp", "sin"], alpha=0.1, annealing=True, fractionReplaced=0.10, fractionReplacedHof=0.10, npop=1000, parsimony=1e-4, migration=True, hofMigration=True, shouldOptimizeConstants=True, topn=10, weightAddNode=1, weightInsertNode=3, weightDeleteNode=3, weightDoNothing=1, weightMutateConstant=10, weightMutateOperator=1, weightRandomize=1, weightSimplify=0.01, perturbationFactor=1.0, nrestarts=3, timeout=None, equation_file='hall_of_fame.csv', test='simple1', verbosity=1e9, maxsize=20)
 ```

 Run symbolic regression to fit f(X[i, :]) ~ y[i] for all i.

 **Arguments**:

-- `X`: np.ndarray, 2D array. Rows are examples,
 - `y`: np.ndarray, 1D array. Rows are examples.
-- `weights`: np.ndarray, 1D array.
--
 - `niterations`: int, Number of iterations of the algorithm to run. The best
 equations are printed, and migrate between populations, at the
 end of each.
@@ -248,6 +255,18 @@ constant parts by evaluation
 - `equation_file`: str, Where to save the files (.csv separated by |)
 - `test`: str, What test to run, if X,y not passed.
 - `maxsize`: int, Max size of an equation.

 **Returns**:

 equations to stdout.

 ```python
+pysr(X=None, y=None, weights=None, procs=4, populations=None, niterations=100, ncyclesperiteration=300, binary_operators=["plus", "mult"], unary_operators=["cos", "exp", "sin"], alpha=0.1, annealing=True, fractionReplaced=0.10, fractionReplacedHof=0.10, npop=1000, parsimony=1e-4, migration=True, hofMigration=True, shouldOptimizeConstants=True, topn=10, weightAddNode=1, weightInsertNode=3, weightDeleteNode=3, weightDoNothing=1, weightMutateConstant=10, weightMutateOperator=1, weightRandomize=1, weightSimplify=0.01, perturbationFactor=1.0, nrestarts=3, timeout=None, extra_sympy_mappings={}, equation_file='hall_of_fame.csv', test='simple1', verbosity=1e9, maxsize=20, fast_cycle=False, maxdepth=None, variable_names=[], select_k_features=None, threads=None, julia_optimization=3)
 ```

 Run symbolic regression to fit f(X[i, :]) ~ y[i] for all i.
+Note: most default parameters have been tuned over several example
+equations, but you should adjust `threads`, `niterations`,
+`binary_operators`, `unary_operators` to your requirements.
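
As a rough illustration of the note on adjusting parameters, the sketch below builds a small toy dataset and collects the four parameters the note names into one dictionary. It assumes the function is importable as `from pysr import pysr` and that NumPy and a working Julia backend are installed. `make_dataset`, `search_kwargs`, and `run_search` are hypothetical helper names, and the values shown are placeholders rather than recommendations.

```python
import math
import random


def make_dataset(n_rows=100, seed=0):
    """Toy data with 5 features per row: y = 2*cos(x3) + x0**2 - 2."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 2.0) for _ in range(5)] for _ in range(n_rows)]
    y = [2.0 * math.cos(row[3]) + row[0] ** 2 - 2.0 for row in X]
    return X, y


# The four parameters the note singles out; values are illustrative only.
search_kwargs = dict(
    threads=2,
    niterations=5,
    binary_operators=["plus", "mult"],
    unary_operators=["cos", "exp", "sin"],
)


def run_search(X, y, **kwargs):
    """Call pysr with the given settings (assumes pysr, numpy, and Julia are set up)."""
    import numpy as np
    from pysr import pysr  # assumed import path

    return pysr(X=np.array(X), y=np.array(y), **kwargs)


X, y = make_dataset()
# equations = run_search(X, y, **search_kwargs)  # uncomment once pysr is installed
```

Keeping the settings in one dictionary makes it easy to vary `niterations` or the operator lists between runs without touching the call itself.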

 **Arguments**:

+- `X`: np.ndarray or pandas.DataFrame, 2D array. Rows are examples,
+columns are features. If a pandas DataFrame is passed, its column names are used
+as variable names (so make sure they don't contain spaces).
 - `y`: np.ndarray, 1D array. Rows are examples.
+- `weights`: np.ndarray, 1D array. Each entry is the weight applied to the
+mean-square-error loss for the corresponding example.
+- `procs`: int, Number of processes (=number of populations running).
+- `populations`: int, Number of populations running; defaults to `procs`.
 - `niterations`: int, Number of iterations of the algorithm to run. The best
 equations are printed, and migrate between populations, at the
 end of each.
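
To make the role of `weights` concrete, here is a minimal sketch of a per-example weighted mean-square error. The normalization by the sum of the weights is an assumption (one common convention); the docstring does not state which normalization the library uses, and `weighted_mse` is a hypothetical helper, not part of the package.

```python
def weighted_mse(predictions, targets, weights=None):
    """Weighted mean-square error, normalized by the sum of the weights.

    With weights=None every example counts equally (plain MSE).
    """
    if weights is None:
        weights = [1.0] * len(targets)
    if not (len(predictions) == len(targets) == len(weights)):
        raise ValueError("predictions, targets, and weights must have equal length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sq_err = sum(
        w * (p - t) ** 2 for p, t, w in zip(predictions, targets, weights)
    )
    return weighted_sq_err / total_weight


# Up-weighting the first example pulls the loss toward its (smaller) error.
plain = weighted_mse([1.0, 2.0], [0.0, 0.0])
skewed = weighted_mse([1.0, 2.0], [0.0, 0.0], weights=[3.0, 1.0])
```

A larger weight makes the fit care more about that row, which is useful when some measurements are more trustworthy than others.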
 - `equation_file`: str, Where to save the files (.csv separated by |)
 - `test`: str, What test to run, if X,y not passed.
 - `maxsize`: int, Max size of an equation.
+- `maxdepth`: int, Max depth of an equation. You can use both maxsize and maxdepth.
+By default, maxdepth is set equal to maxsize, which makes it redundant.
+- `fast_cycle`: bool, (experimental) - batch over population subsamples. This
+is a slightly different algorithm than regularized evolution, but runs cycles
+15% faster. May be algorithmically less efficient.
+- `variable_names`: list, a list of names for the variables, other
+than "x0", "x1", etc.
+- `select_k_features`: (None, int), whether to run feature selection in
+Python using random forests, before passing to the symbolic regression
+code. None means no feature selection; an int means select that many
+features.
+- `julia_optimization`: int, Optimization level (0, 1, 2, 3)
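
The difference between `maxsize` and `maxdepth` can be sketched with a toy expression tree. The nested-tuple representation and the helpers `size`, `depth`, and `within_limits` are hypothetical and only for illustration; they are not the library's internal format. The sketch also assumes that size counts every operator and leaf, and that depth counts nodes along the longest root-to-leaf path.

```python
# An expression is either a leaf (a string) or a tuple: (operator, child, ...).
expr = ("plus", ("mult", "x0", "x0"), ("cos", "x3"))  # x0*x0 + cos(x3)


def size(node):
    """Count every operator and leaf in the tree."""
    if isinstance(node, str):
        return 1
    return 1 + sum(size(child) for child in node[1:])


def depth(node):
    """Count the nodes on the longest root-to-leaf path."""
    if isinstance(node, str):
        return 1
    return 1 + max(depth(child) for child in node[1:])


def within_limits(node, maxsize=20, maxdepth=None):
    """Check both limits; maxdepth falls back to maxsize, as documented."""
    if maxdepth is None:
        maxdepth = maxsize
    return size(node) <= maxsize and depth(node) <= maxdepth


expr_size = size(expr)
expr_depth = depth(expr)
```

Under these assumptions the depth of a tree can never exceed its size, which is why leaving `maxdepth` at its default adds no extra constraint. Setting it lower than `maxsize` rules out long chains such as `cos(cos(cos(x0)))` while still allowing wide, shallow expressions.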

 **Returns**:
