Spaces:
Running
Features and Options
Some configurable features and options in PySR
which you
may find useful include:
model_selection
binary_operators
,unary_operators
niterations
ncyclesperiteration
procs
populations
weights
maxsize
,maxdepth
batching
,batchSize
variable_names
(or pandas input)- Constraining operator complexity
- LaTeX, SymPy
- Callable exports: numpy, pytorch, jax
loss
These are described below
The program will output a pandas DataFrame containing the equations
to PySRRegressor.equations
containing the loss value
and complexity.
It will also dump to a csv
at the end of every iteration,
which is hall_of_fame_{date_time}.csv
by default.
It also prints the equations to stdout.
Model selection
By default, PySRRegressor
uses model_selection='best'
which selects an equation from PySRRegressor.equations
using
a combination of accuracy and complexity.
You can also select model_selection='accuracy'
.
By printing a model (i.e., print(model)
), you can see
the equation selection with the arrow shown in the pick
column.
Operators
A list of operators can be found on the operators page. One can define custom operators in Julia by passing a string:
PySRRegressor(niterations=100,
binary_operators=["mult", "plus", "special(x, y) = x^2 + y"],
extra_sympy_mappings={'special': lambda x, y: x**2 + y},
unary_operators=["cos"])
Now, the symbolic regression code can search using this special
function
that squares its left argument and adds it to its right. Make sure
all passed functions are valid Julia code, and take one (unary)
or two (binary) float32 scalars as input, and output a float32. This means if you
write any real constants in your operator, like 2.5
, you have to write them
instead as 2.5f0
, which defines it as Float32
.
Operators are automatically vectorized.
One should also define extra_sympy_mappings
,
so that the SymPy code can understand the output equation from Julia,
when constructing a useable function. This step is optional, but
is necessary for the lambda_format
to work.
Iterations
This is the total number of generations that pysr
will run for.
I usually set this to a large number, and exit when I am satisfied
with the equations.
Cycles per iteration
Each cycle considers every 10-equation subsample (re-sampled for each individual 10,
unless fast_cycle
is set in which case the subsamples are separate groups of equations)
a single time, producing one mutated equation for each.
The parameter ncyclesperiteration
defines how many times this
occurs before the equations are compared to the hall of fame,
and new equations are migrated from the hall of fame, or from other populations.
It also controls how slowly annealing occurs. You may find that increasing
ncyclesperiteration
results in a higher cycles-per-second, as the head
worker needs to reduce and distribute new equations less often, and also increases
diversity. But at the same
time, a smaller number it might be that migrating equations from the hall of fame helps
each population stay closer to the best current equations.
Processors
One can adjust the number of workers used by Julia with the
procs
option. You should set this equal to the number of cores
you want pysr
to use.
Populations
By default, populations=20
, but you can set a different
number of populations with this option.
More populations may increase
the diversity of equations discovered, though will take longer to train.
However, it is usually more efficient to have populations>procs
,
as there are multiple populations running
on each core.
Weighted data
Here, we assign weights to each row of data using inverse uncertainty squared. We also use 10 processes instead of the usual 4, which creates more populations (one population per thread).
sigma = ...
weights = 1/sigma**2
model = PySRRegressor(procs=10)
model.fit(X, y, weights=weights)
Max size
maxsize
controls the maximum size of equation (number of operators,
constants, variables). maxdepth
is by default not used, but can be set
to control the maximum depth of an equation. These will make processing
faster, as longer equations take longer to test.
One can warm up the maxsize from a small number to encourage
PySR to start simple, by using the warmupMaxsize
argument.
This specifies that maxsize increases every warmupMaxsize
.
Batching
One can turn on mini-batching, with the batching
flag,
and control the batch size with batchSize
. This will make
evolution faster for large datasets. Equations are still evaluated
on the entire dataset at the end of each iteration to compare to the hall
of fame, but only on a random subset during mutations and annealing.
Variable Names
You can pass a list of strings naming each column of X
with
variable_names
. Alternatively, you can pass X
as a pandas dataframe
and the columns will be used as variable names. Make sure only
alphabetical characters and _
are used in these names.
Constraining operator complexity
One can limit the complexity of specific operators with the constraints
parameter.
There is a "maxsize" parameter to PySR, but there is also an operator-level
"constraints" parameter. One supplies a dict, like so:
constraints={'pow': (-1, 1), 'mult': (3, 3), 'cos': 5}
What this says is that: a power law x^y can have an expression of arbitrary (-1) complexity in the x, but only complexity 1 (e.g., a constant or variable) in the y. So (x0 + 3)^5.5 is allowed, but 5.5^(x0 + 3) is not. I find this helps a lot for getting more interpretable equations. The other terms say that each multiplication can only have sub-expressions of up to complexity 3 (e.g., 5.0 + x2) in each side, and cosine can only operate on expressions of complexity 5 (e.g., 5.0 + x2 exp(x3)).
LaTeX, SymPy
After running model.fit(...)
, you can look at
model.equations
which is a pandas dataframe.
The sympy_format
column gives sympy equations,
and the lambda_format
gives callable functions.
You can optionally pass a pandas dataframe to the callable function,
if you called .fit
on a pandas dataframe as well.
There are also some helper functions for doing this quickly.
model.latex()
will generate a TeX formatted output of your equation.model.sympy()
will return the SymPy representation.model.jax()
will return a callable JAX function combined with parameters (see below)model.pytorch()
will return a PyTorch model (see below).
Callable exports: numpy, pytorch, jax
By default, the dataframe of equations will contain columns
with the identifier lambda_format
.
These are simple functions which correspond to the equation, but executed
with numpy functions.
You can pass your X
matrix to these functions
just as you did to the model.fit
call. Thus, this allows
you to numerically evaluate the equations over different output.
Calling model.predict
will execute the lambda_format
of
the best equation, and return the result. If you selected
model_selection="best"
, this will use an equation that combines
accuracy with simplicity. For model_selection="accuracy"
, this will just
look at accuracy.
One can do the same thing for PyTorch, which uses code from sympytorch, and for JAX, which uses code from sympy2jax.
Calling model.pytorch()
will return
a PyTorch module which runs the equation, using PyTorch functions,
over X
(as a PyTorch tensor). This is differentiable, and the
parameters of this PyTorch module correspond to the learned parameters
in the equation, and are trainable.
output = model.pytorch()
output['callable'](X)
For JAX, you can equivalently set the argument output_jax_format=True
.
This will return a dictionary containing a 'callable'
(a JAX function),
and 'parameters'
(a list of parameters in the equation).
You can execute this function with:
output = model.jax()
output['callable'](X, output['parameters'])
Since the parameter list is a jax array, this therefore lets you also train the parameters within JAX (and is differentiable).
loss
The default loss is mean-square error, and weighted mean-square error.
One can pass an arbitrary Julia string to define a custom loss, using,
e.g., loss="myloss(x, y) = abs(x - y)^1.5"
. For more details,
see the
Losses
page for SymbolicRegression.jl.
Here are some additional examples:
abs(x-y) loss
pysr(..., loss="f(x, y) = abs(x - y)^1.5")
Note that the function name doesn't matter:
pysr(..., loss="loss(x, y) = abs(x * y)")
With weights:
pysr(..., weights=weights, loss="myloss(x, y, w) = w * abs(x - y)")
Weights can be used in arbitrary ways:
pysr(..., weights=weights, loss="myloss(x, y, w) = abs(x - y)^2/w^2")
Built-in loss (faster) (see losses). This one computes the L3 norm:
pysr(..., loss="LPDistLoss{3}()")
Can also uses these losses for weighted (weighted-average):
pysr(..., weights=weights, loss="LPDistLoss{3}()")