Table of Contents
- Table of Contents
- main
- PINN
- PINN.pinns
- utils
- utils.test
- utils.dataset_loader - get_dataset
- utils.ndgan
- utils.data_augmentation
- nets
- nets.envs
- nets.dense
- nets.design - B_field_norm - PUdesign
- nets.deep_dense
- nets.opti
- nets.opti.blackbox
main
PINN
PINN.pinns
PINNd_p Objects
class PINNd_p(nn.Module)
$d \mapsto P$
forward
def forward(x)
$P,U$ input, $d$ output
Arguments:
x
tensor - input tensor containing $P, U$
Returns:
tensor
- predicted $d$
PINNhd_ma Objects
class PINNhd_ma(nn.Module)
$h,d \mapsto m_a $
PINNT_ma Objects
class PINNT_ma(nn.Module)
$ m_a, U \mapsto T$
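The layer sizes of these PINN modules are not documented above; purely as an illustration, a module such as PINNd_p (which, per its forward docstring, maps a $P, U$ pair to $d$) could be sketched like this. The hidden width and activation are assumptions, not the actual implementation:

```python
import torch.nn as nn

class PINNd_p_sketch(nn.Module):
    """Illustrative stand-in for PINNd_p: maps a (P, U) pair to d."""

    def __init__(self, hidden: int = 64):  # hidden width is assumed
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, hidden),  # two inputs: P and U
            nn.Tanh(),             # activation is assumed
            nn.Linear(hidden, 1),  # one output: d
        )

    def forward(self, x):
        return self.net(x)
```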
utils
utils.test
utils.dataset_loader
get_dataset
def get_dataset(raw: bool = False,
sample_size: int = 1000,
name: str = 'dataset.pkl',
source: str = 'dataset.csv',
boundary_conditions: list = None) -> _pickle
Gets augmented dataset
Arguments:
raw
bool, optional - whether to use the raw source data or the augmented dataset. Defaults to False.
sample_size
int, optional - sample size. Defaults to 1000.
name
str, optional - name of the dataset to load. Defaults to 'dataset.pkl'.
boundary_conditions
list, optional - y1, y2, x1, x2.
Returns:
_pickle
- pickle buffer
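A minimal usage sketch (the import path follows the module name in the table of contents; the boundary values are illustrative):

```python
from utils.dataset_loader import get_dataset

# Loads the augmented dataset; if 'dataset.pkl' does not exist yet,
# it is generated from 'dataset.csv' with the parameters below.
data = get_dataset(
    raw=False,            # use augmented data rather than the raw source
    sample_size=1000,
    name='dataset.pkl',
    source='dataset.csv',
    boundary_conditions=[0.0, 1.0, 0.0, 1.0],  # [y1, y2, x1, x2], illustrative
)
```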
utils.ndgan
DCGAN Objects
class DCGAN()
__init__
def __init__(latent, data)
Takes two arguments, the latent space dimension and the dataframe. Stores the latent space dimension, the dataframe, and the number of inputs and outputs, then builds the generator and discriminator models.
Arguments:
latent
: The number of dimensions in the latent space
data
: The dataframe containing the data we want to generate more of
define_discriminator
def define_discriminator(inputs=8)
The discriminator is a neural network that takes in a vector of length 8 and outputs a single
value between 0 and 1
Arguments:
inputs
: number of features in the dataset, defaults to 8 (optional)
Returns:
The model is being returned.
define_generator
def define_generator(latent_dim, outputs=8)
Takes the latent dimension and the number of outputs and returns a model with two hidden layers and an output layer.
Arguments:
latent_dim
: The dimension of the latent space, i.e. the input space of the generator
outputs
: the number of outputs of the generator, defaults to 8 (optional)
Returns:
The model is being returned.
build_models
def build_models()
The function returns the generator and discriminator models
Returns:
The generator and discriminator models are being returned.
generate_latent_points
def generate_latent_points(latent_dim, n)
Generate random points in latent space as input for the generator
Arguments:
latent_dim
: the dimension of the latent space, which is the input to the generator
n
: the number of samples to generate
Returns:
A numpy array of random numbers.
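A sketch of how such a helper is commonly written (not necessarily the exact implementation used here):

```python
import numpy as np

def generate_latent_points_sketch(latent_dim, n):
    # Draw n * latent_dim standard-normal values and reshape them into
    # a batch of n latent vectors for the generator.
    x_input = np.random.randn(latent_dim * n)
    return x_input.reshape(n, latent_dim)
```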
generate_fake_samples
def generate_fake_samples(generator, latent_dim, n)
It generates a batch of fake samples with class labels
Arguments:
generator
: The generator model that we will train
latent_dim
: The dimension of the latent space, e.g. 100
n
: The number of samples to generate
Returns:
x contains the generated samples and y contains the corresponding labels.
define_gan
def define_gan(generator, discriminator)
Takes a generator and a discriminator, sets the discriminator to be untrainable, and stacks both into a sequential model. The sequential model is compiled with the Adam optimizer (a gradient-descent variant) and binary cross-entropy loss, which suits the real/fake classification task. The function then returns the GAN.
Arguments:
generator
: The generator model
discriminator
: The discriminator model that takes in a dataset and outputs a single value representing fake/real
Returns:
The model is being returned.
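Based on the description above, a hedged Keras sketch of what define_gan amounts to (the actual code may differ in details such as the optimizer configuration):

```python
from keras.models import Sequential
from keras.optimizers import Adam

def define_gan_sketch(generator, discriminator):
    # Freeze the discriminator so that only the generator is updated
    # when the combined model is trained.
    discriminator.trainable = False
    model = Sequential()
    model.add(generator)
    model.add(discriminator)
    # Binary cross-entropy suits the real/fake classification output.
    model.compile(loss='binary_crossentropy', optimizer=Adam())
    return model
```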
summarize_performance
def summarize_performance(epoch, generator, discriminator, latent_dim, n=200)
This function evaluates the discriminator on real and fake data, and plots the real and fake
data
Arguments:
epoch
: the current epoch number, used for reporting
generator
: the generator model
discriminator
: the discriminator model
latent_dim
: The dimension of the latent space
n
: number of samples to generate, defaults to 200 (optional)
train_gan
def train_gan(g_model,
d_model,
gan_model,
latent_dim,
num_epochs=2500,
num_eval=2500,
batch_size=2)
Arguments:
g_model
: the generator modeld_model
: The discriminator modelgan_model
: The GAN model, which is the generator model combined with the discriminator modellatent_dim
: The dimension of the latent space. This is the number of random numbers that the generator model will take as inputnum_epochs
: The number of epochs to train for, defaults to 2500 (optional)num_eval
: number of epochs to run before evaluating the model, defaults to 2500 (optional)batch_size
: The number of samples to use for each gradient update, defaults to 2 (optional)
start_training
def start_training()
Passes the generator, discriminator, and GAN models, together with the latent dimension, to train_gan and starts training.
predict
def predict(n)
Uses the generator model and the latent space to produce and return a batch of fake samples.
Arguments:
n
: the number of samples to generate
Returns:
the generated fake samples.
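Putting the class together, a usage sketch (the constructor signature is taken from __init__ above; the latent size and file path are illustrative):

```python
import pandas as pd
from utils.ndgan import DCGAN

df = pd.read_csv('dataset.csv')   # source data, path is illustrative

gan = DCGAN(latent=10, data=df)   # latent dimension is an assumed value
gan.start_training()              # trains generator, discriminator, and the GAN
fake = gan.predict(500)           # a batch of 500 synthetic samples
```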
utils.data_augmentation
dataset Objects
class dataset()
Creates dataset from input source
__init__
def __init__(number_samples: int,
name: str,
source: str,
boundary_conditions: list = None)
Arguments:
number_samples
int - number of samples to be generated
name
str - name of the dataset
source
str - source file
boundary_conditions
list - y1, y2, x1, x2
generate
def generate()
Normalizes the input dataframe and trains a DCGAN on it. The DCGAN is a type of generative adversarial network (GAN) used to generate new data. Once trained, the DCGAN generates new samples, which are concatenated with the original dataframe; the combined dataframe is saved as a pickle file and returned.
Returns:
The dataframe is being returned.
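A usage sketch, assuming the class is imported from utils.data_augmentation as listed in the table of contents:

```python
from utils.data_augmentation import dataset

ds = dataset(
    number_samples=1000,
    name='dataset.pkl',
    source='dataset.csv',
    boundary_conditions=None,
)
augmented_df = ds.generate()  # trains the DCGAN and returns the augmented dataframe
```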
nets
nets.envs
SCI Objects
class SCI()
Scaled computing interface.
Arguments:
hidden_dim
int, optional - Maximum dimension of the hidden linear layer. Defaults to 200. Should be >80 in the non-1D case.
dropout
bool, optional - LEGACY, don't use. Defaults to True.
epochs
int, optional - Epochs can be specified here, but it is better to pass them to train. Defaults to 10.
dataset
str, optional - Dataset to be selected from ./data. Defaults to 'test.pkl'. If the name does not exist, the code generates a new dataset with the parameters below.
sample_size
int, optional - Number of samples to be generated (note: BEFORE applying boundary conditions). Defaults to 1000.
source
str, optional - Source from which the data will be generated. Better not to change. Defaults to 'dataset.csv'.
boundary_conditions
list, optional - If specified, the whole dataset is cut rectangularly. The input list is of the form [ymin, ymax, xmin, xmax]. Defaults to None.
__init__
def __init__(hidden_dim: int = 200,
dropout: bool = True,
epochs: int = 10,
dataset: str = 'test.pkl',
sample_size: int = 1000,
source: str = 'dataset.csv',
boundary_conditions: list = None,
batch_size: int = 20)
Arguments:
hidden_dim
int, optional - Maximum dimension of the hidden linear layer. Defaults to 200. Should be >80 in the non-1D case.
dropout
bool, optional - LEGACY, don't use. Defaults to True.
epochs
int, optional - Epochs can be specified here, but it is better to pass them to train. Defaults to 10.
dataset
str, optional - Dataset to be selected from ./data. Defaults to 'test.pkl'. If the name does not exist, the code generates a new dataset with the parameters below.
sample_size
int, optional - Number of samples to be generated (note: BEFORE applying boundary conditions). Defaults to 1000.
source
str, optional - Source from which the data will be generated. Better not to change. Defaults to 'dataset.csv'.
boundary_conditions
list, optional - If specified, the whole dataset is cut rectangularly. The input list is of the form [ymin, ymax, xmin, xmax]. Defaults to None.
batch_size
int, optional - Batch size for training.
feature_gen
def feature_gen(base: bool = True,
fname: str = None,
index: int = None,
func=None) -> None
Generates new features. If base is True, the most obvious derived features are generated. You can also add a custom feature by giving the name of the new column (fname), the index of the parent column (index), and a lambda function (func) to be applied elementwise; see the sketch after the argument list.
Arguments:
base
bool, optional - Defaults to True.
fname
str, optional - Name of new column. Defaults to None.
index
int, optional - Index of parent column. Defaults to None.
func
type, optional - lambda function. Defaults to None.
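For example (the column name, index, and lambda are illustrative; sci is an SCI instance):

```python
sci = SCI()

# Generate the default derived features
sci.feature_gen()

# Add a custom feature: square column 2 elementwise into a new column 'x2_squared'
sci.feature_gen(base=False, fname='x2_squared', index=2, func=lambda v: v ** 2)
```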
feature_importance
def feature_importance(X: pd.DataFrame, Y: pd.Series, verbose: int = 1)
Gets feature importance via SGD regression and score-based selection. The default threshold is 1.25 * mean. Pass X as self.df.iloc[:, (columns of choice)] and Y as self.df.iloc[:, (column of choice)].
Arguments:
X
pd.DataFrame - pandas DataFrame of features
Y
pd.Series - pandas Series of targets
verbose
int, optional - whether or not to print the report. Defaults to 1.
Returns:
Report (str)
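Continuing the sci instance from the previous sketch, an illustrative call with X and Y sliced from the internal dataframe (the column indexes are placeholders):

```python
report = sci.feature_importance(
    X=sci.df.iloc[:, 1:7],  # feature columns of choice
    Y=sci.df.iloc[:, 7],    # target column of choice
    verbose=1,
)
```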
data_flow
def data_flow(columns_idx: tuple = (1, 3, 3, 5),
idx: tuple = None,
split_idx: int = 800) -> torch.utils.data.DataLoader
Data preparation pipeline. It is called automatically; don't call it from your code.
Arguments:
columns_idx
tuple, optional - Columns to be selected (sliced 1:2, 3:4) for feature fitting. Defaults to (1,3,3,5).
idx
tuple, optional - 2 or 3 indexes to be selected for feature fitting. Defaults to None. Use either idx or columns_idx (idx for F:R->R, columns_idx for F:R->R2).
split_idx
int - Index at which to split the data for training.
Returns:
torch.utils.data.DataLoader
- Torch native dataloader
init_seed
def init_seed(seed)
Initializes seed for torch - optional
train_epoch
def train_epoch(X, model, loss_function, optim)
Inner function of class - don't use.
We iterate through the data, calculate the loss, backpropagate, and update the weights
Arguments:
X
: the training data
model
: the model we're training
loss_function
: the loss function to use
optim
: the optimizer, which is the algorithm that will update the weights of the model
compile
def compile(columns: tuple = None,
idx: tuple = None,
optim: torch.optim = torch.optim.AdamW,
loss: nn = nn.L1Loss,
model: nn.Module = dmodel,
custom: bool = False,
lr: float = 0.0001) -> None
Builds model, loss, optimizer. Has defaults
Arguments:
columns
tuple, optional - Columns to be selected for feature fitting. Defaults to (1,3,3,5).
optim
torch Optimizer - Defaults to AdamW.
loss
torch Loss function (nn) - Defaults to L1Loss.
train
def train(epochs: int = 10) -> None
Trains the model.
If the model is an sklearn instance, .fit() is used instead.
Arguments:
epochs
int, optional - number of epochs to train for. Defaults to 10.
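A typical end-to-end workflow with SCI based on the methods above (the import path follows the section heading; column indexes and hyperparameters are illustrative):

```python
from nets.envs import SCI

sci = SCI(hidden_dim=200, epochs=10, dataset='test.pkl', sample_size=1000)

sci.feature_gen()                    # optional derived features
sci.compile(idx=(1, 3, 7), lr=1e-4)  # build dataloader, model, loss, optimizer
sci.train(epochs=50)                 # fit the model
```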
save
def save(name: str = 'model.pt') -> None
This function saves the model to a file
Arguments:
name
str, optional - The name of the file to save the model to. Defaults to 'model.pt'.
onnx_export
def onnx_export(path: str = './models/model.onnx')
We are exporting the model to the ONNX format, using the input data and the model itself
Arguments:
path
str, optional - The path to save the model to. Defaults to './models/model.onnx'.
jit_export
def jit_export(path: str = './models/model.pt')
Exports a properly defined model to TorchScript (JIT).
Arguments:
path
str, optional - path to models. Defaults to './models/model.pt'.
inference
def inference(X: tensor, model_name: str = None) -> np.ndarray
Inference of (pre-)trained model
Arguments:
X
tensor - your data, in the same domain as the training data
Returns:
np.ndarray
- predictions
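Once trained, the model can be queried and exported; a sketch (the input values are illustrative and must lie in the training domain):

```python
import torch

x = torch.tensor([[0.5, 0.2]])          # a sample point from the training domain
preds = sci.inference(x)                # numpy array of predictions

sci.save('model.pt')                    # plain torch checkpoint
sci.onnx_export('./models/model.onnx')  # ONNX export
sci.jit_export('./models/model.pt')     # TorchScript export
```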
plot
def plot()
If the input and output dimensions are the same, plot the input and output as a scatter plot. If the input and output dimensions are different, plot the first dimension of the input and output as a scatter plot
plot3d
def plot3d(colX=0, colY=1)
Plot of inputs and predicted data in mesh format
Returns:
plotly plot
performance
def performance(c=0.4) -> dict
Automatic APE-based performance if applicable; otherwise returns NaN.
Arguments:
c
float, optional - zero-division-error (ZDE) mitigation constant. Defaults to 0.4.
Returns:
dict
- {'Generator_Accuracy, %':np.mean(a),'APE_abs, %':abs_ape,'Model_APE, %': ape}
performance_super
def performance_super(c=0.4,
real_data_column_index: tuple = (1, 8),
real_data_samples: int = 23,
generated_length: int = 1000) -> dict
Performance by custom parameters. APE loss
Arguments:
c
float, optional - ZDE mitigation constant. Defaults to 0.4.
real_data_column_index
tuple, optional - Defaults to (1,8).
real_data_samples
int, optional - Defaults to 23.
generated_length
int, optional - Defaults to 1000.
Returns:
dict
- {'Generator_Accuracy, %':np.mean(a),'APE_abs, %':abs_ape,'Model_APE, %': ape}
RCI Objects
class RCI(SCI)
Real-values interface; uses different types of NN, with NO scaling. Parent: SCI()
data_flow
def data_flow(columns_idx: tuple = (1, 3, 3, 5),
idx: tuple = None,
split_idx: int = 800) -> torch.utils.data.DataLoader
Data preparation pipeline
Arguments:
columns_idx
tuple, optional - Columns to be selected (sliced 1:2, 3:4) for feature fitting. Defaults to (1,3,3,5).
idx
tuple, optional - 2 or 3 indexes to be selected for feature fitting. Defaults to None. Use either idx or columns_idx (idx for F:R->R, columns_idx for F:R->R2).
split_idx
int - Index at which to split the data for training.
Returns:
torch.utils.data.DataLoader
- Torch native dataloader
compile
def compile(columns: tuple = None,
idx: tuple = (3, 1),
optim: torch.optim = torch.optim.AdamW,
loss: nn = nn.L1Loss,
model: nn.Module = PINNd_p,
lr: float = 0.001) -> None
Builds model, loss, optimizer. Has defaults
Arguments:
columns
tuple, optional - Columns to be selected for feature fitting. Defaults to None.
idx
tuple, optional - Indexes to be selected. Defaults to (3,1).
optim
torch Optimizer
loss
torch Loss function (nn)
plot
def plot()
Plots a 2D plot of predicted vs. real values
performance
def performance(c=0.4) -> dict
RCI performance. APE errors.
Arguments:
c
float, optional - correction constant to mitigate division by 0 error. Defaults to 0.4.
Returns:
dict
- {'Generator_Accuracy, %':np.mean(a),'APE_abs, %':abs_ape,'Model_APE, %': ape}
nets.dense
Net Objects
class Net(nn.Module)
The Net class inherits from the nn.Module class, which has a number of attributes and methods (such as .parameters() and .zero_grad()) that we will be using. You can read more about the nn.Module class in the PyTorch documentation.
__init__
def __init__(input_dim: int = 2, hidden_dim: int = 200)
We create a neural network with two hidden layers, each with hidden_dim neurons, and a ReLU activation
function. The output layer has one neuron and no activation function
Arguments:
input_dim
int, optional - The dimension of the input. Defaults to 2.
hidden_dim
int, optional - The number of neurons in the hidden layer. Defaults to 200.
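A sketch matching that description (two hidden layers of hidden_dim ReLU neurons and a single linear output); the real Net implementation may differ in details:

```python
import torch.nn as nn

class NetSketch(nn.Module):
    def __init__(self, input_dim: int = 2, hidden_dim: int = 200):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),  # single output neuron, no activation
        )

    def forward(self, x):
        return self.layers(x)
```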
nets.design
B_field_norm
def B_field_norm(Bmax: float, L: float, k: int = 16, plot=True) -> np.array
Returns the $B_z$ vector for the MS configuration.
Arguments:
Bmax
float - maximum magnetic field in the thruster
L
float - channel length
k
int, optional - magnetic field profile number. Defaults to 16.
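An illustrative call (the field strength and channel length are placeholder values, not a recommended design):

```python
from nets.design import B_field_norm

# Axial field profile for a 0.05 m channel with a 0.02 T peak, profile number 16
Bz = B_field_norm(Bmax=0.02, L=0.05, k=16, plot=False)
```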
PUdesign
def PUdesign(P: float, U: float) -> pd.DataFrame
Computes design via numerical model, uses fits from PINNs
Arguments:
P
float - description
U
float - description
Returns:
pd.DataFrame
- computed design parameters
nets.deep_dense
dmodel Objects
class dmodel(nn.Module)
__init__
def __init__(in_features=1, hidden_features=200, out_features=1)
Creates a neural network with four layers of 200 neurons each, where each layer takes the output of the previous layer as its input.
Arguments:
in_features
: The number of input features, defaults to 1 (optional)
hidden_features
: the number of neurons in the hidden layers, defaults to 200 (optional)
out_features
: The number of classes for classification (1 for regression), defaults to 1 (optional)
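A sketch matching that description (four stacked linear layers of hidden_features neurons; the activation functions are assumptions, since they are not documented here):

```python
import torch.nn as nn

class dmodel_sketch(nn.Module):
    def __init__(self, in_features=1, hidden_features=200, out_features=1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden_features),
            nn.ReLU(),                                   # activation assumed
            nn.Linear(hidden_features, hidden_features),
            nn.ReLU(),
            nn.Linear(hidden_features, hidden_features),
            nn.ReLU(),
            nn.Linear(hidden_features, out_features),
        )

    def forward(self, x):
        return self.net(x)
```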
nets.opti
nets.opti.blackbox
Hyper Objects
class Hyper(SCI)
Hyperparameter tuning class. Generates the best NN architecture for the task. Inputs are column indexes; idx[-1] is the target value. Based on Optuna algorithms, it is fast and reliable. Outputs are the NN parameters in JSON. Optionally, a full report for every trial is available on neptune.ai.
__init__
def __init__(idx: tuple = (1, 3, 7), *args, **kwargs)
The constructor initializes the Hyper class.
Arguments:
idx
tuple - tuple of integers, the indices of the data to be loaded
define_model
def define_model(trial)
We define a function that takes in a trial object and returns a neural network with the number
of layers, hidden units and activation functions defined by the trial object.
Arguments:
trial
: This is an object that contains the information about the current trial
Returns:
A sequential model with the number of layers, hidden units and activation functions defined by the trial.
objective
def objective(trial)
Defines a model, an optimizer, and a loss function, then trains the model for a number of epochs, reporting the loss at the end of each epoch.
Search space: "optimizer" $\in$ ["Adam", "RMSprop", "SGD", "AdamW", "Adamax", "Adagrad"]; "lr" $\in$ [1e-7, 1e-3], log=True.
Arguments:
trial
: The trial object that is passed to the objective function
Returns:
The accuracy of the model.
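The search space quoted above maps directly onto Optuna's suggest API; a sketch of how such an objective is usually wired up (the training loop and returned metric are placeholders):

```python
import torch

def objective_sketch(trial):
    model = define_model(trial)  # architecture sampled by the trial (see above)

    # Search space as listed in the docstring
    optimizer_name = trial.suggest_categorical(
        "optimizer", ["Adam", "RMSprop", "SGD", "AdamW", "Adamax", "Adagrad"]
    )
    lr = trial.suggest_float("lr", 1e-7, 1e-3, log=True)
    optimizer = getattr(torch.optim, optimizer_name)(model.parameters(), lr=lr)

    # ... training loop goes here: forward pass, loss, optimizer.step() ...
    return 0.0  # placeholder for the reported metric
```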
start_study
def start_study(n_trials: int = 100,
neptune_project: str = None,
neptune_api: str = None)
Takes a number of trials, a Neptune project name, and a Neptune API token, and runs the objective function for the specified number of trials. If the Neptune project and API token are provided, the results are logged to Neptune.
Arguments:
n_trials
int, optional - The number of trials to run. Defaults to 100.
neptune_project
str - the name of the Neptune project to log to
neptune_api
str - your Neptune API key
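Under the hood this kind of method typically builds an Optuna study; a minimal sketch with Neptune logging left out (the optimization direction is an assumption):

```python
import optuna

def start_study_sketch(n_trials: int = 100):
    study = optuna.create_study(direction="minimize")    # direction assumed
    study.optimize(objective_sketch, n_trials=n_trials)  # run the trials
    return study.best_params  # best NN parameters, e.g. to dump as JSON
```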