Space: andreamalhera/igedi

Andrea Maldonado committed on Aug 30, 2024

Commit

8c2c8d0

1 Parent(s): 32309bf

iGEDI release

This view is limited to 50 files because it contains too many changes.

Files changed (50)

README.md +168 -14
config.py +6 -69
config_files/algorithm/augmentation.json +0 -12
config_files/config_layout.json +48 -0
config_files/experiment_real_targets.json +41 -0
config_files/grid_2obj/generator_grid_2objectives_ense_enseef.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_ense_enself.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_ense_enve.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_ense_rmcv.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_ense_rt10v.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_ense_rvpnot.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enseef_enself.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enseef_enve.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enseef_rmcv.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enseef_rt10v.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enseef_rvpnot.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enself_enve.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enself_rmcv.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enself_rt10v.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enself_rvpnot.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enve_mvo.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enve_rmcv.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enve_rt10v.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enve_rvpnot.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_enve_sam.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_mvo_sam.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_rmcv_rt10v.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_rmcv_rvpnot.json +1 -0
config_files/grid_2obj/generator_grid_2objectives_rt10v_rvpnot.json +1 -0
config_files/options/baseline.json +0 -9
config_files/options/run_params.json +0 -9
config_files/pipeline_steps/augmentation.json +12 -0
config_files/{algorithm → pipeline_steps}/benchmark.json +1 -1
config_files/{algorithm → pipeline_steps}/evaluation_plotter.json +2 -2
config_files/{algorithm → pipeline_steps}/feature_extraction.json +1 -1
config_files/{algorithm → pipeline_steps}/generation.json +0 -0
config_files/{algorithm → test}/experiment_test.json +3 -3
config_files/{algorithm/test → test}/generator_2bpic_2objectives_ense_enseef.json +2 -2
config_files/{algorithm/test → test}/generator_grid_1objectives_rt10v.json +2 -2
config_files/{algorithm/test → test}/generator_grid_2objectives_ense_enself.json +2 -2
data/GenED_bench.csv +0 -0
data/GenED_feat.csv +0 -0
data/grid_1obj/grid_1objectives_ense.csv +0 -12
data/grid_1obj/grid_1objectives_enseef.csv +0 -12
data/grid_1obj/grid_1objectives_enself.csv +0 -12
data/grid_1obj/grid_1objectives_enve.csv +0 -12
data/grid_1obj/grid_1objectives_rmcv.csv +0 -12
data/grid_1obj/grid_1objectives_rt10v.csv +0 -12
data/grid_1obj/grid_1objectives_rutpt.csv +0 -12
data/grid_2obj/{grid_2objectives_enve_rutpt.csv → grid_2objectives_ense_rvpnot.csv} +1 -1
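
Each `grid_2obj` configuration name in the list above pairs two feature abbreviations. As a rough illustration only (this is not the code of the notebook that creates these files), the sketch below enumerates the unordered pairs of seven abbreviations read off the list. The helper name `grid_config_names` is hypothetical. Under that assumption it yields 21 of the 24 listed names; the three `mvo`/`sam` files fall outside this set.

```python
from itertools import combinations

# Feature abbreviations as they appear in the listed file names.
ABBREVIATIONS = ["ense", "enseef", "enself", "enve", "rmcv", "rt10v", "rvpnot"]


def grid_config_names(abbreviations):
    """Return one config file name per unordered pair of abbreviations."""
    return [
        f"generator_grid_2objectives_{first}_{second}.json"
        for first, second in combinations(sorted(abbreviations), 2)
    ]


names = grid_config_names(ABBREVIATIONS)
print(len(names))   # number of pairwise configurations
print(names[0])     # first name in alphabetical pair order
```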

README.md CHANGED

@@ -1,13 +1,35 @@
-# GEDI
 **G**enerating **E**vent **D**ata with **I**ntentional Features for Benchmarking Process Mining
 ## Table of Contents
 - [Requirements](#requirements)
 - [Installation](#installation)
-- [Usage](#usage)
 - [References](#references)
 ## Requirements
 - [Miniconda](https://docs.conda.io/en/latest/miniconda.html)
 - Graphviz on your OS e.g.
@@ -22,29 +44,161 @@ conda install pyrfr swig
 ```
 ## Installation
 - `conda env create -f .conda.yml`
-- Install [Feature Extractor for Event Data (feeed)](https://github.com/lmu-dbs/feeed) in the newly installed conda environment: `pip install feeed`
 ### Startup
 ```console
 conda activate gedi
-python main.py -o config_files/options/baseline.json -a config_files/algorithm/experiment_test.json
 ```
-## Usage
-Our pipeline offers several pipeline steps, which can be run sequentially or partially:
-- feature_extraction
-- generation
-- benchmark
-- evaluation_plotter
-We also include two notebooks, which output experimental results as in our paper.
 To run different steps of the GEDI pipeline, please adapt the `.json` accordingly.
 ```console
 conda activate gedi
-python main.py -o config_files/options/baseline.json -a config_files/algorithm/<pipeline-step>.json
 ```
-For reference of possible keys and values for each step, please see `config_files/algorithm/experiment_test.json`.
-To run the whole pipeline please create a new `.json` file, specifying all steps you want to run and specify desired keys and values for each step.
 ## References
 The framework used by `GEDI` is taken directly from the original paper by [Maldonado](mailto:[email protected]), Frey, Tavares, Rehwald and Seidl. If you would like to discuss the paper, or corresponding research questions on benchmarking process mining tasks please email the authors.

+---
+title: Gedi
+emoji: 🌖
+colorFrom: indigo
+colorTo: blue
+sdk: streamlit
+sdk_version: 1.37.1
+app_file: utils/config_fabric.py
+pinned: false
+license: mit
+---
+<p>
+  <img src="gedi/utils/logo.png" alt="Logo" width="100" align="left" />
+  <h1 style="display: inline;">GEDI</h1>
+</p>
 **G**enerating **E**vent **D**ata with **I**ntentional Features for Benchmarking Process Mining
 ## Table of Contents
+- [Interactive Web Application](#interactive-web-application)
 - [Requirements](#requirements)
 - [Installation](#installation)
+- [General Usage](#general-usage)
+- [Experiments](#experiments)
 - [References](#references)
+## Interactive Web Application
+Our [interactive web application](https://huggingface.co/spaces/andreamalhera/gedi) (iGEDI) guides you through the specification process and runs GEDI for you. You can directly download the resulting generated logs or the configuration file to run GEDI locally.
+![Interface Screenshot](gedi/utils/iGEDI_interface.png)
 ## Requirements
 - [Miniconda](https://docs.conda.io/en/latest/miniconda.html)
 - Graphviz on your OS e.g.
 ```
 ## Installation
 - `conda env create -f .conda.yml`
 ### Startup
 ```console
 conda activate gedi
+python main.py -a config_files/test/experiment_test.json
 ```
+The last step should take only a few minutes to run.
+## General Usage
+Our pipeline offers several steps, which can be run as a full sequence or as a subset:
+- [Feature Extraction](#feature-extraction)
+- [Generation](#generation)
+- [Benchmark](#benchmark)
+- [Evaluation Plotter](https://github.com/lmu-dbs/gedi/blob/16-documentation-update-readme/README.md#evaluation-plotting)
 To run different steps of the GEDI pipeline, please adapt the `.json` accordingly.
 ```console
 conda activate gedi
+python main.py -a config_files/pipeline_steps/<pipeline-step>.json
+```
+For reference of possible keys and values for each step, please see `config_files/test/experiment_test.json`.
+To run the whole pipeline please create a new `.json` file, specifying all steps you want to run and specify desired keys and values for each step.
+To reproduce results from our paper, please refer to [Experiments](#experiments).
+### Feature Extraction
+---
+To extract the features on the event-log level and use them for hyperparameter optimization, we employ the following script:
+```console
+conda activate gedi
+python main.py -a config_files/pipeline_steps/feature_extraction.json
+```
+The JSON file consists of the following key-value pairs:
+- pipeline_step: denotes the current step in the pipeline (here: feature_extraction)
+- input_path: folder to the input files
+- feature_params: defines a dictionary whose 'feature_set' key holds the list of features to extract from the reference files. Valid feature names can be looked up in the FEEED extractor
+- output_path: defines the path, where plots are saved to
+- real_eventlog_path: defines the file with the features extracted from the real event logs
+- plot_type: defines the style of the output plotting (possible values: violinplot, boxplot)
+- font_size: label font size of the output plot
+- boxplot_width: width of the violinplot/boxplot
+### Generation
+---
+After having extracted meta features from the files, the next step is to generate event log data accordingly. Generally, there are two settings on how the targets are defined: i) meta feature targets are defined by the meta features from the real event log data; ii) a configuration space is defined which resembles the feasible meta features space.
+The command to execute the generation step, using an exemplary `generation.json` file, is:
+```console
+conda activate gedi
+python main.py -a config_files/pipeline_steps/generation.json
+```
+In the `generation.json`, we have the following key-value pairs:
+* pipeline_step: denotes the current step in the pipeline (here: event_logs_generation)
+* output_path: defines the output folder
+* generator_params: defines the configuration of the generator. We can set values for 'experiment', 'config_space', 'n_trials', and a 'plot_reference_feature' used for plotting
+    - experiment: defines the path to the input file which contains the features that are used for the optimization step. The 'objectives' define the specific features, which are the optimization criteria.
+    - config_space: here, we define the configuration of the generator module (here: process tree generator). The process tree generator can process input information which defines characteristics for the generated data (a more thorough overview of the params can be found [here](https://github.com/tjouck/PTandLogGenerator)):
+        - mode: most frequent number of visible activities
+        - sequence: the probability of adding a sequence operator to the tree
+        - choice: the probability of adding a choice operator to the tree
+        - parallel: the probability of adding a parallel operator to the tree
+        - loop: the probability of adding a loop operator to the tree
+        - silent: probability to add silent activity to a choice or loop operator
+        - lt_dependency: the probability of adding a random dependency to the tree
+        - num_traces: the number of traces in the event log
+        - duplicate: the probability of duplicating an activity label
+        - or: probability to add an or operator to the tree
+    - n_trials: the maximum number of trials for the hyperparameter optimization to find a feasible solution to the specific configuration being used as the target
+    - plot_reference_feature: defines the feature, which is used on the x-axis on the output plots, i.e., each feature defined in the 'objectives' of the 'experiment' is plotted against the reference feature being defined in this value
+### Benchmark
+The benchmark step defines the downstream task used to evaluate the quality of the synthesized event log datasets against the metrics of real-world datasets. The command to execute a benchmark is shown in the following script:
+```console
+conda activate gedi
+python main.py -a config_files/pipeline_steps/benchmark.json
 ```
+In the `benchmark.json`, we have the following key-value pairs:
+* pipeline_step: denotes the current step in the pipeline (here: benchmark_test)
+* benchmark_task: defines the downstream task. Currently (in v 1.0), only `discovery` for process discovery is implemented
+* input_path: defines the input folder where the synthesized event log data are stored
+* output_path: defines the output folder
+* miners: defines the miners for the downstream task 'discovery' which are used in the benchmarking. In v 1.0 the miners 'inductive' for inductive miner, 'heu' for heuristics miner, 'imf' for inductive miner infrequent, as well as 'ilp' for integer linear programming are implemented
+### Evaluation Plotting
+The evaluation plotting step is used purely for visualization. An example of how the plotter can be used is shown in the following script:
+```console
+conda activate gedi
+python main.py -a config_files/pipeline_steps/evaluation_plotter.json
+```
+Generally, in the `evaluation_plotter.json`, we have the following key-value pairs:
+* pipeline_step: denotes the current step in the pipeline (here: evaluation_plotter)
+* input_path: defines the input file or the input folder which is considered for the visualizations. If a single file is specified, only the features in that file are considered whereas in the case of specifying a folder, the framework iterates over all files and uses them for plotting
+* plot_reference_feature: defines the feature that is used on the x-axis on the output plots, i.e., each feature defined in the input file is plotted against the reference feature being defined in this value
+* targets: defines the target values which are also used as reference. As with input_path, the targets can be specified by a single file or by a folder
+* output_path: defines where to store the plots
+## Experiments
+In this repository, experiments can be run selectively or from scratch, as preferred. For this purpose, we linked both inputs and outputs for each stage. In this section, we present the reproduction of generated event data, as in our paper, as well as the [visualization of evaluation figures](#visualizations).
+We present two settings for generating intentional event logs, using [real targets](#generating-data-with-real-targets) or using [grid targets](#generating-data-with-grid-targets). Both settings output `.xes` event logs, `.json` and `.csv` files containing feature values, as well as evaluation results, from running a [process discovery benchmark](#benchmark), for the generated event logs.
+### Generating data with real targets
+To execute the experiments with real targets, we employ the [experiment_real_targets.json](config_files/experiment_real_targets.json). The script's pipeline will output the [generated event logs (GenBaselineED)](data/event_logs/GenBaselineED), which optimize their feature values towards [real-world event data features](data/BaselineED_feat.csv), alongside their respectively measured [feature values](data/GenBaselineED_feat.csv) and [benchmark metrics values](data/GenBaselineED_bench.csv).
+```console
+conda activate gedi
+python main.py -a config_files/experiment_real_targets.json
+```
+### Generating data with grid targets
+To execute the experiments with grid targets, a single [configuration](config_files/grid_2obj) can be selected or all [grid objectives](data/grid_2obj) can be run with one command using the following script. This script will output the [generated event logs (GenED)](data/event_logs/GenED), alongside their respectively measured [feature values](data/GenED_feat.csv) and [benchmark metrics values](data/GenED_bench.csv).
+```
+conda activate gedi
+python execute_grid_experiments.py config_files/grid_2obj
+```
+We employ the [experiment_grid_2obj_configfiles_fabric.ipynb](notebooks/experiment_grid_2obj_configfiles_fabric.ipynb) to create all necessary [configuration](config_files/grid_2obj) and [objective](data/grid_2obj) files for this experiment.
+For more details about these config_files, please refer to [Feature Extraction](#feature-extraction), [Generation](#generation), and [Benchmark](#benchmark).
+To create configuration files for grid objectives interactively, you can start the following dashboard:
+```
+streamlit run utils/config_fabric.py # To tunnel to local machine add: --server.port 8501 --server.headless true
+# In local machine (only in case you are tunneling):
+ssh -N -f -L 9000:localhost:8501 <user@remote_machine.com>
+open "http://localhost:9000/"
+```
+### Visualizations
+To run the visualizations, we employ [jupyter notebooks](https://jupyter.org/install) and [add the installed environment to the jupyter notebook](https://medium.com/@nrk25693/how-to-add-your-conda-environment-to-your-jupyter-notebook-in-just-4-steps-abeab8b8d084). We then start all visualizations by running e.g.: `jupyter notebook`. In the following, we describe the `.ipynb` files in the `notebooks` folder to reproduce the figures from our paper.
+#### [Fig. 4 and fig. 5 Representativeness](notebooks/gedi_figs4and5_representativeness.ipynb)
+To visualize the coverage of the feasible feature space of generated event logs compared to existing real-world benchmark datasets, in this notebook, we conduct a principal component analysis on the features of both settings. The first two principal components are utilized to visualize the coverage, which is further highlighted by computing a convex hull of the 2D mapping. Additionally, we visualize the distribution of each meta feature we used in the paper as a boxplot. Additional features can be extracted with FEEED. The notebook thus contains Figures 4 and 5 of the paper.
+#### [Fig. 6 Benchmark Boxplots](notebooks/gedi_fig6_benchmark_boxplots.ipynb)
+This notebook visualizes the metric distribution of real event logs compared to the generated ones. It shows five metrics for three process discovery techniques. We use 'fitness', 'precision', 'fscore', 'size', and 'cfc' (control-flow complexity) as metrics and 'heuristic miner', 'ilp' (integer linear programming), and 'imf' (inductive miner infrequent) as miners. The notebook outputs the visualization shown in Fig. 6 in the paper.
+#### [Fig. 7 and fig. 8 Benchmark's Statistical Tests](notebooks/gedi_figs7and8_benchmarking_statisticalTests.ipynb)
+This notebook examines whether there is a statistically significant relation between feature similarity and performance metrics for the downstream task of process discovery. For that, we compute the Pearson coefficient as well as Kendall's tau coefficient. This elucidates the correlation between the features and the metric scores used for process discovery. Each coefficient is calculated for three different settings: i) real-world datasets; ii) synthesized event log data with real-world targets; iii) synthesized event log data with grid objectives. Figures 7 and 8 shown in the paper refer to this notebook.
+#### [Fig. 9 Consistency and fig. 10 Limitations](notebooks/gedi_figs9and10_consistency.ipynb)
+Similar to the evaluation of the statistical tests in notebook `gedi_figs7and8_benchmarking_statisticalTests.ipynb`, this notebook computes the difference between two correlation matrices, $\Delta C = C_1 - C_2$. This logic is employed to evaluate and visualize the distance between two correlation matrices. Furthermore, we show how significant scores are retained from the correlations evaluated on real-world datasets compared to synthesized event log datasets with real-world targets. The results of the notebook are shown in Fig. 9 and 10 in the paper.
 ## References
 The framework used by `GEDI` is taken directly from the original paper by [Maldonado](mailto:[email protected]), Frey, Tavares, Rehwald and Seidl. If you would like to discuss the paper, or corresponding research questions on benchmarking process mining tasks please email the authors.
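
The updated README asks readers to create a new `.json` file listing every step they want to run. A minimal sketch of assembling such a file in Python follows, using only keys and values that appear in the configuration files of this commit. The helper name `build_pipeline_config`, the choice of two objectives, and the `output` input path for feature extraction are illustrative assumptions, not taken from the repository.

```python
import json


def build_pipeline_config():
    """Return a two-step pipeline: generation followed by feature extraction."""
    objectives = ["ratio_most_common_variant", "ratio_top_10_variants"]
    generation = {
        "pipeline_step": "event_logs_generation",
        "output_path": "output",
        "generator_params": {
            "experiment": {
                "input_path": "data/BaselineED_feat.csv",
                "objectives": objectives,
            },
            # Ranges copied from the config files shown in this commit.
            "config_space": {
                "mode": [5, 20],
                "sequence": [0.01, 1],
                "choice": [0.01, 1],
                "parallel": [0.01, 1],
                "loop": [0.01, 1],
                "silent": [0.01, 1],
                "lt_dependency": [0.01, 1],
                "num_traces": [10, 100],
                "duplicate": [0],
                "or": [0],
            },
            "n_trials": 50,
        },
    }
    feature_extraction = {
        "pipeline_step": "feature_extraction",
        "input_path": "output",  # assumed location of the generated logs
        "feature_params": {"feature_set": objectives},
        "output_path": "output/plots",
    }
    return [generation, feature_extraction]


config_text = json.dumps(build_pipeline_config(), indent=2)
print(config_text)
```

Saving the printed text as a `.json` file and passing its path via `python main.py -a <path>` follows the invocation pattern the README describes.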

config.py CHANGED

@@ -1,10 +1,8 @@
 import json
-import os
 import warnings
-from gedi.utils.io_helpers import sort_files
-from tqdm import tqdm
-from utils.param_keys import INPUT_NAME, FILENAME, FOLDER_PATH, PARAMS
 def get_model_params_list(alg_json_file: str) :#-> list[dict]:
     """
@@ -20,69 +18,8 @@ def get_model_params_list(alg_json_file: str) :#-> list[dict]:
         warnings.warn('The default model parameter list is used instead of a .json-file.\n'
                       '  Use a configuration from the `config_files`-folder together with the args `-a`.')
         return [
-            {ALGORITHM_NAME: 'pca', NDIM: TENSOR_NDIM},
             ]
-def get_run_params(alg_params_json: str) -> dict:
-    """
-    Loads the running configuration given from a json file or the default dictionary from the code.
-    @param alg_params_json: str
-        Path to the json data with the running configuration
-    @return: dict
-        Running Configuration
-    """
-    if alg_params_json is not None:
-        return json.load(open(alg_params_json))
-    else:
-        warnings.warn('The default run option is used instead of a .json-file.\n'
-                      '  Use a configuration from the `config_files`-folder together with the args `-o`.')
-        return {
-            RUN_OPTION: COMPARE,
-            PLOT_TYPE: COLOR_MAP,  # 'heat_map', 'color_map', '3d_map', 'explained_var_plot'
-            PLOT_TICS: True,
-            N_COMPONENTS: 2,
-            INPUT_NAME: 'runningExample',
-            SAVE_RESULTS: True,
-            LOAD_RESULTS: True
-        }
-def get_files_and_kwargs(params: dict):
-    """
-    This method returns the filename list of the trajectory and generates the kwargs for the DataTrajectory.
-    The method is individually created for the available data set.
-    Add new trajectory options, if different data set are used.
-    @param params: dict
-        running configuration
-    @return: tuple
-        list of filenames of the trajectories AND
-        kwargs with the important arguments for the classes
-    """
-    try:
-        input_name = params[INPUT_NAME]
-    except KeyError as e:
-        raise KeyError(f'Run option parameter is missing the key: `{e}`. This parameter is mandatory.')
-    #TODO: generate parent directories if they don't exist
-    if input_name == 'test':
-        filename_list = list(tqdm(sort_files(os.listdir('data/test'))))
-        kwargs = {FILENAME: filename_list, FOLDER_PATH: 'data/test'}
-    elif input_name == 'realLogs':
-        filename_list = list(tqdm(sort_files(os.listdir('data/real_event_logs'))))
-        kwargs = {FILENAME: filename_list, FOLDER_PATH: 'data/real_event_logs'}
-    elif input_name == 'gen5':
-        filename_list = list(tqdm(sort_files(os.listdir('data/event_log'))))[:5]
-        kwargs = {FILENAME: filename_list, FOLDER_PATH: 'data/event_log'}
-    elif input_name == 'gen20':
-        filename_list = list(tqdm(sort_files(os.listdir('data/event_log'))))[:20]
-        kwargs = {FILENAME: filename_list, FOLDER_PATH: 'data/event_log'}
-    elif input_name == 'runningExample':
-        filename_list = ['running-example.xes']
-        kwargs = {FILENAME: filename_list[0], FOLDER_PATH: 'data/'}
-    elif input_name == 'metaFeatures':
-        filename_list = ['log_features.csv']
-        kwargs = {FILENAME: filename_list[0], FOLDER_PATH: 'results/'}
-    else:
-        raise ValueError(f'No data trajectory was found with the name `{input_name}`.')
-    #filename_list.pop(file_element)
-    kwargs[PARAMS] = params
-    return filename_list, kwargs

 import json
 import warnings
+from utils.param_keys import PIPELINE_STEP, INPUT_PATH, OUTPUT_PATH
+from utils.param_keys.features import FEATURE_SET, FEATURE_PARAMS
 def get_model_params_list(alg_json_file: str) :#-> list[dict]:
     """
         warnings.warn('The default model parameter list is used instead of a .json-file.\n'
                       '  Use a configuration from the `config_files`-folder together with the args `-a`.')
         return [
+            {PIPELINE_STEP: 'feature_extraction', INPUT_PATH: 'data/test',
+             FEATURE_PARAMS: {FEATURE_SET: ['ratio_unique_traces_per_trace',
+                                            'ratio_most_common_variant']},
+             OUTPUT_PATH: 'output/plots'}
             ]
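
The fallback pattern visible in this file (load the `.json` file if a path is given, otherwise warn and return a default step list) can be sketched as below. This is an illustration rather than the repository's function: the name `load_steps` is hypothetical, and the string keys assume that constants such as `PIPELINE_STEP` and `FEATURE_PARAMS` map to the key names used in the JSON configuration files.

```python
import json
import warnings

# Default mirroring the fallback entry added in this diff (string keys assumed).
DEFAULT_STEPS = [
    {
        "pipeline_step": "feature_extraction",
        "input_path": "data/test",
        "feature_params": {
            "feature_set": [
                "ratio_unique_traces_per_trace",
                "ratio_most_common_variant",
            ]
        },
        "output_path": "output/plots",
    }
]


def load_steps(json_path=None):
    """Load pipeline steps from a JSON file, or fall back to DEFAULT_STEPS."""
    if json_path is not None:
        # A context manager closes the file handle after reading.
        with open(json_path, encoding="utf-8") as handle:
            return json.load(handle)
    warnings.warn("No .json file given; using the default pipeline steps.")
    return DEFAULT_STEPS
```

Calling `load_steps()` without an argument emits the warning and returns the single default feature-extraction step.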

config_files/algorithm/augmentation.json DELETED

@@ -1,12 +0,0 @@
-[
-  {
-    "pipeline_step": "instance_augmentation",
-    "augmentation_params":
-    {
-        "method":"SMOTE", "no_samples":20,
-        "feature_selection": ["n_traces", "n_unique_traces", "ratio_unique_traces_per_trace", "trace_len_min", "trace_len_max", "trace_len_mean", "trace_len_median", "trace_len_mode", "trace_len_std", "trace_len_variance", "trace_len_q1", "trace_len_q3", "trace_len_iqr", "trace_len_geometric_mean", "trace_len_geometric_std", "trace_len_harmonic_mean", "trace_len_skewness", "trace_len_kurtosis", "trace_len_coefficient_variation", "trace_len_entropy", "trace_len_hist1", "trace_len_hist2", "trace_len_hist3", "trace_len_hist4", "trace_len_hist5", "trace_len_hist6", "trace_len_hist7", "trace_len_hist8", "trace_len_hist9", "trace_len_hist10", "trace_len_skewness_hist", "trace_len_kurtosis_hist", "ratio_most_common_variant", "ratio_top_1_variants", "ratio_top_5_variants", "ratio_top_10_variants", "ratio_top_20_variants", "ratio_top_50_variants", "ratio_top_75_variants", "mean_variant_occurrence", "std_variant_occurrence", "skewness_variant_occurrence", "kurtosis_variant_occurrence", "n_unique_activities", "activities_min", "activities_max", "activities_mean", "activities_median", "activities_std", "activities_variance", "activities_q1", "activities_q3", "activities_iqr", "activities_skewness", "activities_kurtosis", "n_unique_start_activities", "start_activities_min", "start_activities_max", "start_activities_mean", "start_activities_median", "start_activities_std", "start_activities_variance", "start_activities_q1", "start_activities_q3", "start_activities_iqr", "start_activities_skewness", "start_activities_kurtosis", "n_unique_end_activities", "end_activities_min", "end_activities_max", "end_activities_mean", "end_activities_median", "end_activities_std", "end_activities_variance", "end_activities_q1", "end_activities_q3", "end_activities_iqr", "end_activities_skewness", "end_activities_kurtosis", "entropy_trace", "entropy_prefix", "entropy_global_block", "entropy_lempel_ziv", "entropy_k_block_diff_1", "entropy_k_block_diff_3", "entropy_k_block_diff_5", 
"entropy_k_block_ratio_1", "entropy_k_block_ratio_3", "entropy_k_block_ratio_5", "entropy_knn_3", "entropy_knn_5", "entropy_knn_7", "epa_variant_entropy", "epa_normalized_variant_entropy", "epa_sequence_entropy", "epa_normalized_sequence_entropy", "epa_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_linear_forgetting", "epa_sequence_entropy_exponential_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]
-    },
-    "input_path": "data/bpic_features.csv",
-    "output_path": "output"
-  }
-]

config_files/config_layout.json ADDED

@@ -0,0 +1,48 @@
+[
+  {
+    "pipeline_step": "instance_augmentation",
+    "augmentation_params":{"method":"SMOTE", "no_samples":2,
+        "feature_selection": ["ratio_top_20_variants", "epa_normalized_sequence_entropy_linear_forgetting"]},
+    "input_path": "data/test/bpic_features.csv",
+    "output_path": "output"
+  },
+  {
+    "pipeline_step": "event_logs_generation",
+    "output_path": "output/features/2_bpic_features/2_ense_rmcv_feat.csv",
+    "output_path": "data/frontend/test",
+    "generator_params": {
+      "experiment": "data/grid_objectives.csv",
+      "experiment": {"input_path": "data/2_bpic_features.csv",
+        "objectives": ["ratio_top_20_variants", "epa_normalized_sequence_entropy_linear_forgetting"]},
+      "experiment": {"n_traces":832, "n_unique_traces":828, "ratio_variants_per_number_of_traces":0.99, "trace_len_min":1, "trace_len_max":132, "trace_len_mean":53.31, "trace_len_median":54, "trace_len_mode":61, "trace_len_std":19.89, "trace_len_variance":395.81, "trace_len_q1":44, "trace_len_q3":62, "trace_len_iqr":18, "trace_len_geometric_mean":48.15, "trace_len_geometric_std":1.69, "trace_len_harmonic_mean":37.58, "trace_len_skewness":0.0541, "trace_len_kurtosis":0.81, "trace_len_coefficient_variation":0.37, "trace_len_entropy":6.65, "trace_len_hist1":0.004, "trace_len_hist2":0.005, "trace_len_hist3":0.005, "trace_len_hist4":0.024, "trace_len_hist5":0.024, "trace_len_hist6":0.008, "trace_len_hist7":0.005, "trace_len_hist8":0.001, "trace_len_hist9":0.0, "trace_len_hist10":0.00, "trace_len_skewness_hist":0.05, "trace_len_kurtosis_hist":0.8, "ratio_most_common_variant":0.0, "ratio_top_1_variants":0.01, "ratio_top_5_variants":0.05, "ratio_top_10_variants":0.10, "ratio_top_20_variants":0.2, "ratio_top_50_variants":0.5, "ratio_top_75_variants":0.75, "mean_variant_occurrence":1.0, "std_variant_occurrence":0.07, "skewness_variant_occurrence":14.28, "kurtosis_variant_occurrence":202.00, "n_unique_activities":410, "activities_min":1, "activities_max":830, "activities_mean":108.18, "activities_median":12, "activities_std":187.59, "activities_variance":35189, "activities_q1":3, "activities_q3":125, "activities_iqr":122, "activities_skewness":2.13, "activities_kurtosis":3.81, "n_unique_start_activities":14, "start_activities_min":1, "start_activities_max":731, "start_activities_mean":59.43, "start_activities_median":1, "start_activities_std":186.72, "start_activities_variance":34863, "start_activities_q1":1, "start_activities_q3":8, "start_activities_iqr":7, "start_activities_skewness":3, "start_activities_kurtosis":9.0, "n_unique_end_activities":82, "end_activities_min":1, "end_activities_max":216, "end_activities_mean":10, "end_activities_median":1, 
"end_activities_std":35, "end_activities_variance":1247, "end_activities_q1":1, "end_activities_q3":3, "end_activities_iqr":2, "end_activities_skewness":5, "end_activities_kurtosis":26, "eventropy_trace":10, "eventropy_prefix":15, "eventropy_global_block":19, "eventropy_lempel_ziv":4, "eventropy_k_block_diff_1":7.1, "eventropy_k_block_diff_3":7.1, "eventropy_k_block_diff_5":7.1, "eventropy_k_block_ratio_1":7.1, "eventropy_k_block_ratio_3":7.1, "eventropy_k_block_ratio_5":7.1, "eventropy_knn_3":5.54, "eventropy_knn_5":5.04, "eventropy_knn_7":4.72, "epa_variant_entropy":240512, "epa_normalized_variant_entropy":0.68, "epa_sequence_entropy":285876, "epa_normalized_sequence_entropy":0.60, "epa_sequence_entropy_linear_forgetting":150546, "epa_normalized_sequence_entropy_linear_forgetting":0.32, "epa_sequence_entropy_exponential_forgetting":185312, "epa_normalized_sequence_entropy_exponential_forgetting":0.39},
+      "config_space": {
+        "mode": [5, 20],
+        "sequence": [0.01, 1],
+        "choice": [0.01, 1],
+        "parallel": [0.01, 1],
+        "loop": [0.01, 1],
+        "silent": [0.01, 1],
+        "lt_dependency": [0.01, 1],
+        "num_traces": [10, 100],
+        "duplicate": [0],
+        "or": [0]
+      },
+      "n_trials": 50
+    }
+  },
+  {
+    "pipeline_step": "feature_extraction",
+    "input_path": "data/test",
+    "feature_params": {"feature_set": ["n_traces", "n_unique_traces", "ratio_unique_traces_per_trace", "trace_len_min", "trace_len_max", "trace_len_mean", "trace_len_median", "trace_len_mode", "trace_len_std", "trace_len_variance", "trace_len_q1", "trace_len_q3", "trace_len_iqr", "trace_len_geometric_mean", "trace_len_geometric_std", "trace_len_harmonic_mean", "trace_len_skewness", "trace_len_kurtosis", "trace_len_coefficient_variation", "trace_len_entropy", "trace_len_hist1", "trace_len_hist2", "trace_len_hist3", "trace_len_hist4", "trace_len_hist5", "trace_len_hist6", "trace_len_hist7", "trace_len_hist8", "trace_len_hist9", "trace_len_hist10", "trace_len_skewness_hist", "trace_len_kurtosis_hist", "ratio_most_common_variant", "ratio_top_1_variants", "ratio_top_5_variants", "ratio_top_10_variants", "ratio_top_20_variants", "ratio_top_50_variants", "ratio_top_75_variants", "mean_variant_occurrence", "std_variant_occurrence", "skewness_variant_occurrence", "kurtosis_variant_occurrence", "n_unique_activities", "activities_min", "activities_max", "activities_mean", "activities_median", "activities_std", "activities_variance", "activities_q1", "activities_q3", "activities_iqr", "activities_skewness", "activities_kurtosis", "n_unique_start_activities", "start_activities_min", "start_activities_max", "start_activities_mean", "start_activities_median", "start_activities_std", "start_activities_variance", "start_activities_q1", "start_activities_q3", "start_activities_iqr", "start_activities_skewness", "start_activities_kurtosis", "n_unique_end_activities", "end_activities_min", "end_activities_max", "end_activities_mean", "end_activities_median", "end_activities_std", "end_activities_variance", "end_activities_q1", "end_activities_q3", "end_activities_iqr", "end_activities_skewness", "end_activities_kurtosis", "eventropy_trace", "eventropy_prefix", "eventropy_prefix_flattened", "eventropy_global_block", "eventropy_global_block_flattened", "eventropy_lempel_ziv", 
"eventropy_lempel_ziv_flattened", "eventropy_k_block_diff_1", "eventropy_k_block_diff_3", "eventropy_k_block_diff_5", "eventropy_k_block_ratio_1", "eventropy_k_block_ratio_3", "eventropy_k_block_ratio_5", "eventropy_knn_3", "eventropy_knn_5", "eventropy_knn_7", "epa_variant_entropy", "epa_normalized_variant_entropy", "epa_sequence_entropy", "epa_normalized_sequence_entropy", "epa_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_linear_forgetting", "epa_sequence_entropy_exponential_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]},
+    "output_path": "output/plots",
+    "real_eventlog_path": "data/BaselineED_feat.csv",
+    "plot_type": "boxplot"
+  },
+  {
+    "pipeline_step": "benchmark_test",
+    "benchmark_task": "discovery",
+    "input_path":"data/test",
+    "output_path":"output",
+    "miners" : ["inductive", "heu", "imf", "ilp"]
+  }
+]

config_files/experiment_real_targets.json ADDED

	@@ -0,0 +1,41 @@

+[
+   {
+    "pipeline_step": "event_logs_generation",
+    "output_path": "output",
+    "generator_params": {
+      "experiment": {
+        "input_path": "data/BaselineED_feat.csv",
+        "objectives":["ratio_variants_per_number_of_traces","ratio_most_common_variant","ratio_top_10_variants","epa_normalized_variant_entropy","epa_normalized_sequence_entropy","epa_normalized_sequence_entropy_linear_forgetting","epa_normalized_sequence_entropy_exponential_forgetting"]},
+        "config_space": {
+        "mode": [5, 20],
+        "sequence": [0.01, 1],
+        "choice": [0.01, 1],
+        "parallel": [0.01, 1],
+        "loop": [0.01, 1],
+        "silent": [0.01, 1],
+        "lt_dependency": [0.01, 1],
+        "num_traces": [10, 10001],
+        "duplicate": [0],
+        "or": [0]
+      },
+      "n_trials": 200,
+      "plot_reference_feature": ""
+    }
+  },
+  {
+    "pipeline_step": "feature_extraction",
+    "input_path": "output/BaselineED_feat/7_ense_enseef_enself_enve_rmcv_rt10v_rutpt/",
+    "input_path": "output/features/BaselineED_feat/7_ense_enseef_enself_enve_rmcv_rt10v_rutpt/",
+    "feature_params": {"feature_set":["ratio_variants_per_number_of_traces","ratio_most_common_variant","ratio_top_10_variants","epa_normalized_variant_entropy","epa_normalized_sequence_entropy","epa_normalized_sequence_entropy_linear_forgetting","epa_normalized_sequence_entropy_exponential_forgetting"]},
+    "output_path": "output/plots",
+    "real_eventlog_path": "data/BaselineED_feat.csv",
+    "plot_type": "boxplot"
+  },
+  {
+    "pipeline_step": "benchmark_test",
+    "benchmark_test": "discovery",
+    "input_path": "output/BaselineED_feat/7_ense_enseef_enself_enve_rmcv_rt10v_rutpt/",
+    "output_path":"output",
+    "miners" : ["heu", "imf", "ilp"]
+  }
+]
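
The `feature_extraction` step in this file lists `"input_path"` twice. Python's `json` module keeps only the last occurrence of a repeated key and reports nothing, so a mistake like this is easy to miss. The following is a minimal sketch of a stricter loader, not the project's own code: the names `reject_duplicate_keys` and `load_pipeline_steps` are hypothetical, and it assumes only what the diff shows, namely that a config file is a JSON array of step objects.

```python
import json
from collections import Counter


def reject_duplicate_keys(pairs):
    """object_pairs_hook that refuses JSON objects with repeated keys."""
    counts = Counter(key for key, _ in pairs)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate keys in JSON object: {duplicates}")
    return dict(pairs)


def load_pipeline_steps(text):
    """Parse a config document (a JSON array of step objects).

    Hypothetical helper: the hook runs for every object, nested ones
    included, so a repeated key at any depth raises ValueError.
    """
    return json.loads(text, object_pairs_hook=reject_duplicate_keys)


if __name__ == "__main__":
    sample = '[{"pipeline_step": "feature_extraction", "input_path": "a", "input_path": "b"}]'
    try:
        load_pipeline_steps(sample)
    except ValueError as err:
        print(err)
```

Running a check like this over `config_files/` before starting a pipeline would surface the repeated key rather than letting the second path override the first.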

config_files/grid_2obj/generator_grid_2objectives_ense_enseef.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_ense_enseef.csv", "objectives": ["epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_ense_enseef/2_ense_enseef", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_ense_enseef/2_ense_enseef", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]
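
The short codes in the `grid_2obj` file and directory names (`ense`, `enseef`, `rt10v`, ...) appear to be built from the objective names: the first letter of each underscore-separated token, with numeric tokens kept whole, joined in sorted order and prefixed with the number of objectives. The sketch below reproduces that pattern. It is inferred from the names visible in this diff only; `abbreviate` and `experiment_dir` are hypothetical names, and the project may derive the codes differently.

```python
def abbreviate(feature_name):
    """Shorten a feature name: first letter per token, digits kept whole.

    Assumption inferred from the diff, e.g. "ratio_top_10_variants" -> "rt10v".
    """
    tokens = [tok for tok in feature_name.split("_") if tok]
    return "".join(tok if tok.isdigit() else tok[0] for tok in tokens)


def experiment_dir(objectives):
    """Build a directory name such as "2_ense_enseef" from objective names."""
    codes = sorted(abbreviate(name) for name in objectives)
    return f"{len(codes)}_{'_'.join(codes)}"


if __name__ == "__main__":
    objectives = [
        "epa_normalized_sequence_entropy",
        "epa_normalized_sequence_entropy_exponential_forgetting",
    ]
    print(experiment_dir(objectives))
```

Under this assumption, the `input_path` of the later `feature_extraction` and `benchmark_test` steps can be checked against the `objectives` list of the generation step.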

config_files/grid_2obj/generator_grid_2objectives_ense_enself.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_ense_enself.csv", "objectives": ["epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_ense_enself/2_ense_enself", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_ense_enself/2_ense_enself", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_ense_enve.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_ense_enve.csv", "objectives": ["epa_normalized_sequence_entropy", "epa_normalized_variant_entropy"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_ense_enve/2_ense_enve", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_ense_enve/2_ense_enve", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_ense_rmcv.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_ense_rmcv.csv", "objectives": ["epa_normalized_sequence_entropy", "ratio_most_common_variant"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_ense_rmcv/2_ense_rmcv", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_ense_rmcv/2_ense_rmcv", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_ense_rt10v.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_ense_rt10v.csv", "objectives": ["epa_normalized_sequence_entropy", "ratio_top_10_variants"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_ense_rt10v/2_ense_rt10v", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_ense_rt10v/2_ense_rt10v", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_ense_rvpnot.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_ense_rvpnot.csv", "objectives": ["epa_normalized_sequence_entropy", "ratio_variants_per_number_of_traces"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_ense_rvpnot/2_ense_rvpnot", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_ense_rvpnot/2_ense_rvpnot", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enseef_enself.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enseef_enself.csv", "objectives": ["epa_normalized_sequence_entropy_exponential_forgetting", "epa_normalized_sequence_entropy_linear_forgetting"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enseef_enself/2_enseef_enself", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enseef_enself/2_enseef_enself", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enseef_enve.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enseef_enve.csv", "objectives": ["epa_normalized_sequence_entropy_exponential_forgetting", "epa_normalized_variant_entropy"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enseef_enve/2_enseef_enve", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enseef_enve/2_enseef_enve", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enseef_rmcv.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enseef_rmcv.csv", "objectives": ["epa_normalized_sequence_entropy_exponential_forgetting", "ratio_most_common_variant"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enseef_rmcv/2_enseef_rmcv", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enseef_rmcv/2_enseef_rmcv", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enseef_rt10v.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enseef_rt10v.csv", "objectives": ["epa_normalized_sequence_entropy_exponential_forgetting", "ratio_top_10_variants"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enseef_rt10v/2_enseef_rt10v", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enseef_rt10v/2_enseef_rt10v", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enseef_rvpnot.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enseef_rvpnot.csv", "objectives": ["epa_normalized_sequence_entropy_exponential_forgetting", "ratio_variants_per_number_of_traces"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enseef_rvpnot/2_enseef_rvpnot", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enseef_rvpnot/2_enseef_rvpnot", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enself_enve.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enself_enve.csv", "objectives": ["epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_variant_entropy"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enself_enve/2_enself_enve", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enself_enve/2_enself_enve", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enself_rmcv.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enself_rmcv.csv", "objectives": ["epa_normalized_sequence_entropy_linear_forgetting", "ratio_most_common_variant"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enself_rmcv/2_enself_rmcv", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enself_rmcv/2_enself_rmcv", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enself_rt10v.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enself_rt10v.csv", "objectives": ["epa_normalized_sequence_entropy_linear_forgetting", "ratio_top_10_variants"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enself_rt10v/2_enself_rt10v", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enself_rt10v/2_enself_rt10v", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enself_rvpnot.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enself_rvpnot.csv", "objectives": ["epa_normalized_sequence_entropy_linear_forgetting", "ratio_variants_per_number_of_traces"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enself_rvpnot/2_enself_rvpnot", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enself_rvpnot/2_enself_rvpnot", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enve_mvo.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/shaining/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enve_mvo.csv", "objectives": ["epa_normalized_variant_entropy", "mean_variant_occurrence"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/shaining/grid_2obj/grid_2objectives_enve_mvo/2_enve_mvo", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/shaining/grid_2obj/grid_2objectives_enve_mvo/2_enve_mvo", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enve_rmcv.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enve_rmcv.csv", "objectives": ["epa_normalized_variant_entropy", "ratio_most_common_variant"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enve_rmcv/2_enve_rmcv", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enve_rmcv/2_enve_rmcv", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enve_rt10v.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enve_rt10v.csv", "objectives": ["epa_normalized_variant_entropy", "ratio_top_10_variants"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enve_rt10v/2_enve_rt10v", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enve_rt10v/2_enve_rt10v", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enve_rvpnot.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enve_rvpnot.csv", "objectives": ["epa_normalized_variant_entropy", "ratio_variants_per_number_of_traces"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_enve_rvpnot/2_enve_rvpnot", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_enve_rvpnot/2_enve_rvpnot", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_enve_sam.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/shaining/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_enve_sam.csv", "objectives": ["epa_normalized_variant_entropy", "start_activities_median"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/shaining/grid_2obj/grid_2objectives_enve_sam/2_enve_sam", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/shaining/grid_2obj/grid_2objectives_enve_sam/2_enve_sam", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_mvo_sam.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/shaining/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_mvo_sam.csv", "objectives": ["mean_variant_occurrence", "start_activities_median"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/shaining/grid_2obj/grid_2objectives_mvo_sam/2_mvo_sam", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/shaining/grid_2obj/grid_2objectives_mvo_sam/2_mvo_sam", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_rmcv_rt10v.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_rmcv_rt10v.csv", "objectives": ["ratio_most_common_variant", "ratio_top_10_variants"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_rmcv_rt10v/2_rmcv_rt10v", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_rmcv_rt10v/2_rmcv_rt10v", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_rmcv_rvpnot.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_rmcv_rvpnot.csv", "objectives": ["ratio_most_common_variant", "ratio_variants_per_number_of_traces"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_rmcv_rvpnot/2_rmcv_rvpnot", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_rmcv_rvpnot/2_rmcv_rvpnot", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]

config_files/grid_2obj/generator_grid_2objectives_rt10v_rvpnot.json ADDED

	@@ -0,0 +1 @@

+ [{"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj", "generator_params": {"experiment": {"input_path": "data/grid_2obj/grid_2objectives_rt10v_rvpnot.csv", "objectives": ["ratio_top_10_variants", "ratio_variants_per_number_of_traces"]}, "config_space": {"mode": [5, 20], "sequence": [0.01, 1], "choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1], "silent": [0.01, 1], "lt_dependency": [0.01, 1], "num_traces": [10, 10001], "duplicate": [0], "or": [0]}, "n_trials": 200}}, {"pipeline_step": "feature_extraction", "input_path": "output/features/generated/grid_2obj/grid_2objectives_rt10v_rvpnot/2_rt10v_rvpnot", "feature_params": {"feature_set": ["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]}, "output_path": "output/plots", "real_eventlog_path": "data/BaselineED_feat.csv", "plot_type": "boxplot"}, {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "input_path": "output/generated/grid_2obj/grid_2objectives_rt10v_rvpnot/2_rt10v_rvpnot", "output_path": "output", "miners": ["heu", "imf", "ilp"]}]
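
The generator configs added above share one shape: a JSON array of step objects, each tagged with a `pipeline_step` key. A minimal sketch for checking a config's step order before running it, assuming only that shape (the helper `list_pipeline_steps` and the trimmed inline sample are illustrative and not part of the repository):

```python
import json

# Trimmed sample mirroring the structure of the grid_2obj configs above;
# most keys are omitted for brevity.
SAMPLE_CONFIG = """
[
  {"pipeline_step": "event_logs_generation", "output_path": "output/generated/grid_2obj"},
  {"pipeline_step": "feature_extraction", "output_path": "output/plots"},
  {"pipeline_step": "benchmark_test", "benchmark_test": "discovery", "miners": ["heu", "imf", "ilp"]}
]
"""


def list_pipeline_steps(config_text):
    """Return the pipeline_step names in file order (hypothetical helper)."""
    steps = json.loads(config_text)
    if not isinstance(steps, list):
        raise ValueError("expected a JSON array of step objects")
    return [step["pipeline_step"] for step in steps]


print(list_pipeline_steps(SAMPLE_CONFIG))
```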

config_files/options/baseline.json DELETED

@@ -1,9 +0,0 @@
-{
-  "run_option": "baseline",
-  "plot_type": "color_map",
-  "plot_tics": true,
-  "n_components": 2,
-  "input_name": "test",
-  "save_results": false,
-  "load_results": false
-}

config_files/options/run_params.json DELETED

@@ -1,9 +0,0 @@
-{
-  "run_option": "compare",
-  "plot_type": "color_map",
-  "plot_tics": true,
-  "n_components": 2,
-  "input_name": "gen20",
-  "save_results": false,
-  "load_results": true
-}

config_files/pipeline_steps/augmentation.json ADDED

	@@ -0,0 +1,12 @@

+[
+  {
+    "pipeline_step": "instance_augmentation",
+    "augmentation_params":
+    {
+        "method":"SMOTE", "no_samples":20,
+        "feature_selection": ["n_traces", "n_unique_traces", "ratio_variants_per_number_of_traces", "trace_len_min", "trace_len_max", "trace_len_mean", "trace_len_median", "trace_len_mode", "trace_len_std", "trace_len_variance", "trace_len_q1", "trace_len_q3", "trace_len_iqr", "trace_len_geometric_mean", "trace_len_geometric_std", "trace_len_harmonic_mean", "trace_len_skewness", "trace_len_kurtosis", "trace_len_coefficient_variation", "trace_len_entropy", "trace_len_hist1", "trace_len_hist2", "trace_len_hist3", "trace_len_hist4", "trace_len_hist5", "trace_len_hist6", "trace_len_hist7", "trace_len_hist8", "trace_len_hist9", "trace_len_hist10", "trace_len_skewness_hist", "trace_len_kurtosis_hist", "ratio_most_common_variant", "ratio_top_1_variants", "ratio_top_5_variants", "ratio_top_10_variants", "ratio_top_20_variants", "ratio_top_50_variants", "ratio_top_75_variants", "mean_variant_occurrence", "std_variant_occurrence", "skewness_variant_occurrence", "kurtosis_variant_occurrence", "n_unique_activities", "activities_min", "activities_max", "activities_mean", "activities_median", "activities_std", "activities_variance", "activities_q1", "activities_q3", "activities_iqr", "activities_skewness", "activities_kurtosis", "n_unique_start_activities", "start_activities_min", "start_activities_max", "start_activities_mean", "start_activities_median", "start_activities_std", "start_activities_variance", "start_activities_q1", "start_activities_q3", "start_activities_iqr", "start_activities_skewness", "start_activities_kurtosis", "n_unique_end_activities", "end_activities_min", "end_activities_max", "end_activities_mean", "end_activities_median", "end_activities_std", "end_activities_variance", "end_activities_q1", "end_activities_q3", "end_activities_iqr", "end_activities_skewness", "end_activities_kurtosis", "entropy_trace", "entropy_prefix", "entropy_global_block", "entropy_lempel_ziv", "entropy_k_block_diff_1", "entropy_k_block_diff_3", "entropy_k_block_diff_5", "entropy_k_block_ratio_1", "entropy_k_block_ratio_3", "entropy_k_block_ratio_5", "entropy_knn_3", "entropy_knn_5", "entropy_knn_7", "epa_variant_entropy", "epa_normalized_variant_entropy", "epa_sequence_entropy", "epa_normalized_sequence_entropy", "epa_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_linear_forgetting", "epa_sequence_entropy_exponential_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]
+    },
+    "input_path": "data/test/bpic_features.csv",
+    "output_path": "output"
+  }
+]

config_files/{algorithm → pipeline_steps}/benchmark.json RENAMED

@@ -4,6 +4,6 @@
     "benchmark_test": "discovery",
     "input_path":"data/test",
     "output_path":"output",
-    "miners" : ["inductive", "heuristics", "imf", "ilp"]
+    "miners" : ["ind", "heu", "imf", "ilp"]
   }
 ]

config_files/{algorithm → pipeline_steps}/evaluation_plotter.json RENAMED

@@ -1,7 +1,7 @@
 [
     {
         "pipeline_step": "evaluation_plotter",
-        "input_path": "output/features/generated/34_bpic_features/",
+        "input_path": "output/features/generated/BaselineED_feat/",
         "input_path": "output/features/generated/grid_2obj/",
         "input_path": ["output/features/generated/grid_1obj/", "output/features/generated/grid_2obj/"],
         "input_path": "output/features/generated/grid_1obj/1_enve_feat.csv",
@@ -9,7 +9,7 @@
         "reference_feature": "epa_normalized_sequence_entropy",
         "reference_feature": "epa_normalized_sequence_entropy_exponential_forgetting",
         "reference_feature": "epa_normalized_variant_entropy",
-        "targets": "data/34_bpic_features.csv",
+        "targets": "data/BaselineED_feat.csv",
         "targets": "data/grid_experiments/grid_2obj/",
         "targets": ["data/grid_experiments/grid_1obj/", "data/grid_experiments/grid_2obj/"],
         "targets": "data/grid_experiments/grid_1obj/grid_1objectives_enve.csv",

config_files/{algorithm → pipeline_steps}/feature_extraction.json RENAMED

@@ -2,7 +2,7 @@
   {
     "pipeline_step": "feature_extraction",
     "input_path": "data/test",
-    "feature_params": {"feature_set":["simple_stats", "trace_length", "trace_variant", "activities", "start_activities", "end_activities", "eventropies", "epa_based"]},
+    "feature_params": {"feature_set":["ratio_variants_per_number_of_traces", "ratio_most_common_variant", "ratio_top_10_variants", "epa_normalized_variant_entropy", "epa_normalized_sequence_entropy", "epa_normalized_sequence_entropy_linear_forgetting", "epa_normalized_sequence_entropy_exponential_forgetting"]},
     "output_path": "output/plots",
     "real_eventlog_path": "data/BaselineED_feat.csv",
     "plot_type": "boxplot",

config_files/{algorithm → pipeline_steps}/generation.json RENAMED

File without changes

config_files/{algorithm → test}/experiment_test.json RENAMED

@@ -3,7 +3,7 @@
     "pipeline_step": "instance_augmentation",
     "augmentation_params":{"method":"SMOTE", "no_samples":2,
         "feature_selection": ["ratio_top_20_variants", "epa_normalized_sequence_entropy_linear_forgetting"]},
-    "input_path": "data/bpic_features.csv",
+    "input_path": "data/test/bpic_features.csv",
     "output_path": "output"
   },
   {
@@ -39,7 +39,7 @@
     "input_path": "data/test",
     "feature_params": {"feature_set":["trace_length"]},
     "output_path": "output/plots",
-    "real_eventlog_path": "data/bpic_features.csv",
+    "real_eventlog_path": "data/BaselineED_feat.csv",
     "plot_type": "boxplot"
   },
   {
@@ -47,6 +47,6 @@
     "benchmark_test": "discovery",
     "input_path":"data/test",
     "output_path":"output",
-    "miners" : ["inductive", "heuristics", "imf", "ilp"]
+    "miners" : ["inductive", "heu", "imf", "ilp"]
   }
 ]

config_files/{algorithm/test → test}/generator_2bpic_2objectives_ense_enseef.json RENAMED

@@ -1,7 +1,7 @@
 [{"pipeline_step": "event_logs_generation",
 "output_path": "output/generated",
 "generator_params": {"experiment":
-	{"input_path": "data/2_bpic_features.csv",
+	{"input_path": "data/test/2_bpic_features.csv",
 	"objectives": ["epa_normalized_sequence_entropy",
 	"epa_normalized_sequence_entropy_exponential_forgetting"]},
 	"config_space": {"mode": [5, 20], "sequence": [0.01, 1],
@@ -12,4 +12,4 @@
 	"input_path": "output/features/generated/2_bpic_features/2_ense_enseef",
 	"feature_params": {"feature_set": ["simple_stats", "trace_length", "trace_variant", "activities",
 	"start_activities", "end_activities", "eventropies", "epa_based"]}, "output_path": "output/plots",
-	"real_eventlog_path": "data/2_bpic_features.csv", "plot_type": "boxplot"}]
+	"real_eventlog_path": "data/test/2_bpic_features.csv", "plot_type": "boxplot"}]

config_files/{algorithm/test → test}/generator_grid_1objectives_rt10v.json RENAMED

@@ -1,7 +1,7 @@
 [{"pipeline_step": "event_logs_generation",
 "output_path": "output/generated/grid_1obj",
 "generator_params": {"experiment":
-	{"input_path": "data/grid_experiments/grid_1objectives_rt10v.csv",
+	{"input_path": "data/test/grid_experiments/grid_1objectives_rt10v.csv",
 	"objectives": ["ratio_top_10_variants"]},
 	"config_space": {"mode": [5, 20], "sequence": [0.01, 1],
 	"choice": [0.01, 1], "parallel": [0.01, 1], "loop": [0.01, 1],
@@ -12,5 +12,5 @@
  "input_path": "output/features/generated/grid_1obj/grid_1objectives_rt10v/1_rt10v",
  "feature_params": {"feature_set": ["simple_stats", "trace_length", "trace_variant",
  "activities", "start_activities", "end_activities", "eventropies", "epa_based"]},
- "output_path": "output/plots", "real_eventlog_path": "data/2_bpic_features.csv",
+ "output_path": "output/plots", "real_eventlog_path": "data/test/2_bpic_features.csv",
  "plot_type": "boxplot"}]

config_files/{algorithm/test → test}/generator_grid_2objectives_ense_enself.json RENAMED

@@ -1,7 +1,7 @@
 [{"pipeline_step": "event_logs_generation",
 "output_path": "output/generated/grid_2obj",
 "generator_params": {"experiment":
-	{"input_path": "data/2_grid_test.csv",
+	{"input_path": "data/test/2_grid_test.csv",
 	"objectives": ["epa_normalized_sequence_entropy",
 		"epa_normalized_sequence_entropy_linear_forgetting"]},
 	"config_space": {"mode": [5, 20], "sequence": [0.01, 1],
@@ -15,5 +15,5 @@
 	"feature_params": {"feature_set": ["simple_stats", "trace_length", "trace_variant",
 		"activities", "start_activities", "end_activities", "eventropies", "epa_based"]},
 	"output_path": "output/plots",
-	"real_eventlog_path": "data/2_bpic_features.csv",
+	"real_eventlog_path": "data/test/2_bpic_features.csv",
 	"plot_type": "boxplot"}]

data/GenED_bench.csv CHANGED

The diff for this file is too large to render. See raw diff

data/GenED_feat.csv CHANGED

The diff for this file is too large to render. See raw diff

data/grid_1obj/grid_1objectives_ense.csv DELETED

@@ -1,12 +0,0 @@
-task,epa_normalized_sequence_entropy
-task_1,0.0
-task_2,0.1
-task_3,0.2
-task_4,0.3
-task_5,0.4
-task_6,0.5
-task_7,0.6
-task_8,0.7
-task_9,0.8
-task_10,0.9
-task_11,1.0

data/grid_1obj/grid_1objectives_enseef.csv DELETED

@@ -1,12 +0,0 @@
-task,epa_normalized_sequence_entropy_exponential_forgetting
-task_1,0.0
-task_2,0.1
-task_3,0.2
-task_4,0.3
-task_5,0.4
-task_6,0.5
-task_7,0.6
-task_8,0.7
-task_9,0.8
-task_10,0.9
-task_11,1.0

data/grid_1obj/grid_1objectives_enself.csv DELETED

@@ -1,12 +0,0 @@
-task,epa_normalized_sequence_entropy_linear_forgetting
-task_1,0.0
-task_2,0.1
-task_3,0.2
-task_4,0.3
-task_5,0.4
-task_6,0.5
-task_7,0.6
-task_8,0.7
-task_9,0.8
-task_10,0.9
-task_11,1.0

data/grid_1obj/grid_1objectives_enve.csv DELETED

@@ -1,12 +0,0 @@
-task,epa_normalized_variant_entropy
-task_1,0.0
-task_2,0.1
-task_3,0.2
-task_4,0.3
-task_5,0.4
-task_6,0.5
-task_7,0.6
-task_8,0.7
-task_9,0.8
-task_10,0.9
-task_11,1.0

data/grid_1obj/grid_1objectives_rmcv.csv DELETED

@@ -1,12 +0,0 @@
-task,ratio_most_common_variant
-task_1,0.0
-task_2,0.1
-task_3,0.2
-task_4,0.3
-task_5,0.4
-task_6,0.5
-task_7,0.6
-task_8,0.7
-task_9,0.8
-task_10,0.9
-task_11,1.0

data/grid_1obj/grid_1objectives_rt10v.csv DELETED

@@ -1,12 +0,0 @@
-task,ratio_top_10_variants
-task_1,0.0
-task_2,0.1
-task_3,0.2
-task_4,0.3
-task_5,0.4
-task_6,0.5
-task_7,0.6
-task_8,0.7
-task_9,0.8
-task_10,0.9
-task_11,1.0

data/grid_1obj/grid_1objectives_rutpt.csv DELETED

@@ -1,12 +0,0 @@
-task,ratio_unique_traces_per_trace
-task_1,0.0
-task_2,0.1
-task_3,0.2
-task_4,0.3
-task_5,0.4
-task_6,0.5
-task_7,0.6
-task_8,0.7
-task_9,0.8
-task_10,0.9
-task_11,1.0
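
The single-objective grid files deleted above all share one layout: a `task` column plus one feature column holding eleven targets from 0.0 to 1.0 in steps of 0.1. A minimal sketch that reproduces that layout, making no claim about how the original files were produced (the helper name `make_grid_rows` is hypothetical):

```python
import csv
import io


def make_grid_rows(feature, steps=11):
    """Build header plus rows task_1..task_N with evenly spaced targets in [0, 1]."""
    rows = [["task", feature]]
    for i in range(steps):
        # Divide integers instead of accumulating 0.1 to avoid float drift.
        rows.append([f"task_{i + 1}", str(i / (steps - 1))])
    return rows


# Render the grid as CSV text in memory; write to a file instead if needed.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(make_grid_rows("ratio_top_10_variants"))
print(buffer.getvalue())
```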

data/grid_2obj/{grid_2objectives_enve_rutpt.csv → grid_2objectives_ense_rvpnot.csv} RENAMED

@@ -1,4 +1,4 @@
-task,epa_normalized_variant_entropy,ratio_unique_traces_per_trace
+task,epa_normalized_sequence_entropy,ratio_variants_per_number_of_traces
 task_1,0.0,0.0
 task_2,0.0,0.1
 task_3,0.0,0.2