Andrea Maldonado committed
Commit 85c7f4a · 1 Parent(s): 4df7226

Updates run commands
.github/workflows/test_gedi.yml CHANGED
@@ -29,7 +29,7 @@ jobs:
 
       - name: Run test
         run:
-          python main.py -o config_files/options/baseline.json -a config_files/algorithm/pipeline_steps/feature_extraction.json
+          python main.py -a config_files/algorithm/pipeline_steps/feature_extraction.json
 
       - name: Compare output
         run: diff data/validation/test_feat.csv data/test_feat.csv
@@ -59,7 +59,7 @@ jobs:
 
       - name: Run test
         run:
-          python main.py -o config_files/options/baseline.json -a config_files/algorithm/pipeline_steps/generation.json
+          python main.py -a config_files/algorithm/pipeline_steps/generation.json
 
       - name: Compare output 1
         run:
@@ -97,7 +97,7 @@ jobs:
 
       - name: Run test
         run:
-          python main.py -o config_files/options/baseline.json -a config_files/algorithm/pipeline_steps/benchmark.json
+          python main.py -a config_files/algorithm/pipeline_steps/benchmark.json
 
       - name: Convert output and validation to same encoding
         run: iconv -f UTF-8 -t ASCII output/benchmark/test_benchmark.csv > data/validation/test_benchmark.csv
@@ -126,7 +126,7 @@ jobs:
 
       - name: Run test
         run:
-          python main.py -o config_files/options/baseline.json -a config_files/algorithm/pipeline_steps/augmentation.json
+          python main.py -a config_files/algorithm/pipeline_steps/augmentation.json
 
   test_evaluation-plotter:
     runs-on: ubuntu-latest
@@ -153,7 +153,7 @@ jobs:
       - name: Run test
         run:
 
-          python main.py -o config_files/options/baseline.json -a config_files/algorithm/pipeline_steps/evaluation_plotter.json
+          python main.py -a config_files/algorithm/pipeline_steps/evaluation_plotter.json
 
   test_integration:
     runs-on: ubuntu-latest
@@ -179,7 +179,7 @@ jobs:
 
       - name: Run test
         run:
-          python main.py -o config_files/options/baseline.json -a config_files/algorithm/experiment_test.json
+          python main.py -a config_files/algorithm/experiment_test.json
 
   test_grid_experiments_script:
     runs-on: ubuntu-latest
README.md CHANGED
@@ -27,7 +27,7 @@ conda install pyrfr swig
 ### Startup
 ```console
 conda activate gedi
-python main.py -o config_files/options/baseline.json -a config_files/algorithm/experiment_test.json
+python main.py -a config_files/algorithm/experiment_test.json
 ```
 The last step should take only a few minutes to run.
 
@@ -41,7 +41,7 @@ Our pipeline offers several pipeline steps, which can be run sequentially or par
 To run different steps of the GEDI pipeline, please adapt the `.json` accordingly.
 ```console
 conda activate gedi
-python main.py -o config_files/options/baseline.json -a config_files/algorithm/pipeline_steps/<pipeline-step>.json
+python main.py -a config_files/algorithm/pipeline_steps/<pipeline-step>.json
 ```
 For reference of possible keys and values for each step, please see `config_files/algorithm/experiment_test.json`.
 To run the whole pipeline please create a new `.json` file, specifying all steps you want to run and specify desired keys and values for each step.
@@ -52,7 +52,7 @@ To reproduce results from out paper, please refer to [Experiments](#experiments)
 To extract the features on the event-log level and use them for hyperparameter optimization, we employ the following script:
 ```console
 conda activate gedi
-python main.py -o config_files/options/baseline.json -a config_files/algorithm/pipeline_steps/feature_extraction.json
+python main.py -a config_files/algorithm/pipeline_steps/feature_extraction.json
 ```
 The JSON file consists of the following key-value pairs:
 
@@ -74,7 +74,7 @@ The command to execute the generation step is given by a exemplarily generation.
 
 ```console
 conda activate gedi
-python main.py -o config_files/options/baseline.json -a config_files/algorithm/pipeline_steps/generation.json
+python main.py -a config_files/algorithm/pipeline_steps/generation.json
 ```
 
 In the `generation.json`, we have the following key-value pairs:
@@ -106,7 +106,7 @@ The benchmarking defines the downstream task which is used for evaluating the go
 
 ```console
 conda activate gedi
-python main.py -o config_files/options/baseline.json -a config_files/algorithm/pipeline_steps/benchmark.json
+python main.py -a config_files/algorithm/pipeline_steps/benchmark.json
 ```
 
 In the `benchmark.json`, we have the following key-value pairs:
@@ -124,7 +124,7 @@ The purpose of the evaluation plotting step is used just for visualization. Some
 
 ```console
 conda activate gedi
-python main.py -o config_files/options/baseline.json -a config_files/algorithm/pipeline_steps/evaluation_plotter.json
+python main.py -a config_files/algorithm/pipeline_steps/evaluation_plotter.json
 ```
 
 Generally, in the `evaluation_plotter.json`, we have the following key-value pairs:
@@ -144,7 +144,7 @@ To execute the experiments with real targets, we employ the [experiment_real_tar
 
 ```console
 conda activate gedi
-python main.py -o config_files/options/baseline.json -a config_files/algorithm/experiment_real_targets.json
+python main.py -a config_files/algorithm/experiment_real_targets.json
 ```
 
 ### Generating data with grid targets
@@ -167,7 +167,7 @@ open "http://localhost:9000/"
 To run the visualizations, we employ [jupyter notebooks](https://jupyter.org/install) and [add the installed environment to the jupyter notebook](https://medium.com/@nrk25693/how-to-add-your-conda-environment-to-your-jupyter-notebook-in-just-4-steps-abeab8b8d084). We then start all visualizations by running e.g.: `jupyter noteboook`. In the following, we describe the `.ipynb`-files in the folder `\notebooks` to reproduce the figures from our paper.
 
 #### [Fig. 4 and fig. 5 Representativeness](notebooks/gedi_figs4and5_representativeness.ipynb)
-To visualize the coverage of the feasible feature space of generated event logs compared to existing real-world benchmark datasets, in this notebook, we conduct a principal component analysis on the features of both settings. The first two principal components are utilized to visualize the coverage which is further highlighted by computing a convex hull of the 2D mapping. Additionally, we visualize the distribution of each meta feature we used in the paper as a boxplot. Additional features can be extracted with FEEED. Therefore, the notebook contains the figures 4 and 5 in the paper.
+To visualize the coverage of the feasible feature space of generated event logs compared to existing real-world benchmark datasets, in this notebook, we conduct a principal component analysis on the features of both settings. The first two principal components are utilized to visualize the coverage which is further highlighted by computing a convex hull of the 2D mapping.Additionally, we visualize the distribution of each meta feature we used in the paper as a boxplot. Additional features can be extracted with FEEED. Therefore, the notebook contains the figures 4 and 5 in the paper.
 
 #### [Fig. 6 Benchmark Boxplots](notebooks/gedi_fig6_benchmark_boxplots.ipynb)
 This notebook is used to visualize the metric distribution of real event logs compared to the generated ones. It shows 5 different metrics on 3 various process discovery techniques. We use 'fitness,', 'precision', 'fscore', 'size', 'cfc' (control-flow complexity) as metrics and as 'heuristic miner', 'ilp' (integer linear programming), and 'imf' (inductive miner infrequent) as miners. The notebook outputs the visualization shown in Fig.6 in the paper.
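All of the updated README commands pass only the `-a`/algorithm flag; the `-o` options flag is gone everywhere. As a hedged illustration of what that simplified interface implies (this argparse stub is illustrative only, not the project's actual `main.py` parser):

```python
import argparse

# Illustrative stub: a parser accepting only the remaining -a flag,
# mirroring the simplified commands in the diff above.
parser = argparse.ArgumentParser(description="sketch of the simplified GEDI CLI")
parser.add_argument("-a", "--algorithm", required=True,
                    help="path to the algorithm/pipeline-step .json config")
args = parser.parse_args(["-a", "config_files/algorithm/experiment_test.json"])
print(args.algorithm)
```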
config_files/options/baseline.json DELETED
@@ -1,9 +0,0 @@
-{
-    "run_option": "baseline",
-    "plot_type": "color_map",
-    "plot_tics": true,
-    "n_components": 2,
-    "input_name": "test",
-    "save_results": false,
-    "load_results": false
-}
config_files/options/run_params.json DELETED
@@ -1,9 +0,0 @@
-{
-    "run_option": "compare",
-    "plot_type": "color_map",
-    "plot_tics": true,
-    "n_components": 2,
-    "input_name": "gen20",
-    "save_results": false,
-    "load_results": true
-}
execute_grid_experiments.py CHANGED
@@ -10,7 +10,7 @@ from tqdm import tqdm
 def multi_experiment_wrapper(config_file, i=0):
     print(f"=========================STARTING EXPERIMENT #{i+1}=======================")
     print(f"INFO: Executing with {config_file}")
-    os.system(f"python -W ignore main.py -o config_files/options/baseline.json -a {config_file}")
+    os.system(f"python -W ignore main.py -a {config_file}")
     print(f"=========================FINISHED EXPERIMENT #{i+1}=======================")
 
 if __name__ == '__main__':
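The wrapper above interpolates the config path straight into an `os.system` string. A minimal sketch of the same command construction (the helper name `build_run_command` and the `shlex.quote` hardening are illustrative additions, not part of the commit):

```python
import shlex

def build_run_command(config_file: str) -> str:
    # Same shape as the updated wrapper's command: the -o options flag is
    # dropped and only the -a algorithm config remains. shlex.quote guards
    # against paths containing spaces or shell metacharacters.
    return f"python -W ignore main.py -a {shlex.quote(config_file)}"

print(build_run_command("config_files/algorithm/experiment_test.json"))
```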
utils/config_fabric.py CHANGED
@@ -243,8 +243,7 @@ if __name__ == '__main__':
     st.write("Saved configuration in ", output_path, ". Run command:")
     #if create_run_button:
     if True:
-        options_path = os.path.join("config_files", "options", "baseline.json")
-        var = f"python -W ignore main.py -o {options_path} -a {output_path}"
+        var = f"python -W ignore main.py -a {output_path}"
         st.code(var, language='bash')
         if False: #FIXME: Command fails when using multiprocessing
             command = var.split()
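The disabled branch above tokenizes the command with `var.split()`, which breaks any argument containing spaces. If that `FIXME` branch is revisited, `shlex.split` is the usual stdlib alternative; a small sketch (stdlib only, not part of the commit):

```python
import shlex

# The command string produced by the updated code above.
var = "python -W ignore main.py -a config_files/algorithm/experiment_test.json"
command = shlex.split(var)

# Unlike str.split, shlex.split honors shell quoting, so a quoted path
# with spaces stays a single argument instead of being broken apart.
quoted = shlex.split('python main.py -a "my configs/experiment.json"')
print(command)
print(quoted)
```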