advanced_manufacturing

o-schilter committed on Jan 26, 2023

Commit

f746564

1 Parent(s): 343ba2f

Updated Information

Files changed (2)

model_cards/article.md +14 -14
model_cards/description.md +3 -1

model_cards/article.md CHANGED

@@ -2,11 +2,11 @@
 **Algorithm Version**: Which model version to use.
-**Target binding energy**: The desired binding energy.
-**Primer SMILES**: A SMILES string used to prime the generation.
-**Maximal sequence length**: The maximal number of SMILES tokens in the generated molecule.
 **Number of points**: Number of points to sample with the Gaussian Process.
@@ -24,31 +24,31 @@
 **Distributors**: Original authors' code integrated into GT4SD.
-**Model date**: Not yet published.
-**Model version**: Different types of models trained on NCCR data using SMILES or SELFIES, potentially also with augmentation.
 **Model type**: A sequence-based molecular generator tuned to generate catalysts. The model relies on a recurrent Variational Autoencoder with a binding-energy predictor trained on the latent code. The framework uses Gaussian Processes for generating targeted molecules.
 **Information about training algorithms, parameters, fairness constraints or other applied approaches, and features**:
 N.A.
-**Paper or other resource for more information**:
-TBD
 **License**: MIT
 **Where to send questions or comments about the model**: Open an issue on [GT4SD repository](https://github.com/GT4SD/gt4sd-core).
-**Intended Use. Use cases that were envisioned during development**: Chemical research, in particular drug discovery.
-**Primary intended uses/users**: Researchers and computational chemists using the model for model comparison or research exploration purposes.
 **Out-of-scope use cases**: Production-level inference, producing molecules with harmful properties.
 **Metrics**: N.A.
-**Datasets**: Data provided through NCCR.
 **Ethical Considerations**: Unclear, please consult with original authors in case of questions.
@@ -60,9 +60,9 @@ Model card prototype inspired by [Mitchell et al. (2019)](https://dl.acm.org/doi
 TBD, temporarily please cite:
 ```bib
 @article{manica2022gt4sd,
-  title={GT4SD: Generative Toolkit for Scientific Discovery},
-  author={Manica, Matteo and Cadow, Joris and Christofidellis, Dimitrios and Dave, Ashish and Born, Jannis and Clarke, Dean and Teukam, Yves Gaetan Nana and Hoffman, Samuel C and Buchan, Matthew and Chenthamarakshan, Vijil and others},
-  journal={arXiv preprint arXiv:2207.03928},
-  year={2022}
 }
 ```

 **Algorithm Version**: Which model version to use.
+**Target binding energy**: The desired binding energy. The optimal range determined in [literature](https://doi.org/10.1039/C8SC01949E) is between -31.1 and -23.0 kcal/mol.
+**Primer SMILES**: A SMILES string is used to prime the generation.
+**Maximal sequence length**: The maximal number of tokens in the generated molecule.
 **Number of points**: Number of points to sample with the Gaussian Process.
 **Distributors**: Original authors' code integrated into GT4SD.
+**Model date**: Not yet published. Manuscript accepted.
+**Model version**: Different types of models trained on 7054 data points, represented either as SMILES or SELFIES. Augmentation was used to broaden the scope.
 **Model type**: A sequence-based molecular generator tuned to generate catalysts. The model relies on a recurrent Variational Autoencoder with a binding-energy predictor trained on the latent code. The framework uses Gaussian Processes for generating targeted molecules.
 **Information about training algorithms, parameters, fairness constraints or other applied approaches, and features**:
 N.A.
+**Paper or other resources for more information**:
 **License**: MIT
 **Where to send questions or comments about the model**: Open an issue on [GT4SD repository](https://github.com/GT4SD/gt4sd-core).
+**Intended Use. Use cases that were envisioned during development**: Chemical research, in particular, to discover new Suzuki cross-coupling catalysts.
+**Primary intended uses/users**: Researchers and computational chemists using the model for research exploration purposes.
 **Out-of-scope use cases**: Production-level inference, producing molecules with harmful properties.
 **Metrics**: N.A.
+**Datasets**: Data used for training was provided through the NCCR and can be found [here](https://doi.org/10.24435/materialscloud:2018.0014/v1) and [here](https://doi.org/10.24435/materialscloud:2019.0007/v3).
 **Ethical Considerations**: Unclear, please consult with original authors in case of questions.
 TBD, temporarily please cite:
 ```bib
 @article{manica2022gt4sd,
+ title={GT4SD: Generative Toolkit for Scientific Discovery},
+ author={Manica, Matteo and Cadow, Joris and Christofidellis, Dimitrios and Dave, Ashish and Born, Jannis and Clarke, Dean and Teukam, Yves Gaetan Nana and Hoffman, Samuel C and Buchan, Matthew and Chenthamarakshan, Vijil and others},
+ journal={arXiv preprint arXiv:2207.03928},
+ year={2022}
 }
 ```

model_cards/description.md CHANGED

@@ -1,6 +1,8 @@
 <img align="right" src="https://raw.githubusercontent.com/GT4SD/gt4sd-core/main/docs/_static/gt4sd_logo.png" alt="logo" width="120" >
-*AdvancedManufacturing* is a sequence-based molecular generator tuned to generate catalysts. The model relies on a Variational Autoencoder with a binding-energy predictor trained on the latent code. The framework uses Gaussian Processes for generating targeted molecules.
 For **examples** and **documentation** of the model parameters, please see below.
 Moreover, we provide a **model card** ([Mitchell et al. (2019)](https://dl.acm.org/doi/abs/10.1145/3287560.3287596?casa_token=XD4eHiE2cRUAAAAA:NL11gMa1hGPOUKTAbtXnbVQBDBbjxwcjGECF_i-WC_3g1aBgU1Hbz_f2b4kI_m1in-w__1ztGeHnwHs)) at the bottom of this page.

 <img align="right" src="https://raw.githubusercontent.com/GT4SD/gt4sd-core/main/docs/_static/gt4sd_logo.png" alt="logo" width="120" >
+*AdvancedManufacturing* is a sequence-based molecular generator tuned to generate catalysts for the Suzuki cross-coupling. The model relies on a Variational Autoencoder with a binding-energy predictor trained on the latent space. The framework uses Gaussian Processes for generating targeted molecules. The model was trained on 7054 catalysts provided by
+[Meyer et al.](https://doi.org/10.1039/C8SC01949E).
 For **examples** and **documentation** of the model parameters, please see below.
 Moreover, we provide a **model card** ([Mitchell et al. (2019)](https://dl.acm.org/doi/abs/10.1145/3287560.3287596?casa_token=XD4eHiE2cRUAAAAA:NL11gMa1hGPOUKTAbtXnbVQBDBbjxwcjGECF_i-WC_3g1aBgU1Hbz_f2b4kI_m1in-w__1ztGeHnwHs)) at the bottom of this page.
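
The parameters documented in `model_cards/article.md` (algorithm version, target binding energy, primer SMILES, maximal sequence length, number of points) could be collected and sanity-checked before a generation run. The following is a minimal sketch only: `GenerationRequest` and its method names are hypothetical and are not part of GT4SD; the only values taken from the model card are the optimal binding-energy bounds of -31.1 and -23.0 kcal/mol.

```python
from dataclasses import dataclass

# Optimal binding-energy window in kcal/mol, as quoted in the model card.
OPTIMAL_MIN_KCAL_MOL = -31.1
OPTIMAL_MAX_KCAL_MOL = -23.0


@dataclass(frozen=True)
class GenerationRequest:
    """Hypothetical container for the documented parameters (not a GT4SD class)."""

    algorithm_version: str
    target_binding_energy: float  # kcal/mol
    primer_smiles: str
    max_sequence_length: int  # maximal number of tokens in the generated molecule
    number_of_points: int  # points to sample with the Gaussian Process

    def validate(self) -> None:
        """Raise ValueError if a parameter is obviously unusable."""
        if not self.primer_smiles:
            raise ValueError("primer_smiles must be a non-empty SMILES string")
        if self.max_sequence_length <= 0:
            raise ValueError("max_sequence_length must be positive")
        if self.number_of_points <= 0:
            raise ValueError("number_of_points must be positive")

    def in_optimal_range(self) -> bool:
        """Return True if the target lies inside the quoted optimal window."""
        return (
            OPTIMAL_MIN_KCAL_MOL
            <= self.target_binding_energy
            <= OPTIMAL_MAX_KCAL_MOL
        )


if __name__ == "__main__":
    # All values below are placeholders chosen for illustration.
    request = GenerationRequest(
        algorithm_version="v0",
        target_binding_energy=-27.0,
        primer_smiles="CC",
        max_sequence_length=100,
        number_of_points=32,
    )
    request.validate()
    print(request.in_optimal_range())
```

A target outside the window is not rejected here, since the model card describes the range as optimal rather than mandatory; a caller might choose to log a warning instead.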