Commit f746564
Parent(s): 343ba2f
Updated Information
- model_cards/article.md +14 -14
- model_cards/description.md +3 -1
model_cards/article.md
CHANGED
@@ -2,11 +2,11 @@

 **Algorithm Version**: Which model version to use.

-**Target binding energy**: The desired binding energy.
+**Target binding energy**: The desired binding energy. The optimal range determined in [literature](https://doi.org/10.1039/C8SC01949E) is between -31.1 and -23.0 kcal/mol.

-**Primer SMILES**: A SMILES string used to prime the generation.
+**Primer SMILES**: A SMILES string is used to prime the generation.

-**Maximal sequence length**: The maximal number of
+**Maximal sequence length**: The maximal number of tokens in the generated molecule.

 **Number of points**: Number of points to sample with the Gaussian Process.

@@ -24,31 +24,31 @@

 **Distributors**: Original authors' code integrated into GT4SD.

-**Model date**: Not yet published.
+**Model date**: Not yet published. Manuscript accepted.

-**Model version**: Different types of models trained on
+**Model version**: Different types of models trained on 7054 data points are represented either as SMILES or SELFIES. Augmentation was used to broaden the scope augmentation.

 **Model type**: A sequence-based molecular generator tuned to generate catalysts. The model relies on a recurrent Variational Autoencoder with a binding-energy predictor trained on the latent code. The framework uses Gaussian Processes for generating targeted molecules.

 **Information about training algorithms, parameters, fairness constraints or other applied approaches, and features**:
 N.A.

-**Paper or other
-
+**Paper or other resources for more information**:
+

 **License**: MIT

 **Where to send questions or comments about the model**: Open an issue on [GT4SD repository](https://github.com/GT4SD/gt4sd-core).

-**Intended Use. Use cases that were envisioned during development**: Chemical research, in particular
+**Intended Use. Use cases that were envisioned during development**: Chemical research, in particular, to discover new Suzuki cross-coupling catalysts.

-**Primary intended uses/users**: Researchers and computational chemists using the model for
+**Primary intended uses/users**: Researchers and computational chemists using the model for research exploration purposes.

 **Out-of-scope use cases**: Production-level inference, producing molecules with harmful properties.

 **Metrics**: N.A.

-**Datasets**: Data provided through NCCR.
+**Datasets**: Data used for training was provided through the NCCR and can be found [here](https://doi.org/10.24435/materialscloud:2018.0014/v1) and [here](https://doi.org/10.24435/materialscloud:2019.0007/v3).

 **Ethical Considerations**: Unclear, please consult with original authors in case of questions.

@@ -60,9 +60,9 @@ Model card prototype inspired by [Mitchell et al. (2019)](https://dl.acm.org/doi
 TBD, temporarily please cite:
 ```bib
 @article{manica2022gt4sd,
-
-
-
-
+title={GT4SD: Generative Toolkit for Scientific Discovery},
+author={Manica, Matteo and Cadow, Joris and Christofidellis, Dimitrios and Dave, Ashish and Born, Jannis and Clarke, Dean and Teukam, Yves Gaetan Nana and Hoffman, Samuel C and Buchan, Matthew and Chenthamarakshan, Vijil and others},
+journal={arXiv preprint arXiv:2207.03928},
+year={2022}
 }
 ```
model_cards/description.md
CHANGED
@@ -1,6 +1,8 @@
 <img align="right" src="https://raw.githubusercontent.com/GT4SD/gt4sd-core/main/docs/_static/gt4sd_logo.png" alt="logo" width="120" >

-*AdvancedManufacturing* is a sequence-based molecular generator tuned to generate catalysts. The model relies on a Variational Autoencoder with a binding-energy predictor trained on the latent
+*AdvancedManufacturing* is a sequence-based molecular generator tuned to generate catalysts for the Suzuki cross-coupling. The model relies on a Variational Autoencoder with a binding-energy predictor trained on the latent space. The framework uses Gaussian Processes for generating targeted molecules. The model was trained on 7054 Catalysts provided by
+[Meyer et al.](DOI https://doi.org/10.1039/C8SC01949E).

 For **examples** and **documentation** of the model parameters, please see below.
 Moreover, we provide a **model card** ([Mitchell et al. (2019)](https://dl.acm.org/doi/abs/10.1145/3287560.3287596?casa_token=XD4eHiE2cRUAAAAA:NL11gMa1hGPOUKTAbtXnbVQBDBbjxwcjGECF_i-WC_3g1aBgU1Hbz_f2b4kI_m1in-w__1ztGeHnwHs)) at the bottom of this page.
+
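
The parameter notes in the updated model card (a target binding energy with a literature range of -31.1 to -23.0 kcal/mol, and a maximal sequence length counted in tokens) can be sketched as a small pre-flight check. This is only an illustration and not part of GT4SD: the function names and warning messages are hypothetical, and treating the range as inclusive is an assumption.

```python
from typing import List

# Optimal range quoted in the model card, in kcal/mol.
OPTIMAL_MIN_KCAL_MOL = -31.1
OPTIMAL_MAX_KCAL_MOL = -23.0


def in_optimal_range(target_binding_energy: float) -> bool:
    """Return True if the target lies in the quoted range (inclusive by assumption)."""
    return OPTIMAL_MIN_KCAL_MOL <= target_binding_energy <= OPTIMAL_MAX_KCAL_MOL


def validate_parameters(target_binding_energy: float, max_sequence_length: int) -> List[str]:
    """Collect warnings for a hypothetical parameter set before running generation."""
    warnings: List[str] = []
    if not in_optimal_range(target_binding_energy):
        warnings.append(
            f"target binding energy {target_binding_energy} kcal/mol is outside "
            f"[{OPTIMAL_MIN_KCAL_MOL}, {OPTIMAL_MAX_KCAL_MOL}]"
        )
    if max_sequence_length <= 0:
        warnings.append("maximal sequence length must be a positive number of tokens")
    return warnings


if __name__ == "__main__":
    for message in validate_parameters(-10.0, 100):
        print("warning:", message)
```

A target outside the range is reported as a warning rather than rejected, since the model card describes the range as optimal, not as a hard limit.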