---
tags:
- generated_from_trainer
model-index:
- name: ROBERTA_SMILES_LARGE
  results: []
widget:
- text: <mask>1CC[C@]23[C@@H]4[C@H]1CC5=C2C(=C(C=C5)O)O[C@H]3[C@H](C=C4)O
pipeline_tag: fill-mask
---

# ROBERTA_SMILES_LARGE

This model is an 83.5M-parameter RoBERTa model fine-tuned on a dataset of 1.1M SMILES (Simplified molecular-input line-entry system) strings for masked language modeling (MLM). It builds on BERT_SMILES, which was fine-tuned on only 50k SMILES.

If you find this model useful, I would really appreciate you giving it a like!

Evaluation Loss: 0.482

Example: Morphine

```
CN1CC[C@]23[C@@H]4[C@H]1CC5=C2C(=C(C=C5)O)O[C@H]3[C@H](C=C4)O
```

## Intended uses & limitations

With further training, this model can be used to predict physical or chemical properties of molecules.

### Framework versions

- Transformers 4.37.0.dev0
- Pytorch 2.1.0+cu121
- Tokenizers 0.15.0
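
## Example usage

As a minimal sketch of querying the model through the `fill-mask` pipeline (the repo id below is a hypothetical placeholder for wherever this checkpoint is hosted, and the example assumes the tokenizer uses RoBERTa's default `<mask>` token):

```
from transformers import pipeline

# Hypothetical repo id; substitute the actual Hub path of this checkpoint.
fill_mask = pipeline("fill-mask", model="your-username/ROBERTA_SMILES_LARGE")

# Mask the start of the morphine SMILES and ask the model to recover it.
masked_smiles = "<mask>1CC[C@]23[C@@H]4[C@H]1CC5=C2C(=C(C=C5)O)O[C@H]3[C@H](C=C4)O"

for prediction in fill_mask(masked_smiles):
    print(prediction["token_str"], prediction["score"])
```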
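
The "further training" route mentioned under intended uses amounts to attaching a task head to the pretrained encoder. Below is a sketch of that setup for a single-target regression; the repo id, SMILES, and target values are illustrative placeholders, not part of this release:

```
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical repo id; substitute the actual Hub path of this checkpoint.
model_name = "your-username/ROBERTA_SMILES_LARGE"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Reuse the pretrained encoder with a fresh single-output regression head,
# e.g. for predicting a continuous property such as solubility.
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=1, problem_type="regression"
)

# Toy batch: two SMILES strings with made-up target values.
smiles = ["CCO", "c1ccccc1O"]
targets = torch.tensor([[0.5], [1.2]])

inputs = tokenizer(smiles, padding=True, return_tensors="pt")
outputs = model(**inputs, labels=targets)
outputs.loss.backward()  # one backward pass of a property-prediction fine-tune
print(outputs.loss.item())
```

From here, a standard `Trainer` loop (or a plain optimizer step) over a labeled property dataset would complete the fine-tune.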