thaonguyen217 committed on
Commit
08ba819
·
1 Parent(s): ab03447

upload README.md

Files changed (2)
  1. README.md +32 -0
  2. main.png +0 -0
README.md ADDED
@@ -0,0 +1,32 @@
# FARM Molecular Representation Model

![FARM](./main.png)

## Overview

The **FARM Molecular Representation Model** is a BERT-based model for learning molecular representations. Its key innovation is functional group-aware tokenization, which incorporates functional group information directly into the representations. This strategic reduction in tokenization granularity is intentionally aligned with functional groups, the key drivers of functional properties. It is intended to enhance the model's understanding of chemical language, expand the chemical lexicon, bridge the gap between SMILES and natural language, and ultimately improve the model's ability to predict molecular properties.

FARM also represents molecules from two perspectives: masked language modeling captures atom-level features, and graph neural networks encode the topology of the whole molecule. Contrastive learning then aligns these two views into a unified molecular embedding.
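
The contrastive alignment of the two views can be illustrated with a small, self-contained sketch. It assumes an InfoNCE-style loss, which is a common choice for contrastive learning; this README does not state which loss FARM actually uses. The function names and toy embeddings are hypothetical and are not taken from the FARM code. Only the sequence-to-graph direction is shown.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(seq_views, graph_views, temperature=0.1):
    """Mean InfoNCE loss: seq_views[i] should match graph_views[i]."""
    total = 0.0
    for i, s in enumerate(seq_views):
        logits = [cosine(s, g) / temperature for g in graph_views]
        # Numerically stable log-sum-exp over all candidate graph embeddings.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        # Cross-entropy with the matching graph embedding as the positive.
        total += log_denominator - logits[i]
    return total / len(seq_views)


# Toy embeddings for three molecules (made-up numbers, not model outputs).
seq_views = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.2], [0.1, 0.0, 1.0]]
aligned_graph_views = [[0.9, 0.0, 0.1], [0.1, 0.8, 0.0], [0.0, 0.2, 1.1]]
# Rotating the list pairs every molecule with the wrong graph embedding.
shuffled_graph_views = aligned_graph_views[1:] + aligned_graph_views[:1]

aligned_loss = info_nce(seq_views, aligned_graph_views)
shuffled_loss = info_nce(seq_views, shuffled_graph_views)
print(f"aligned: {aligned_loss:.3f}  shuffled: {shuffled_loss:.3f}")
```

The loss is lower when each sequence-view embedding sits closest to the graph-view embedding of the same molecule, which is the behavior a contrastive objective rewards during training.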

## Purpose

This model aims to:
- Enhance molecular representation by incorporating functional group information directly into the representations.
- Facilitate tasks such as molecular property prediction, classification, and generation.

## Components

The model repository includes the following key files:

- **`model.safetensors`**: The main model weights.
- **`config.json`**: Configuration parameters for the model architecture.
- **`generation_config.json`**: Configuration for text generation settings.
- **`special_tokens_map.json`**: Mapping of special tokens used by the tokenizer.
- **`tokenizer.json`**: The serialized tokenizer (vocabulary and tokenization rules).
- **`tokenizer_config.json`**: Additional settings for the tokenizer.
- **`.gitattributes`**: Git attributes file specifying which large files are tracked with Git LFS.
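
If you work from a local copy of the checkpoint, a quick check that the listed files are present can save a confusing load error later. The sketch below uses only the Python standard library. The directory name `farm_checkpoint` and the helper `missing_files` are hypothetical; the file names come from the list above.

```python
from pathlib import Path

# File names taken from the component list above.
# .gitattributes is left out because it only matters to Git, not to loading.
EXPECTED_FILES = [
    "model.safetensors",
    "config.json",
    "generation_config.json",
    "special_tokens_map.json",
    "tokenizer.json",
    "tokenizer_config.json",
]


def missing_files(checkpoint_dir):
    """Return the expected file names that are absent from checkpoint_dir."""
    root = Path(checkpoint_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "farm_checkpoint" is a hypothetical local path; replace it with your own.
    missing = missing_files("farm_checkpoint")
    print("all files present" if not missing else f"missing: {missing}")
```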

## Installation

To use the model, install the required libraries with pip:

```bash
pip install transformers torch
```
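
To confirm that both libraries are visible to the current Python environment, a minimal check along these lines may help. It relies on the standard-library function `importlib.util.find_spec`, which locates a package without importing it; the helper name `installed` is an assumption made for this sketch.

```python
from importlib.util import find_spec


def installed(packages=("transformers", "torch")):
    """Map each package name to True if Python can locate it, else False."""
    return {name: find_spec(name) is not None for name in packages}


for name, found in installed().items():
    print(f"{name}: {'found' if found else 'missing'}")
```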
main.png ADDED