Update README.md
README.md
CHANGED
@@ -1,32 +1,44 @@
# Model Information 🧬

**License:** MIT

---

### 🔬 Base Model:
[westlake-repl/SaProt_35M_AF2](https://huggingface.co/westlake-repl/SaProt_35M_AF2)

### 🧩 Task Type:
Protein-level regression

- **Label:** The experimentally tested fitness score, representing the scaled mutation effect for each mutant.

### Dataset:
[DATASET-CAPE-RhlA-seqlabel](https://huggingface.co/datasets/SaProtHub/DATASET-CAPE-RhlA-seqlabel)

- Contains mutation data including the RhlA enzyme sequence and corresponding performance metrics.
- **Source:** Label derived from [CAPE](https://doi.org/10.1021/acssynbio.4c00588)

### 💡 Model Input Type:
Amino acid sequence; label in RhlA

### Performance (the best on test set):
**Spearman's ρ:** 0.862
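
For readers who want to see what the metric measures, here is a minimal pure-Python sketch of Spearman's rank correlation: the Pearson correlation of the two rank vectors, with tied values given their average rank. The function names and the toy scores are illustrative assumptions; this is not the evaluation code used to produce the number above.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(predicted, measured):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(predicted), average_ranks(measured)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


# Made-up scores for illustration only; they are not taken from the dataset.
predicted = [0.10, 0.40, 0.35, 0.80]
measured = [0.05, 0.30, 0.50, 0.90]
print(spearman_rho(predicted, measured))
```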

---

## LoRA Configuration ⚙️
- **r:** 8
- **LoRA dropout:** 0.1
- **LoRA alpha:** 8
- **Modules to save:** `["regression"]`
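
As a rough sketch of what these settings imply, the snippet below collects them in a plain dictionary and derives two quantities from them. The key names are chosen to match the keyword arguments of PEFT's `LoraConfig` (`r`, `lora_alpha`, `lora_dropout`, `modules_to_save`), but that this model was trained through PEFT is an assumption, and the target modules are not listed in this README, so they are left out. `lora_param_count` and the layer width of 512 are hypothetical, for illustration only.

```python
# Hyperparameters copied from the list above. The key names follow the
# keyword arguments of PEFT's LoraConfig; that mapping is an assumption.
lora_kwargs = {
    "r": 8,
    "lora_alpha": 8,
    "lora_dropout": 0.1,
    "modules_to_save": ["regression"],
}

# Standard LoRA scales the low-rank update B @ A by alpha / r.
scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]


def lora_param_count(d_in, d_out, r):
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A has shape (r, d_in) and B has shape (d_out, r).
    return r * (d_in + d_out)


hidden = 512  # hypothetical layer width, not taken from the base model
added = lora_param_count(hidden, hidden, lora_kwargs["r"])
full = hidden * hidden
print(f"scaling={scaling}, added={added}, fraction of full matrix={added / full:.4f}")
```

With alpha equal to r, the scaling factor is 1, so the low-rank update is added to the frozen weights without rescaling. If PEFT is installed, the same dictionary could be passed as `LoraConfig(**lora_kwargs)`, together with whichever target modules the training run actually used.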

## Training Configuration

- **Optimizer:**
  - **Class:** AdamW
  - **Betas:** (0.9, 0.98)
  - **Weight decay:** 0.01
- **Learning rate:** 5e-5
- **Epochs:** 5
- **Batch size:** Adaptive
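
To make the optimizer settings concrete, here is a minimal sketch of a single AdamW update on one scalar parameter, using the learning rate, betas and weight decay listed above. The `eps` value is not given in this README and is assumed to be 1e-8 (the PyTorch default); `adamw_step` and the toy loss are hypothetical and stand in for the real training loop.

```python
from math import sqrt

# Values listed above; EPS is not listed and is assumed (PyTorch's default).
LR, BETAS, WEIGHT_DECAY, EPS = 5e-5, (0.9, 0.98), 0.01, 1e-8


def adamw_step(param, grad, m, v, t):
    """One AdamW update of a scalar parameter at step t (1-based)."""
    beta1, beta2 = BETAS
    # Decoupled weight decay: shrink the parameter directly, not via the gradient.
    param *= 1 - LR * WEIGHT_DECAY
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialised averages.
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    param -= LR * m_hat / (sqrt(v_hat) + EPS)
    return param, m, v


param, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    grad = 2 * param  # gradient of the toy loss param**2
    param, m, v = adamw_step(param, grad, m, v, t)
print(param)
```

In PyTorch the same three settings would be passed as `torch.optim.AdamW(params, lr=5e-5, betas=(0.9, 0.98), weight_decay=0.01)`.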