---
license: mit
---
# Model Information 🧬

**License:** MIT

### 🔬 Base Model:
[westlake-repl/SaProt_35M_AF2](https://huggingface.co/westlake-repl/SaProt_35M_AF2)

### 🧩 Task Type: 
Protein-level regression



### 📊 Dataset:
[DATASET-CAPE-RhlA-seqlabel](https://huggingface.co/datasets/SaProtHub/DATASET-CAPE-RhlA-seqlabel)

- **Protein:** Mutation data, including the RhlA enzyme sequence and the corresponding performance metrics.
- **Label:** The experimentally tested fitness score, representing the scaled mutation effect for each mutant.
- **Source:** Labels derived from [CAPE](https://doi.org/10.1021/acssynbio.4c00588).

### 🔑 Model Input Type:
Amino acid sequence; label in RhlA

### 📈 Performance (best on test set):
**Spearman's ρ:** 0.862
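
For readers unfamiliar with the metric, the snippet below is a minimal sketch of how Spearman's ρ can be computed from predicted and measured fitness scores. It is not the evaluation code used for this model; the function names `average_ranks` and `spearman_rho` and the example values are hypothetical, and ties are handled with average ranks.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(predictions, labels):
    """Pearson correlation of the rank vectors of predictions and labels."""
    rx, ry = average_ranks(predictions), average_ranks(labels)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical predicted vs. measured fitness scores for five mutants.
predicted = [0.10, 0.40, 0.35, 0.80, 0.95]
measured = [0.20, 0.10, 0.50, 0.45, 0.90]
print(spearman_rho(predicted, measured))
```

Because the metric depends only on rank order, it rewards a model that sorts mutants correctly even when the predicted values are on a different scale from the labels.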

---

## LoRA Configuration ⚙️
- **r:** 8
- **LoRA dropout:** 0.1
- **LoRA alpha:** 8
- **Modules to save:** `["regression"]`
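
As a rough illustration of what these values imply, the sketch below collects them in a plain dictionary and computes the LoRA scaling factor. It assumes the standard LoRA formulation, in which the low-rank update is multiplied by `lora_alpha / r` (not the rank-stabilized variant). The dictionary keys mirror the parameter names of PEFT's `LoraConfig`; `lora_settings` and `lora_scaling` are hypothetical names, and the training code for this model may set things up differently.

```python
# Values copied from the LoRA configuration listed above.
lora_settings = {
    "r": 8,
    "lora_alpha": 8,
    "lora_dropout": 0.1,
    "modules_to_save": ["regression"],
}


def lora_scaling(r, lora_alpha):
    """Scaling applied to the low-rank update in standard LoRA: alpha / r."""
    return lora_alpha / r


scaling = lora_scaling(lora_settings["r"], lora_settings["lora_alpha"])
print(scaling)
```

With `r` and `lora_alpha` both equal to 8, the factor is 1, so the low-rank update is added to the frozen weights without extra rescaling.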

## Training Configuration 🎛️

- **Optimizer:**
  - **Class:** AdamW
  - **Betas:** (0.9, 0.98)
  - **Weight decay:** 0.01
- **Learning rate:** 5e-5
- **Epochs:** 5
- **Batch size:** Adaptive
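
To make the optimizer settings concrete, here is a minimal pure-Python sketch of a single AdamW update for one scalar parameter, using the betas, weight decay, and learning rate listed above. It illustrates the update rule only and is not the training loop used for this model. The function name `adamw_step` is hypothetical, and `eps=1e-8` is an assumption, since the card does not state it.

```python
import math


def adamw_step(param, grad, m, v, step,
               lr=5e-5, betas=(0.9, 0.98), weight_decay=0.01, eps=1e-8):
    """One AdamW update for a scalar parameter.

    eps is an assumed value; the other defaults come from the list above.
    Returns the updated parameter and the new moment estimates.
    """
    beta1, beta2 = betas
    # Decoupled weight decay: shrink the parameter directly.
    param = param * (1 - lr * weight_decay)
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    # Bias correction for the zero-initialized moments.
    m_hat = m / (1 - beta1 ** step)
    v_hat = v / (1 - beta2 ** step)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v


# First step from param=1.0 with a hypothetical gradient of 0.5.
new_param, m, v = adamw_step(param=1.0, grad=0.5, m=0.0, v=0.0, step=1)
print(new_param)
```

On the first step the bias-corrected ratio has magnitude close to 1, so the parameter moves by roughly the learning rate, plus a much smaller shrinkage from weight decay.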