---
license: mit
---
# Model Information 🧬

**License:** MIT

### 🔬 Base Model: 
[westlake-repl/SaProt_35M_AF2](https://huggingface.co/westlake-repl/SaProt_35M_AF2)

### 🧩 Task Type: 
Protein-level regression



### 📊 Dataset: 
[DATASET-CAPE-RhlA-seqlabel](https://huggingface.co/datasets/SaProtHub/DATASET-CAPE-RhlA-seqlabel)

- **Protein:** Mutation data for the RhlA enzyme, comprising each mutant's sequence and its corresponding performance metric.
- **Label:** The experimentally measured fitness score, representing the scaled mutation effect of each mutant.
- **Source:** Labels are derived from [CAPE](https://doi.org/10.1021/acssynbio.4c00588).

### 🔡 Model Input Type:
Amino acid sequence of an RhlA mutant; the label is its fitness score

### 📈 Performance (best on test set):
**Spearman's ρ:** 0.862
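
For readers unfamiliar with the metric, Spearman's ρ is the Pearson correlation between the ranks of the predicted and measured fitness scores. The following is a minimal pure-Python sketch of that definition, not the evaluation code used for this model; the function names `average_ranks`, `pearson`, and `spearman_rho` are illustrative assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman_rho(predicted, measured):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(predicted), average_ranks(measured))


# Hypothetical scores, for illustration only (not from the RhlA dataset).
rho = spearman_rho([0.1, 0.4, 0.35, 0.8], [0.2, 0.5, 0.3, 0.9])
```

Because only ranks matter, ρ rewards a model that orders mutants correctly even if its predicted values are on a different scale from the measured fitness.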

---

## LoRA Configuration ⚙️
- **r:** 8
- **LoRA dropout:** 0.1
- **LoRA alpha:** 8
- **Modules to save:** `["regression"]`
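
To make these values concrete, the sketch below records them in a small dataclass and derives two quantities, assuming the standard LoRA formulation in which the low-rank update is scaled by `alpha / r`. The `LoraSettings` class and the 480×480 layer size are hypothetical, chosen only for illustration; they are not taken from the training code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraSettings:
    """Hypothetical container mirroring the values listed above."""
    r: int = 8
    alpha: int = 8
    dropout: float = 0.1
    modules_to_save: tuple = ("regression",)

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the update B @ A by alpha / r.
        return self.alpha / self.r

    def adapter_params(self, d_out: int, d_in: int) -> int:
        # A is (r x d_in) and B is (d_out x r), so r * (d_in + d_out) weights.
        return self.r * (d_in + d_out)


settings = LoraSettings()
scale = settings.scaling                      # alpha / r with the values above
extra = settings.adapter_params(480, 480)     # assumed layer size, illustration only
```

With `r` equal to `alpha`, the scaling factor is 1, so the low-rank update is added to the frozen weights without rescaling.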

## Training Configuration 🎛️

- **Optimizer:**
  - **Class:** AdamW
  - **Betas:** (0.9, 0.98)
  - **Weight decay:** 0.01
- **Learning rate:** 5e-5
- **Epochs:** 5
- **Batch size:** Adaptive
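
As a rough illustration of what these optimizer settings do, the sketch below applies one AdamW update to a single scalar parameter in pure Python. It assumes the usual decoupled weight decay formulation and an epsilon of `1e-8`, which is not stated above; the function name `adamw_step` is hypothetical and this is not the training code.

```python
from math import sqrt


def adamw_step(param, grad, m, v, step,
               lr=5e-5, betas=(0.9, 0.98), weight_decay=0.01, eps=1e-8):
    """One AdamW update for a scalar; eps is an assumed default."""
    beta1, beta2 = betas
    # Decoupled weight decay: shrink the parameter directly.
    param = param * (1.0 - lr * weight_decay)
    # Exponential moving averages of the gradient and its square.
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction for the zero-initialised averages.
    m_hat = m / (1.0 - beta1 ** step)
    v_hat = v / (1.0 - beta2 ** step)
    param = param - lr * m_hat / (sqrt(v_hat) + eps)
    return param, m, v


# First step from a hypothetical parameter value and gradient.
new_param, m1, v1 = adamw_step(param=1.0, grad=0.5, m=0.0, v=0.0, step=1)
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly the learning rate plus a small decay term.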