|
--- |
|
base_model: westlake-repl/SaProt_35M_AF2 |
|
library_name: peft |
|
--- |
|
|
|
# Model Card for Model ID |
|
|
|
|
This model is trained on a single-site deep mutational scanning (DMS) dataset and can be used to predict the fitness score of mutant amino acid sequences of the protein [DLG4_RAT](https://www.uniprot.org/uniprotkb/P31016/entry) (Disks large homolog 4). |
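As an illustration of what a single-site mutant sequence is, the following minimal sketch applies one substitution to a wild-type sequence. The helper name, the `K3A`-style notation (reference residue, 1-based position, replacement residue), and the toy sequence are assumptions made for this example; they are not taken from the dataset or from the training code.

```python
import re

# Hypothetical helper: the name, signature, and "K3A" notation are assumptions
# for illustration, not part of the model's training code.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_single_site_mutation(wild_type: str, mutation: str) -> str:
    """Return the mutant sequence for a mutation such as 'K3A' (1-based position)."""
    match = _MUTATION.match(mutation)
    if match is None:
        raise ValueError(f"Unrecognized mutation string: {mutation!r}")
    ref, alt = match.group(1), match.group(3)
    index = int(match.group(2)) - 1  # convert the 1-based position to a 0-based index
    if not 0 <= index < len(wild_type):
        raise ValueError(f"Position out of range for mutation {mutation!r}")
    if wild_type[index] != ref:
        raise ValueError(
            f"Expected {ref!r} at position {index + 1}, found {wild_type[index]!r}"
        )
    return wild_type[:index] + alt + wild_type[index + 1:]


# Toy sequence, not the real DLG4_RAT sequence.
print(apply_single_site_mutation("MDKLV", "K3A"))  # MDALV
```

Checking the reference residue before substituting catches off-by-one errors in position numbering, which are easy to introduce when a dataset indexes positions relative to a sub-region of the protein.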
|
|
|
## Function |
|
Postsynaptic scaffolding protein that plays a critical role in synaptogenesis and synaptic plasticity by providing a platform for the postsynaptic clustering of crucial synaptic proteins. |
|
Interacts with the cytoplasmic tail of NMDA receptor subunits and shaker-type potassium channels. |
|
Required for synaptic plasticity associated with NMDA receptor signaling. Overexpression or depletion of DLG4 changes the ratio of excitatory to inhibitory synapses in hippocampal neurons. May reduce the amplitude of ASIC3 acid-evoked currents by retaining the channel intracellularly. |
|
May regulate the intracellular trafficking of ADR1B. |
|
|
|
|
|
### Task type |
|
Protein-level regression |
|
### Dataset description |
|
The dataset comes from [Deep generative models of genetic variation capture the effects of mutations](https://www.nature.com/articles/s41592-018-0138-4) and is also available as a [SaProtHub dataset](https://huggingface.co/datasets/SaProtHub/DMS_DLG4_RAT). |
|
|
|
Each label is the fitness score of the corresponding mutant amino acid sequence. |
|
|
|
### Model input type |
|
Amino acid sequence |
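The base model, SaProt, uses a structure-aware vocabulary in which each residue is paired with a structure token. A minimal sketch of how an amino-acid-only input might be prepared is shown here, assuming the SaProt convention of using `#` as the placeholder when no structure is available. The helper name is hypothetical, and whether this exact preprocessing was used for this model is an assumption.

```python
# Hypothetical helper; assumes the SaProt convention that "#" stands in for an
# unknown structure token after each amino acid.
def to_structure_masked_tokens(aa_sequence: str, mask: str = "#") -> str:
    """Interleave each residue with a placeholder structure token."""
    return "".join(residue + mask for residue in aa_sequence)


# Toy sequence, not the real DLG4_RAT sequence.
print(to_structure_masked_tokens("MDK"))  # M#D#K#
```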
|
### Performance |
|
0.70 Spearman's ρ |
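Spearman's ρ is the Pearson correlation between the ranks of the predicted and measured fitness scores. A dependency-free sketch is given here to make the metric concrete; the function names are illustrative, the values are toy data, and the evaluation code actually used for this model is not shown in this card.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman_rho(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Pearson correlation computed on the ranks of the two inputs."""
    n = len(predicted)
    if n != len(measured) or n < 2:
        raise ValueError("Inputs must have equal length of at least 2")
    rx, ry = average_ranks(predicted), average_ranks(measured)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("Spearman's rho is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Toy values for illustration only, not results from this model.
print(spearman_rho([0.1, 0.4, 0.35, 0.8], [0.2, 0.5, 0.6, 0.9]))  # 0.8
```

Because only ranks matter, the metric rewards a model for ordering mutants correctly even if its predicted scores are on a different scale from the measured fitness values.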
|
|
|
### LoRA config |
|
lora_dropout: 0.0 |
|
|
|
lora_alpha: 16 |
|
|
|
target_modules: ["query", "key", "value", "intermediate.dense", "output.dense"] |
|
|
|
modules_to_save: ["classifier"] |
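The LoRA settings listed above can be expressed as a `peft.LoraConfig`. This is a sketch rather than the original training script: the LoRA rank `r` is not stated in this card, so the sketch leaves it at PEFT's default, which is an assumption. The names `LORA_KWARGS` and `build_lora_config` are illustrative.

```python
# Values copied from the LoRA config section of this card.
LORA_KWARGS = {
    "lora_alpha": 16,
    "lora_dropout": 0.0,
    "target_modules": ["query", "key", "value", "intermediate.dense", "output.dense"],
    "modules_to_save": ["classifier"],
}


def build_lora_config():
    """Build a peft.LoraConfig from the values above (requires the `peft` package)."""
    # Imported lazily so the settings can be inspected without peft installed.
    from peft import LoraConfig

    # Assumption: the rank `r` is not given in this card, so PEFT's default is used.
    return LoraConfig(**LORA_KWARGS)
```

Listing `classifier` under `modules_to_save` means the regression head is trained and stored in full alongside the low-rank adapter weights, rather than being adapted through LoRA.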
|
|
|
### Training config |
|
optimizer class: AdamW |
|
|
|
betas: (0.9, 0.98) |
|
|
|
weight_decay: 0.01 |
|
|
|
learning rate: 1e-4 |
|
|
|
epoch: 50 |
|
|
|
batch size: 128 |
|
|
|
precision: 16-mixed |
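The optimizer settings above map directly onto `torch.optim.AdamW`. The sketch below collects them in one place. The precision string `16-mixed` matches the format accepted by the PyTorch Lightning `Trainer`, so the trainer arguments are written in that form; that the model was trained with Lightning is an assumption, and the names used here are illustrative.

```python
# Values copied from the training config section of this card.
OPTIMIZER_KWARGS = {"lr": 1e-4, "betas": (0.9, 0.98), "weight_decay": 0.01}
BATCH_SIZE = 128

# Assumption: these keys follow the PyTorch Lightning `Trainer` argument names.
TRAINER_KWARGS = {"max_epochs": 50, "precision": "16-mixed"}


def build_optimizer(parameters):
    """Create an AdamW optimizer with the settings above (requires `torch`)."""
    # Imported lazily so the settings can be inspected without torch installed.
    import torch

    return torch.optim.AdamW(parameters, **OPTIMIZER_KWARGS)
```

With LoRA, only the adapter weights and the saved `classifier` module require gradients, so in practice `parameters` would typically be the subset of model parameters with `requires_grad=True`.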