---
base_model: westlake-repl/SaProt_35M_AF2
library_name: peft
---

# Model Card for Model ID

This model is trained on a single-site deep mutational scanning dataset and can be used to predict the fitness score of a mutant amino acid sequence of the protein AsCas12f.

## Protein Function

AsCas12a is widely utilized as a genome-editing tool in human cells. AsCas12f, the protein this model targets, is the basis of the compact genome-editing tool described in the paper cited below.

### Task type

Protein-level regression

### Dataset description

The dataset is from [An AsCas12f-based compact genome-editing tool derived by deep mutational scanning and structural analysis](https://doi.org/10.1016/j.cell.2023.08.031).

Each label is the fitness score of a mutant amino acid sequence. Scores are unbounded, ranging from negative infinity to positive infinity. The wild-type sequence has a fitness of 1: a score greater than 1 indicates higher fitness than the wild type, and a score less than 1 indicates lower fitness.

### Model input type

Amino acid sequence

### Performance

0.60 Spearman's ρ

### LoRA config

lora_dropout: 0.0

lora_alpha: 16

target_modules: ["query", "key", "value", "intermediate.dense", "output.dense"]

modules_to_save: ["classifier"]

### Training config

optimizer class: AdamW

betas: (0.9, 0.98)

weight_decay: 0.01

learning rate: 1e-4

epoch: 50

batch size: 36

precision: 16-mixed
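The LoRA config values listed above map onto keyword arguments of `peft.LoraConfig`. The following is a minimal sketch of that mapping, not the training script used for this model. The names `LORA_KWARGS` and `build_lora_config` are hypothetical. The card does not state the LoRA rank `r` or the task type, so the sketch omits them and peft would fall back to its defaults, which may differ from what was used in training.

```python
# Values copied from the "LoRA config" section of this card.
# The LoRA rank `r` and the task type are NOT stated in the card, so they
# are left out here (assumption: peft defaults apply if nothing is passed).
LORA_KWARGS = {
    "lora_dropout": 0.0,
    "lora_alpha": 16,
    "target_modules": ["query", "key", "value", "intermediate.dense", "output.dense"],
    "modules_to_save": ["classifier"],
}


def build_lora_config():
    """Create a peft LoraConfig from LORA_KWARGS (requires `pip install peft`)."""
    # Imported lazily so the dictionary above can be inspected without peft.
    from peft import LoraConfig

    return LoraConfig(**LORA_KWARGS)
```

Keeping the values in a plain dictionary makes it easy to compare them against the card before constructing the config object.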
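The performance figure reported in this card is a Spearman rank correlation between predicted and measured fitness. As an illustration of what that metric measures, here is a small pure-Python sketch that ranks both lists (averaging ranks for ties) and takes the Pearson correlation of the ranks. The function names are hypothetical and this is not the evaluation code behind the reported 0.60.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(predicted, measured):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rx, ry = average_ranks(predicted), average_ranks(measured)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)
```

Because only the ordering matters, a model can reach a high ρ even when its predicted scores are on a different scale from the measured fitness values.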
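Since the model takes a mutant amino acid sequence as input and the labels are read relative to a wild-type fitness of 1, a small helper for preparing inputs and reading outputs may be useful. The sketch below assumes mutations are written as wild-type residue, 1-based position, new residue (for example `K2R`); that notation, the function names, and the toy sequence in the usage note are all hypothetical and are not taken from the dataset or from AsCas12f.

```python
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def apply_single_mutation(wildtype, mutation):
    """Apply a mutation such as 'K2R' (assumed notation) to a wild-type sequence."""
    wt_res, position, new_res = mutation[0], int(mutation[1:-1]), mutation[-1]
    if new_res not in AMINO_ACIDS:
        raise ValueError(f"unknown residue {new_res!r}")
    if not 1 <= position <= len(wildtype):
        raise ValueError(f"position {position} is outside the sequence")
    if wildtype[position - 1] != wt_res:
        raise ValueError(
            f"expected {wt_res} at position {position}, found {wildtype[position - 1]}"
        )
    # Replace exactly one residue; everything else is copied unchanged.
    return wildtype[: position - 1] + new_res + wildtype[position:]


def describe_fitness(score, wildtype_fitness=1.0):
    """Compare a fitness score with the wild-type reference (1 in this dataset)."""
    if score > wildtype_fitness:
        return "higher than wild type"
    if score < wildtype_fitness:
        return "lower than wild type"
    return "equal to wild type"
```

For example, `apply_single_mutation("MKTAY", "K2R")` yields `"MRTAY"` for this toy sequence, and `describe_fitness(1.4)` reports a score above the wild-type reference.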