---
base_model: westlake-repl/SaProt_650M_AF2
library_name: peft
---
# Base model: [westlake-repl/SaProt_650M_AF2](https://huggingface.co/westlake-repl/SaProt_650M_AF2)
# Model Card for Model ID
This model predicts the signal peptide class of each residue in an amino acid sequence.
### Task type
Residue-level classification
### Dataset description
The dataset is from [SignalP 6.0 predicts all five types of signal peptides using protein language models](https://www.nature.com/articles/s41587-021-01156-3).
This dataset contains 7 classes:

| Label | Index | Class |
| --- | --- | --- |
| S | 0 | Sec/SPI signal peptide |
| T | 1 | Tat/SPI or Tat/SPII signal peptide |
| L | 2 | Sec/SPII signal peptide |
| P | 3 | Sec/SPIII signal peptide |
| I | 4 | cytoplasm |
| M | 5 | transmembrane |
| O | 6 | extracellular |
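
Because the task is residue-level, a prediction for one sequence is a list of class indices, one per residue. The following is a minimal sketch of how such a list could be mapped back to the one-letter labels listed in this card. The helper names `decode_labels` and `signal_peptide_length` and the example predictions are hypothetical, and treating the leading run of S/T/L/P labels as the signal peptide is an assumption of this sketch, not something stated by the card.

```python
# Class indices and one-letter labels as listed in this model card.
ID_TO_LABEL = {0: "S", 1: "T", 2: "L", 3: "P", 4: "I", 5: "M", 6: "O"}

# Labels that denote a signal peptide type in the list above.
SIGNAL_PEPTIDE_LABELS = set("STLP")


def decode_labels(class_ids):
    """Convert per-residue class indices into a one-letter label string."""
    return "".join(ID_TO_LABEL[i] for i in class_ids)


def signal_peptide_length(label_string):
    """Count the leading residues labelled with any signal peptide class.

    Assumption for illustration: the signal peptide is the unbroken run of
    S/T/L/P labels at the start of the sequence.
    """
    length = 0
    for label in label_string:
        if label not in SIGNAL_PEPTIDE_LABELS:
            break
        length += 1
    return length


# Hypothetical per-residue predictions for a 10-residue toy sequence.
example_ids = [0, 0, 0, 0, 6, 6, 6, 6, 6, 6]
labels = decode_labels(example_ids)
print(labels)
print(signal_peptide_length(labels))
```

A string with no leading signal peptide label, such as `"IIMM"`, yields a length of 0 under this convention.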
### Model input type
Amino acid sequence
### Performance
test_acc: 0.96
### LoRA config
lora_dropout: 0.0
lora_alpha: 16
target_modules: ["query", "key", "value", "intermediate.dense", "output.dense"]
modules_to_save: ["classifier"]
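
As a rough sketch, the settings above could be expressed as keyword arguments for `peft.LoraConfig`. The card does not state the LoRA rank or a task type, so both are left out here and would fall back to the defaults of the installed `peft` version; the variable names `lora_kwargs` and `lora_config` are illustrative only. The `peft` import is treated as optional so the snippet still runs where the library is not installed.

```python
# Values copied from the "LoRA config" section of this card.
lora_kwargs = {
    "lora_alpha": 16,
    "lora_dropout": 0.0,
    "target_modules": ["query", "key", "value", "intermediate.dense", "output.dense"],
    "modules_to_save": ["classifier"],
}

try:
    # Optional dependency: only needed to build an actual config object.
    from peft import LoraConfig

    # The rank `r` is not given in the card, so it is not set here.
    lora_config = LoraConfig(**lora_kwargs)
except ImportError:
    # Without peft, the dict above still documents the settings.
    lora_config = None

print(lora_kwargs["target_modules"])
```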
### Training config
class: AdamW
betas: (0.9, 0.98)
weight_decay: 0.01
learning rate: 1e-4
epoch: 10
batch size: 100
precision: 16-mixed