---
library_name: peft
datasets:
- InstaDeepAI/nucleotide_transformer_downstream_tasks_revised
metrics:
- f1
base_model:
- tattabio/gLM2_150M
model-index:
- name: alejandralopezsosa/gLM2_150M-promoter_tata-lora
  results:
  - task:
      type: sequence-classification
    dataset:
      type: InstaDeepAI/nucleotide_transformer_downstream_tasks_revised
      name: nucleotide_transformer_downstream_tasks_revised
      config: promoter_tata
      split: test
      revision: c8c94743d3d2838b943398ee676247ac2f774122
    metrics:
      - type: f1
        value: 0.9811
---

# gLM2 LoRA adapter for TATA promoter recognition

This model demonstrates the use of [gLM2_150M](https://huggingface.co/tattabio/gLM2_150M) embeddings for downstream classification.
It is fine-tuned with LoRA and achieves an F1 score of 98.11% on the TATA promoter task from the [Nucleotide Transformer benchmarks](https://huggingface.co/datasets/InstaDeepAI/nucleotide_transformer_downstream_tasks_revised).
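
LoRA fine-tuning updates only a small set of low-rank weight matrices injected into the frozen base model. As a rough illustration of how such an adapter is configured with `peft`, the sketch below builds a `LoraConfig` for sequence classification; the rank, alpha, dropout, and target module names are assumptions for illustration, not the settings used to train this adapter.

```python
from peft import LoraConfig

# Illustrative LoRA configuration for a sequence-classification fine-tune.
# These hyperparameters and target module names are assumptions, not the
# values actually used to train this adapter.
lora_config = LoraConfig(
    task_type="SEQ_CLS",
    r=16,
    lora_alpha=32,
    lora_dropout=0.1,
    target_modules=["q_proj", "v_proj"],  # hypothetical attention projection names
)
print(lora_config)
```

A config like this would be passed to `peft.get_peft_model` together with the base classifier before training.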

## How to Get Started with the Model

Use the code below to load the model for inference:

```python
import torch
from peft import PeftModel
from transformers import AutoConfig, AutoModel, AutoModelForSequenceClassification

glm2 = "tattabio/gLM2_150M"
adapter = "alejandralopezsosa/gLM2_150M-promoter_tata-lora"

load_kwargs = {
    'trust_remote_code': True,
    'torch_dtype': torch.bfloat16,
}

# Build the classification wrapper from the adapter's config, then load the
# pretrained gLM2 backbone into it.
config = AutoConfig.from_pretrained(adapter, **load_kwargs)
base_model = AutoModelForSequenceClassification.from_config(config, **load_kwargs)
base_model.glm2 = AutoModel.from_pretrained(glm2, **load_kwargs)

# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(base_model, adapter)
```
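
Continuing from the snippet above, the loaded model can then be queried like any sequence-classification model. This is a minimal inference sketch; the tokenizer loading and the input formatting (e.g. the strand-token prefix) are assumptions here, so check the [gLM2_150M](https://huggingface.co/tattabio/gLM2_150M) model card for the exact sequence convention.

```python
from transformers import AutoTokenizer

# Tokenizer and input formatting are assumptions -- see the gLM2_150M card.
tokenizer = AutoTokenizer.from_pretrained(glm2, trust_remote_code=True)

sequence = "<+>tatataaagcgcgcgc"  # hypothetical DNA fragment with a strand token
inputs = tokenizer(sequence, return_tensors="pt")

model.eval()
with torch.no_grad():
    logits = model(**inputs).logits

predicted_class = logits.argmax(dim=-1).item()  # assumed mapping: 1 = TATA promoter
```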