---
datasets:
- pubmed
language:
- en
tags:
- BERT
---
# Model Card for MDDDDR/bert_large_uncased_NER

base_model : [google-bert/bert-large-uncased](https://huggingface.co/google-bert/bert-large-uncased)

hidden_size : 1024

max_position_embeddings : 512

num_attention_heads : 16

num_hidden_layers : 24

vocab_size : 30522

# Basic usage

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import numpy as np

# map predicted label ids to tag names
id2tag = {0:'O', 1:'B_MT', 2:'I_MT'}

# load model & tokenizer
MODEL_NAME = 'MDDDDR/bert_large_uncased_NER'

model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME)
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

# prepare input
text = 'mental disorder can also contribute to the development of diabetes through various mechanism including increased stress, poor self care behavior, and adverse effect on glucose metabolism.'
tokenized = tokenizer(text, return_tensors='pt')

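# forward pass
output = model(**tokenized)

# predicted label id per token; [1:-1] drops the [CLS] and [SEP] positions
pred = np.argmax(output[0].cpu().detach().numpy(), axis=2)[0][1:-1]

# print each WordPiece token next to its predicted tag
for token, tag_id in zip(tokenizer.tokenize(text), pred):
    print("{}\t{}".format(id2tag[int(tag_id)], token))
# example output (first two lines):
# B_MT mental
# B_MT disorder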
```
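
The loop above prints one tag per WordPiece token. To read the result as entities, the tokens usually need to be merged back into spans. Below is a minimal sketch of one way to do that, assuming the usual B/I/O reading of the tags (`B_MT` opens a span, `I_MT` continues it, `O` closes it) and assuming that a `##` continuation piece belongs to the span of the word it continues. The helper `group_entities` is hypothetical and not part of this model or of `transformers`, and the tokens and tags in the example are hand-written for illustration, not real model output.

```python
from typing import List, Tuple


def group_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, int, int]]:
    """Merge tagged WordPiece tokens into (text, start, end) spans.

    start and end are token indices, with end exclusive.
    """
    spans: List[Tuple[str, int, int]] = []
    pieces: List[str] = []
    start = None

    def flush(end: int) -> None:
        # Close the currently open span, if any, and reset the buffer.
        nonlocal pieces, start
        if pieces:
            text = " ".join(pieces).replace(" ##", "")  # glue subword pieces back together
            spans.append((text, start, end))
        pieces, start = [], None

    for i, (tok, tag) in enumerate(zip(tokens, tags)):
        if tok.startswith("##") and pieces:
            pieces.append(tok)        # subword of a word already inside the span
        elif tag == "B_MT":
            flush(i)                  # a new B_MT always starts a new span
            pieces, start = [tok], i
        elif tag == "I_MT" and pieces:
            pieces.append(tok)        # continue the open span
        elif tag == "I_MT":
            pieces, start = [tok], i  # stray I_MT: treated as a span start (assumption)
        else:
            flush(i)                  # an O tag closes any open span
    flush(len(tokens))
    return spans


# Hand-written illustrative input (not real model output).
tokens = ["mental", "disorder", "can", "contribute", "to", "dia", "##betes"]
tags = ["B_MT", "I_MT", "O", "O", "O", "B_MT", "I_MT"]
print(group_entities(tokens, tags))
# [('mental disorder', 0, 2), ('diabetes', 5, 7)]
```

With the snippet above, the inputs would be `tokenizer.tokenize(text)` and `[id2tag[int(i)] for i in pred]`.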

## Framework versions
- transformers : 4.39.1
- torch : 2.1.0+cu121
- datasets : 2.18.0
- tokenizers : 0.15.2
- numpy : 1.20.0
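
To check a local environment against the versions listed above, something like the following sketch can be used. It relies only on the standard-library `importlib.metadata`. The `PINNED` dictionary and the `compare_versions` helper are hypothetical names introduced for this example, and a mismatch does not necessarily mean the model will fail to load.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above.
PINNED = {
    "transformers": "4.39.1",
    "torch": "2.1.0+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
    "numpy": "1.20.0",
}


def compare_versions(pinned):
    """Return {package: (expected, installed_or_None, matches)} for each pinned package."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (expected, installed, installed == expected)
    return report


for name, (expected, installed, ok) in compare_versions(PINNED).items():
    status = "ok" if ok else "differs"
    print("{}\texpected {}\tinstalled {}\t{}".format(name, expected, installed, status))
```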