Iñigo López-Riobóo Botana committed on
Commit · 10bd8d0
Parent(s): 6d0a43c
Update README.md
README.md CHANGED

---
license: cc-by-nc-nd-4.0
language:
- es
pipeline_tag: text-generation
---

# DialoGPT-medium-spanish-chitchat

## Description

This is a **transformer-decoder** [GPT-2 model](https://huggingface.co/gpt2), adapted for **single-turn dialogue tasks**. We fine-tuned a [DialoGPT-medium](https://huggingface.co/microsoft/DialoGPT-medium) model from Microsoft, following the CLM (Causal Language Modelling) objective.

We used one of the datasets available in the [Bot Framework Tools repository](https://github.com/microsoft/botframework-cli). We processed [the professional-styled personality chat dataset in Spanish](https://github.com/microsoft/botframework-cli/blob/main/packages/qnamaker/docs/chit-chat-dataset.md); the file is available [here](https://qnamakerstore.blob.core.windows.net/qnamakerdata/editorial/spanish/qna_chitchat_professional.tsv).
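
To give an idea of how that TSV can be turned into single-turn CLM training text, here is a minimal preprocessing sketch. It assumes the export has tab-separated `Question` and `Answer` columns, which is an assumption about the QnA Maker chit-chat format; check the actual file header before using it.

```python
import csv

EOS = "<|endoftext|>"  # GPT-2 / DialoGPT end-of-text token

examples = []
with open("qna_chitchat_professional.tsv", encoding="utf-8") as f:
    # "Question" and "Answer" are assumed column names; verify against the file.
    for row in csv.DictReader(f, delimiter="\t"):
        # One example per question-answer pair, each turn terminated with EOS,
        # matching the single-turn dialogue format used at inference time
        examples.append(row["Question"] + EOS + row["Answer"] + EOS)

with open("train.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(examples))
```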

---

### Example inference script

Check this example script to run the model in inference mode:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CHAT_TURNS = 5
MAX_LENGTH = 1000

model = AutoModelForCausalLM.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')
tokenizer = AutoTokenizer.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)

for i in range(CHAT_TURNS):
    user_input = input(f"Step - {i} >> user prompt -> ")
    with torch.no_grad():
        # User turn, where "user_input" is the question (single-turn dialogue task)
        user_inputs_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors="pt")
        user_inputs_ids = user_inputs_ids.to(device)
        # The chat history adds the generated tokens for the answer
        chat_history = model.generate(user_inputs_ids, max_length=MAX_LENGTH, pad_token_id=tokenizer.eos_token_id)
        # Decode just the newly generated output tokens (do not include the user prompt again)
        step_model_answer = tokenizer.decode(chat_history[:, user_inputs_ids.shape[-1]:][0], skip_special_tokens=True)
        print(f"Step - {i} >> DialoGPT-spanish model answer -> {step_model_answer}")
```
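
The script above uses greedy decoding in `model.generate`. If you want more varied answers, you can pass standard `transformers` sampling arguments; the values below are illustrative, not tuned for this model:

```python
# Optional sampling-based decoding; a drop-in replacement for the generate() call above.
chat_history = model.generate(
    user_inputs_ids,
    max_length=MAX_LENGTH,
    pad_token_id=tokenizer.eos_token_id,
    do_sample=True,   # sample from the distribution instead of greedy decoding
    top_k=50,         # consider only the 50 most likely next tokens
    top_p=0.95,       # nucleus sampling over the top 95% of probability mass
    temperature=0.8,  # values below 1.0 make sampling more conservative
)
```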

### Fine-tuning on a different dataset

For fine-tuning this model, you can start from the DialoGPT base model.