Íñigo López-Riobóo Botana committed
Commit · d1b62aa
1 Parent(s): cf9337c
Update README.md
README.md CHANGED
@@ -9,8 +9,8 @@ pipeline_tag: text-generation

## Description

-This is a **transformer-decoder** [
-We used one of the datasets available in the [Bot Framework Tools repository](https://github.com/microsoft/botframework-cli). We processed [the professional-styled personality chat dataset in Spanish](https://github.com/microsoft/botframework-cli/blob/main/packages/qnamaker/docs/chit-chat-dataset.md), the file is available [here](https://qnamakerstore.blob.core.windows.net/qnamakerdata/editorial/spanish/qna_chitchat_professional.tsv)
+This is a **transformer-decoder** [GPT-2 model](https://huggingface.co/gpt2), adapted for **single-turn dialogue tasks in Spanish**. We fine-tuned a [DialoGPT-medium](https://huggingface.co/microsoft/DialoGPT-medium) 345M parameters model from Microsoft, following the CLM (Causal Language Modelling) objective.
+We used one of the datasets available in the [Bot Framework Tools repository](https://github.com/microsoft/botframework-cli). We processed [the professional-styled personality chat dataset in Spanish](https://github.com/microsoft/botframework-cli/blob/main/packages/qnamaker/docs/chit-chat-dataset.md), the file is available [here to download](https://qnamakerstore.blob.core.windows.net/qnamakerdata/editorial/spanish/qna_chitchat_professional.tsv)

---

@@ -23,7 +23,7 @@ import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CHAT_TURNS = 5
-MAX_LENGTH=1000
+MAX_LENGTH = 1000

model = AutoModelForCausalLM.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')
tokenizer = AutoTokenizer.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')
@@ -40,9 +40,7 @@ for i in range(CHAT_TURNS):
    # decode just the last generated output tokens from the model (do not include the user prompt again)
    step_model_answer = tokenizer.decode(chat_history[:, user_inputs_ids.shape[-1]:][0], skip_special_tokens=True)
    print(f"Step - {i} >> DialoGPT-spanish model answer -> {step_model_answer}")
-
```
-
---

## Examples
@@ -68,13 +66,13 @@ for i in range(CHAT_TURNS):

---

-## Fine-tuning in different dataset
+## Fine-tuning in a different dataset or style

If you want to fine-tune this model, we recommend you to start from the [DialoGPT model](https://huggingface.co/microsoft/DialoGPT-medium).
You can check the [original GitHub repository](https://github.com/microsoft/DialoGPT).

## Limitations

-- This model is intended to be used **just for single-turn chitchat conversations in Spanish
-- This model's generation capabilities are limited to the extent of the aforementioned fine-tuning dataset
-- This model generates short answers, providing general context dialogue in a professional style
+- This model is intended to be used **just for single-turn chitchat conversations in Spanish**.
+- This model's generation capabilities are limited to the extent of the aforementioned fine-tuning dataset.
+- This model generates short answers, providing general context dialogue in a professional style.
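
The hunks above show only the changed fragments of the README's Python example. For orientation, here is a self-contained sketch of how those fragments fit together into a single-turn chat loop; the prompt handling and the `generate()` arguments are illustrative assumptions, not the README's verbatim code.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

CHAT_TURNS = 5
MAX_LENGTH = 1000

model = AutoModelForCausalLM.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')
tokenizer = AutoTokenizer.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')

for i in range(CHAT_TURNS):
    # read one user prompt and append the end-of-sequence token expected by DialoGPT
    user_prompt = input(">> User: ")
    user_inputs_ids = tokenizer.encode(user_prompt + tokenizer.eos_token, return_tensors="pt")
    # single-turn generation: the prompt is not accumulated across iterations
    chat_history = model.generate(user_inputs_ids,
                                  max_length=MAX_LENGTH,
                                  pad_token_id=tokenizer.eos_token_id)
    # decode just the last generated output tokens from the model (do not include the user prompt again)
    step_model_answer = tokenizer.decode(chat_history[:, user_inputs_ids.shape[-1]:][0],
                                         skip_special_tokens=True)
    print(f"Step - {i} >> DialoGPT-spanish model answer -> {step_model_answer}")
```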
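
The renamed fine-tuning section only points at the upstream [DialoGPT-medium](https://huggingface.co/microsoft/DialoGPT-medium) checkpoint and its [GitHub repository](https://github.com/microsoft/DialoGPT). A minimal causal-language-modelling fine-tuning sketch with the `transformers` Trainer could look like the following; the TSV column names (`Question`, `Answer`), the dialogue concatenation scheme, and the hyperparameters are assumptions for illustration, not the recipe used to train this model.

```python
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

BASE_MODEL = "microsoft/DialoGPT-medium"

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers have no pad token by default
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Hypothetical TSV with Question/Answer columns, e.g. the chit-chat file linked in the Description
dataset = load_dataset("csv", data_files={"train": "qna_chitchat_professional.tsv"}, delimiter="\t")

def build_dialogue(example):
    # concatenate one user turn and one bot turn, each terminated by the end-of-sequence token
    text = example["Question"] + tokenizer.eos_token + example["Answer"] + tokenizer.eos_token
    return tokenizer(text, truncation=True, max_length=256)

tokenized = dataset["train"].map(build_dialogue, remove_columns=dataset["train"].column_names)

# mlm=False -> plain causal language modelling (next-token prediction)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

args = TrainingArguments(
    output_dir="dialogpt-medium-finetuned",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    learning_rate=5e-5,
)

Trainer(model=model, args=args, train_dataset=tokenized, data_collator=collator).train()
```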