Iñigo López-Riobóo Botana
metadata
license: cc-by-nc-nd-4.0
language:
  - es
pipeline_tag: text-generation
tags:
  - dialogue
  - conversational
  - gpt
  - gpt2
  - text-generation
inference: false

DialoGPT-medium-spanish-chitchat

Description

This is a transformer-decoder GPT-2 model adapted for single-turn dialogue tasks in Spanish. We fine-tuned Microsoft's DialoGPT-medium model (345M parameters) with the causal language modelling (CLM) objective. The training data comes from one of the datasets available in the Bot Framework Tools repository: the professional-styled personality chat dataset in Spanish, which we processed before fine-tuning. The file is available for download.
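
As a rough illustration of how a single-turn pair can be framed for causal language modelling, the sketch below joins a question and its answer into one training string, ending each turn with the GPT-2 end-of-text token as DialoGPT does. The helper name `build_training_text` is hypothetical, and the sample pair is borrowed from the examples in this card purely for illustration; the exact preprocessing used for this model is not documented here.

```python
EOS_TOKEN = "<|endoftext|>"  # GPT-2 / DialoGPT end-of-text token


def build_training_text(question: str, answer: str, eos_token: str = EOS_TOKEN) -> str:
    """Join one question/answer pair into a single CLM training string.

    Assumption: every turn is terminated by the EOS token, mirroring how the
    inference script appends tokenizer.eos_token to the user prompt.
    """
    return f"{question.strip()}{eos_token}{answer.strip()}{eos_token}"


# Illustrative pair only, not an actual record from the training file.
example = build_training_text("¿Qué tal estás?", "Fenomenal, gracias.")
print(example)
```

With this framing, the model learns to continue a question followed by EOS with the answer tokens, which is the same layout the inference script feeds to `generate`.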


Example inference script

The following example script runs the model in inference mode:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CHAT_TURNS = 5
MAX_LENGTH = 1000

model = AutoModelForCausalLM.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')
tokenizer = AutoTokenizer.from_pretrained('ITG/DialoGPT-medium-spanish-chitchat')
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device)
for i in range(CHAT_TURNS):
  user_input = input(f"Step - {i} >> user prompt ->")
  with torch.no_grad():
    # User turn, where "user_input" is the question (single-turn dialogue task)
    user_inputs_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors="pt")
    user_inputs_ids = user_inputs_ids.to(device)
    # generate() returns the prompt tokens followed by the generated answer tokens
    chat_history = model.generate(user_inputs_ids, max_length=MAX_LENGTH, pad_token_id=tokenizer.eos_token_id)
    # Decode only the newly generated tokens (do not include the user prompt again)
    step_model_answer = tokenizer.decode(chat_history[:, user_inputs_ids.shape[-1]:][0], skip_special_tokens=True)
  print(f"Step - {i} >> DialoGPT-spanish model answer -> {step_model_answer}")

Examples

Single-turn generation examples from an interactive environment

| Role | Response |
|------|----------|
| User | Hola, buenos dias |
| Bot | Buenos días |
| User | ¿Qué tal estás? |
| Bot | Fenomenal, gracias. |
| User | ¡Tengo hambre! |
| Bot | Quizá picar algo te ayude. |
| User | Estoy triste |
| Bot | Siento escuchar eso. |
| User | ¿Tienes personalidad? |
| Bot | Soy un bot. |
| User | ¡Cuenta un chiste! |
| Bot | En realidad no tengo tanta gracia. |
| User | Jajajajajajajaja |
| Bot | ¡Me alegro de que te rías! |

Fine-tuning hyper-parameters

| Hyper-parameter | Value |
|-----------------|-------|
| Validation partition (%) | 20% |
| Training batch size | 8 |
| Learning rate | 5e-4 |
| Max training epochs | 20 |
| Warmup training steps (%) | 6% |
| Weight decay | 0.01 |
| Optimiser (beta1, beta2, epsilon) | AdamW (0.9, 0.999, 1e-08) |
| Monitoring metric (delta, patience) | validation loss (0.1, 3) |
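
To make two of these settings more concrete, here is a minimal sketch of how a 6% warmup and a validation-loss monitor with delta 0.1 and patience 3 could be computed. It assumes the warmup percentage is taken over all optimiser steps and that delta is the minimum loss improvement that resets the patience counter; the card does not state either detail, and the names `warmup_steps` and `EarlyStopping`, the dataset size, and the loss values are all hypothetical.

```python
import math
from dataclasses import dataclass


def warmup_steps(num_train_examples: int, batch_size: int = 8,
                 epochs: int = 20, warmup_fraction: float = 0.06) -> int:
    """Warmup steps as a fraction of all optimiser steps (assumed interpretation)."""
    steps_per_epoch = math.ceil(num_train_examples / batch_size)
    return round(steps_per_epoch * epochs * warmup_fraction)


@dataclass
class EarlyStopping:
    """Stop once validation loss fails to improve by min_delta for `patience` epochs."""
    min_delta: float = 0.1
    patience: int = 3
    best_loss: float = math.inf
    bad_epochs: int = 0

    def should_stop(self, val_loss: float) -> bool:
        if val_loss < self.best_loss - self.min_delta:
            # Improvement large enough: remember it and reset the counter.
            self.best_loss = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Hypothetical training set of 1000 examples.
print(warmup_steps(1000))

# Made-up validation losses, one per epoch.
stopper = EarlyStopping()
stopped_at = None
for epoch, loss in enumerate([2.0, 1.5, 1.45, 1.44, 1.43], start=1):
    if stopper.should_stop(loss):
        stopped_at = epoch
        break
print(stopped_at)
```

Under these assumptions training can end well before the 20-epoch maximum, since improvements smaller than 0.1 count as stalled epochs.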

Fine-tuning on a different dataset or style

If you want to fine-tune your own dialogue model, we recommend starting from the DialoGPT model. See the original GitHub repository for details.
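
When preparing your own data, one early step is holding out a validation partition. The sketch below mirrors the 20% validation partition listed in the hyper-parameter table; the function name `train_validation_split`, the fixed seed, and the in-memory placeholder pairs are assumptions for illustration, not the procedure used for this model.

```python
import random


def train_validation_split(pairs, validation_fraction=0.2, seed=42):
    """Shuffle (question, answer) pairs and hold out a validation partition."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    n_val = round(len(shuffled) * validation_fraction)
    return shuffled[n_val:], shuffled[:n_val]


# Hypothetical placeholder data standing in for your own question/answer pairs.
pairs = [(f"pregunta {i}", f"respuesta {i}") for i in range(10)]
train_pairs, val_pairs = train_validation_split(pairs)
print(len(train_pairs), len(val_pairs))
```

Shuffling before the split avoids holding out only the tail of a file that may be grouped by topic.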

Limitations

  • This model is intended only for single-turn chitchat conversations in Spanish.
  • This model's generation capabilities are limited to the scope of the fine-tuning dataset described above.
  • This model generates short answers, providing general context dialogue in a professional style.