Spanish to Quechua translator
This model is a fine-tuned version of t5-small.
Model description
t5-small-finetuned-spanish-to-quechua was trained for 46 epochs on 102 747 sentences. Validation used 12 844 sentences, and 12 843 sentences were held out for testing.
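As a quick sanity check on the figures above, the short sketch below computes each split's share of the data. The variable names are illustrative only; the counts are the ones reported in this card.

```python
# Sentence counts as reported in the model description.
splits = {"train": 102_747, "validation": 12_844, "test": 12_843}

total = sum(splits.values())

# Share of each split as a percentage of all sentences.
shares = {name: 100 * count / total for name, count in splits.items()}

for name, share in shares.items():
    print(f"{name}: {splits[name]} sentences ({share:.1f}%)")
```

The counts work out to roughly an 80/10/10 train/validation/test split.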
Intended uses & limitations
A large part of the dataset was extracted from biblical texts, so the model performs better on sentences that resemble those texts in style and vocabulary.
How to use
You can import this model as follows:
>>> from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
>>> model_name = 'hackathon-pln-es/t5-small-finetuned-spanish-to-quechua'
>>> model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
>>> tokenizer = AutoTokenizer.from_pretrained(model_name)
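To translate a sentence:
>>> sentence = "Entonces dijo"
>>> # Use `inputs` rather than `input` to avoid shadowing the Python built-in
>>> inputs = tokenizer(sentence, return_tensors="pt")
>>> output = model.generate(inputs["input_ids"], max_length=40, num_beams=4, early_stopping=True)
>>> # skip_special_tokens=True drops markers such as <pad> and </s> from the decoded text
>>> translation = tokenizer.decode(output[0], skip_special_tokens=True)
>>> print('Original Sentence: {} \nTranslated sentence: {}'.format(sentence, translation))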
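To translate several sentences at once, the same calls can be wrapped in a small helper. This is a minimal sketch, not part of the original card: `load_translator` and `translate_batch` are hypothetical helper names, and the generation settings simply mirror the single-sentence example above.

```python
def load_translator(model_name="hackathon-pln-es/t5-small-finetuned-spanish-to-quechua"):
    """Load the model and tokenizer (hypothetical helper, not from the card)."""
    # Imported here so the translation helper below has no import-time dependency.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    return model, tokenizer


def translate_batch(sentences, model, tokenizer, max_length=40, num_beams=4):
    """Translate a list of Spanish sentences with a single generate() call."""
    # padding=True pads every sentence to the longest one in the batch;
    # the attention mask tells the model which positions are padding.
    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    outputs = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_length=max_length,
        num_beams=num_beams,
        early_stopping=True,
    )
    # Decode all sequences at once and strip special tokens such as <pad>.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

A call such as `model, tokenizer = load_translator()` followed by `translate_batch(["Entonces dijo"], model, tokenizer)` returns a list with one translated string per input sentence.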
Limitations and bias
Currently, this model can only translate into Ayacucho Quechua.
Training data
To train this model, we used the Spanish to Quechua dataset.
Evaluation results
We obtained the following metrics during the training process:
- eval_bleu = 2.9691
- eval_loss = 1.2064628601074219
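For readers who prefer perplexity to raw loss, the conversion is a single exponential. The sketch below assumes that `eval_loss` is the mean per-token cross-entropy in nats, which is the usual convention for seq2seq training with the transformers library but is not stated explicitly in this card.

```python
import math

# Evaluation loss as reported above.
# Assumption: mean per-token cross-entropy, measured in nats.
eval_loss = 1.2064628601074219

# Perplexity is the exponential of the mean cross-entropy.
perplexity = math.exp(eval_loss)

print(f"eval perplexity: {perplexity:.2f}")
```

Under that assumption the perplexity comes out at roughly 3.3. It describes how well the model predicts reference tokens and is not directly comparable to the BLEU score.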
Team members