This is a fine-tuned version of multilingual BART (mBART, 610M parameters), trained for English Grammatical Error Correction on the public FCE dataset.
To initialize the model:
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast
model = MBartForConditionalGeneration.from_pretrained("MRNH/mbart-english-grammar-corrector")
Use the tokenizer:
tokenizer = MBart50TokenizerFast.from_pretrained("MRNH/mbart-english-grammar-corrector", src_lang="en_XX", tgt_lang="en_XX")
input = tokenizer("I was here yesterday to studying",
text_target="I was here yesterday to study", return_tensors='pt')
To generate text using the model:
output = model.generate(input["input_ids"], attention_mask=input["attention_mask"],
forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"])
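The generated output is a tensor of token ids, so it still has to be decoded back into text. The following is a minimal sketch of a helper that wraps tokenization, generation and decoding. The function name `correct_sentences` and the `max_new_tokens` default are illustrative assumptions, not part of the model card, and the language id is looked up with `convert_tokens_to_ids("en_XX")`, which should be equivalent to the `lang_code_to_id` lookup used above.

```python
def correct_sentences(sentences, model, tokenizer, max_new_tokens=64):
    """Return corrected text for each sentence (hypothetical helper).

    `model` and `tokenizer` are expected to be the objects loaded earlier
    with from_pretrained(...); they are passed in rather than loaded here.
    """
    # Tokenize the whole batch; padding aligns sentences of different length.
    batch = tokenizer(sentences, return_tensors="pt", padding=True)

    # Force the decoder to start with the English language token.
    en_id = tokenizer.convert_tokens_to_ids("en_XX")

    output_ids = model.generate(
        batch["input_ids"],
        attention_mask=batch["attention_mask"],
        forced_bos_token_id=en_id,
        max_new_tokens=max_new_tokens,  # assumed cap, tune for longer inputs
    )

    # Turn token ids back into strings, dropping language and padding tokens.
    return tokenizer.batch_decode(output_ids, skip_special_tokens=True)
```

With the model and tokenizer initialized as above, `correct_sentences(["I was here yesterday to studying"], model, tokenizer)` returns a list with one string per input sentence.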
Training uses the loss that the model returns when labels are passed to the forward pass. The output h holds both the logits and the loss:
h = model(input_ids=input["input_ids"],
attention_mask=input["attention_mask"],
labels=input["labels"])
logits, loss = h.logits, h.loss
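For intuition about what the returned loss measures, here is a dependency-free sketch of token-level cross-entropy between logits and labels for a single sequence. It assumes the usual convention that positions labelled `-100` are skipped and that the loss is averaged over the remaining tokens. The function name `token_cross_entropy` and the toy numbers are illustrative only; this is not the library's implementation.

```python
import math

IGNORE_INDEX = -100  # label value conventionally excluded from the loss


def token_cross_entropy(logits, labels, ignore_index=IGNORE_INDEX):
    """Mean negative log-likelihood of `labels` under softmax(`logits`).

    logits: list of per-token score lists, one score per vocabulary entry.
    labels: list of target vocabulary indices, one per token.
    """
    total, count = 0.0, 0
    for row, label in zip(logits, labels):
        if label == ignore_index:
            continue  # skipped positions contribute nothing
        # Numerically stable log-sum-exp for the softmax normalizer.
        m = max(row)
        log_norm = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_norm - row[label]
        count += 1
    return total / count if count else float("nan")


# Toy example: three token positions over a vocabulary of size 3.
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
labels = [0, 2, IGNORE_INDEX]  # the last position is ignored
loss = token_cross_entropy(logits, labels)
print(loss)
```

Minimizing this quantity pushes probability mass toward the corrected target tokens, which is what fine-tuning on pairs such as "to studying" and "to study" relies on.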