metadata

languages:
  - en
license:
  - cc-by-nc-sa-4.0
  - apache-2.0
tags:
  - grammar
  - spelling
  - punctuation
  - error-correction
  - grammar synthesis
  - FLAN
  - C4
datasets:
  - C4
widget:
  - text: >-
      Me go to the store yesterday and buy many thing. I saw a big dog but he no
      bark at me. Then I walk home and eat my lunch, it was delicious sandwich.
      After that, I watch TV and see a funny show about cat who can talk. I
      laugh so hard I cry. Then I go to bed but I no can sleep because I too
      excited about the cat show.
    example_title: Long-Text
  - text: >-
      Me and my family go on a trip to the mountains last week. We drive for
      many hours and finally reach our cabin. The cabin was cozy and warm, with
      a fireplace and big windows. We spend our days hiking and exploring the
      forest. At night, we sit by the fire and tell story. It was a wonderful
      vacation.
    example_title: Long-Text
  - text: >-
      so em if we have an now so with fito ringina know how to estimate the tren
      given the ereafte mylite trend we can also em an estimate is nod s i again
      tort watfettering an we have estimated the trend an called wot to be
      called sthat of exty right now we can and look at wy this should not hare
      a trend i becan we just remove the trend an and we can we now estimate
      tesees ona effect of them exty
    example_title: Transcribed Audio Example
  - text: >-
      My coworker said he used a financial planner to help choose his stocks so
      he wouldn't loose money.
    example_title: incorrect word choice
  - text: >-
      good so hve on an tadley i'm not able to make it to the exla session on
      monday this week e which is why i am e recording pre recording an this
      excelleision and so to day i want e to talk about two things and first of
      all em i wont em wene give a summary er about ta ohow to remove trents in
      these nalitives from time series
    example_title: lowercased audio transcription output
parameters:
  max_length: 128
  min_length: 4
  num_beams: 8
  repetition_penalty: 1.21
  length_penalty: 1
  early_stopping: true

Grammar-Synthesis-Enhanced: FLAN-t5

A fine-tuned version of google/flan-t5-large for grammar correction, trained on an expanded version of the JFLEG dataset and then further fine-tuned on the C4 200M dataset. A demo is available on HF Spaces.

Example

Compare vs. the original grammar-synthesis-large.

Usage in Python

A Colab notebook already implements this basic version (click the Open in Colab button).

After running pip install transformers, run the following code:

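from transformers import pipeline

# Load the fine-tuned grammar-correction model as a text2text pipeline.
corrector = pipeline(
    'text2text-generation',
    'farelzii/GEC_Test_v1',
)

raw_text = 'i can has cheezburger'
# The pipeline returns a list of dicts, one per input,
# each with a 'generated_text' key.
results = corrector(raw_text)
print(results)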
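
The parameters block in the metadata lists the generation settings used by the hosted widget. A plain pipeline call does not necessarily apply them, so the sketch below passes them explicitly as generation keyword arguments. The names GENERATION_KWARGS, load_corrector, and correct are hypothetical helpers introduced here for illustration, not part of the model or of transformers, and the sketch assumes the pipeline forwards these keyword arguments to generation.

```python
# Generation settings copied from the "parameters" block of the metadata.
GENERATION_KWARGS = {
    "max_length": 128,
    "min_length": 4,
    "num_beams": 8,
    "repetition_penalty": 1.21,
    "length_penalty": 1.0,
    "early_stopping": True,
}


def load_corrector(model_name="farelzii/GEC_Test_v1"):
    """Build the text2text pipeline (hypothetical helper)."""
    # Imported lazily so the rest of this module works without transformers.
    from transformers import pipeline

    return pipeline("text2text-generation", model_name)


def correct(text, corrector, **overrides):
    """Correct one string with the settings above (hypothetical helper).

    `corrector` is any callable with the pipeline's call shape:
    it takes the text plus keyword arguments and returns a list of
    dicts containing a 'generated_text' key.
    """
    kwargs = {**GENERATION_KWARGS, **overrides}
    results = corrector(text, **kwargs)
    return results[0]["generated_text"]


# Example usage (downloads the model on first run):
# corrector = load_corrector()
# print(correct("i can has cheezburger", corrector))
```

Individual settings can be overridden per call, for example correct(text, corrector, num_beams=4) to trade some quality for speed. Inputs whose corrected form would exceed max_length tokens may be cut off, so longer texts are probably best split into smaller pieces first.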