---
languages:
- en
license:
- cc-by-nc-sa-4.0
- apache-2.0
tags:
- grammar
- spelling
- punctuation
- error-correction
- grammar synthesis
- FLAN
- C4
datasets:
- C4
widget:
- text: "Me go to the store yesterday and buy many thing. I saw a big dog but he no bark at me. Then I walk home and eat my lunch, it was delicious sandwich. After that, I watch TV and see a funny show about cat who can talk. I laugh so hard I cry. Then I go to bed but I no can sleep because I too excited about the cat show."
  example_title: "Long-Text"
- text: "Me and my family go on a trip to the mountains last week. We drive for many hours and finally reach our cabin. The cabin was cozy and warm, with a fireplace and big windows. We spend our days hiking and exploring the forest. At night, we sit by the fire and tell story. It was a wonderful vacation."
  example_title: "Long-Text"
- text: "so em if we have an now so with fito ringina know how to estimate the tren given the ereafte mylite trend we can also em an estimate is nod s i again tort watfettering an we have estimated the trend an called wot to be called sthat of exty right now we can and look at wy this should not hare a trend i becan we just remove the trend an and we can we now estimate tesees ona effect of them exty"
  example_title: "Transcribed Audio Example"
- text: "My coworker said he used a financial planner to help choose his stocks so he wouldn't loose money."
  example_title: "incorrect word choice"
- text: "good so hve on an tadley i'm not able to make it to the exla session on monday this week e which is why i am e recording pre recording an this excelleision and so to day i want e to talk about two things and first of all em i wont em wene give a summary er about ta ohow to remove trents in these nalitives from time series"
  example_title: "lowercased audio transcription output"
parameters:
  max_length: 128
  min_length: 4
  num_beams: 8
  repetition_penalty: 1.21
  length_penalty: 1
  early_stopping: True
---
# Grammar-Synthesis-Enhanced: FLAN-t5
<a href="https://colab.research.google.com/gist/pszemraj/5dc89199a631a9c6cfd7e386011452a0/demo-flan-t5-large-grammar-synthesis.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

This model is a fine-tuned version of [google/flan-t5-large](https://huggingface.co/google/flan-t5-large) for grammar correction. It was trained on an expanded version of the [JFLEG](https://paperswithcode.com/dataset/jfleg) dataset and then further fine-tuned on the [C4 200M](https://www.tensorflow.org/datasets/community_catalog/huggingface/c4) dataset. A [demo](https://huggingface.co/spaces/pszemraj/FLAN-grammar-correction) is available on HF Spaces.
## Example
![example](https://i.imgur.com/PIhrc7E.png)
Compare the output with that of the original [grammar-synthesis-large](https://huggingface.co/pszemraj/grammar-synthesis-large).

---
## Usage in Python
> A Colab notebook already implements this basic version (_click the Open in Colab button_).

After running `pip install transformers`, run the following code:
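```python
from transformers import pipeline

# Load the grammar-correction model as a text2text-generation pipeline.
corrector = pipeline(
    'text2text-generation',
    'farelzii/GEC_Test_v1',
)

raw_text = 'i can has cheezburger'

# The pipeline returns a list of dicts, each with a 'generated_text' key.
results = corrector(raw_text)
print(results)
```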
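
The basic call above relies on the pipeline's default decoding settings. As a minimal sketch, the decoding settings listed under `parameters` in this card's front matter can be passed explicitly at call time. `GENERATION_KWARGS` and `correct_text` are hypothetical names introduced here for illustration; they are not part of `transformers`, and the sketch assumes the pipeline forwards these keyword arguments to generation.

```python
from typing import Any, Callable

# Decoding settings copied from the `parameters` block in this card's front matter.
GENERATION_KWARGS = {
    "max_length": 128,
    "min_length": 4,
    "num_beams": 8,
    "repetition_penalty": 1.21,
    "length_penalty": 1,
    "early_stopping": True,
}


def correct_text(corrector: Callable[..., Any], text: str, **overrides: Any) -> str:
    """Run `corrector` on `text` and return only the corrected string.

    Hypothetical helper: `corrector` is expected to behave like a
    text2text-generation pipeline, i.e. return a list of dicts that
    each carry a "generated_text" key.
    """
    # Per-call overrides take precedence over the card's defaults.
    kwargs = {**GENERATION_KWARGS, **overrides}
    results = corrector(text, **kwargs)
    return results[0]["generated_text"]


# Usage with a real pipeline (downloads the model on first run):
# from transformers import pipeline
# corrector = pipeline("text2text-generation", "farelzii/GEC_Test_v1")
# print(correct_text(corrector, "i can has cheezburger"))
```

Keeping the settings in one dictionary makes it easy to override a single value per call, for example `correct_text(corrector, raw_text, num_beams=4)` to trade some output quality for speed.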