Commit 544b646 · 1 Parent(s): 6e32a9f
Committed by Skolkovo Institute of Science and Technology

Update README.md

README.md CHANGED
@@ -16,7 +16,7 @@ In this task, the model gets the string with text with the error and the exact s
## Model training details
-
The data was provided in the following way
@@ -34,7 +34,7 @@ I want to stop smoking during driving bicycle . 23:29 A <gerund> does not normal
Grammar terms are highlighted with '< ... >' marks, and word examples with '<< ... >>'.
-
We lowercased the text, separated words from all punctuation (including the task-specific << >> marks), and explicitly marked the error in the original text with << >>.
@@ -44,6 +44,11 @@ the smoke < < flow > > < < my > > face . 10:17 When the < verb > < < flow > > is
i want to stop smoking < < during > > driving bicycle . 23:29 a < gerund > does not normally follow the < preposition > < < during > > . think of an expression using the < conjunction > ' while ' instead of a < preposition > .
```
## How to use
@@ -86,4 +91,14 @@ def paraphrase(text, model, temperature=1.0, beams=3):
# expected output: ["a gerund > does not normally follow the preposition > during > >. think of an expression using the conjunction >'while'instead of a preposition >."]
-```
## Model training details
+#### Data
The data was provided in the following way
Grammar terms are highlighted with '< ... >' marks, and word examples with '<< ... >>'.
+#### Data preprocessing
We lowercased the text, separated words from all punctuation (including the task-specific << >> marks), and explicitly marked the error in the original text with << >>.
i want to stop smoking < < during > > driving bicycle . 23:29 a < gerund > does not normally follow the < preposition > < < during > > . think of an expression using the < conjunction > ' while ' instead of a < preposition > .
```
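The preprocessing described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual script: the helper names `mark_error` and `preprocess` are hypothetical, the regular expression is an assumption about how punctuation was separated, and the error span is assumed to be a half-open character range such as `23:29`.

```python
import re


def mark_error(text: str, start: int, end: int) -> str:
    """Wrap the error span text[start:end] in << >> marks (hypothetical helper)."""
    return f"{text[:start]}<<{text[start:end]}>>{text[end:]}"


def preprocess(text: str) -> str:
    """Lowercase the text and put spaces around every punctuation character."""
    # Assumption: "punctuation" means any character that is neither a word
    # character nor whitespace, which also splits the << >> marks apart.
    spaced = re.sub(r"([^\w\s])", r" \1 ", text.lower())
    # Collapse the extra whitespace introduced by the substitution.
    return " ".join(spaced.split())


sentence = "I want to stop smoking during driving bicycle ."
print(preprocess(mark_error(sentence, 23, 29)))
# i want to stop smoking < < during > > driving bicycle .
```

In this sketch the sentence and the feedback comment would be passed through `preprocess` separately, so that the `23:29` span itself is left untouched.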
+#### Data augmentation
+
+The main feature of our training pipeline was data augmentation. The idea is as follows: we cut each text containing an error after the last word that is syntactically connected to the words inside the error span (syntactic dependencies were parsed automatically with spaCy), and used this truncated text as a prompt for a language model (we used [GPT-Neo 1.3B](https://huggingface.co/EleutherAI/gpt-neo-1.3B)).
+
+We fine-tuned [t5-large](https://huggingface.co/t5-large) on both the original and the augmented data.
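The cutting step of the augmentation can be sketched as follows. This reflects only one reading of "syntactically connected" (a direct head or dependent relation with a word in the error span), and the function names are hypothetical. `heads_from_spacy` shows how the head indices could be obtained with spaCy, assuming the `en_core_web_sm` model, which the README does not name. The toy parse at the end is hand-written for illustration and is not real parser output.

```python
from typing import List, Tuple


def cut_after_last_connected(
    tokens: List[str], heads: List[int], span: Tuple[int, int]
) -> List[str]:
    """Keep tokens up to the last one directly linked to the error span.

    heads[i] is the index of the syntactic head of token i (the root points
    to itself, as in spaCy); span is a half-open token range for the error.
    """
    start, end = span
    in_span = set(range(start, end))
    last = end - 1  # never cut inside the error span itself
    for i, head in enumerate(heads):
        # A dependency arc touches the span if either end lies inside it.
        if i in in_span or head in in_span:
            last = max(last, i, head)
    return tokens[: last + 1]


def heads_from_spacy(text: str, model: str = "en_core_web_sm"):
    """Tokenize and parse with spaCy (the model name is an assumption)."""
    import spacy  # imported lazily so the sketch above runs without spaCy

    doc = spacy.load(model)(text)
    return [t.text for t in doc], [t.head.i for t in doc]


# Hand-written toy parse (not real parser output); the error span is "during".
tokens = ["i", "want", "to", "stop", "smoking", "during", "driving", "bicycle", "."]
heads = [1, 1, 3, 1, 3, 4, 5, 6, 1]
prompt = " ".join(cut_after_last_connected(tokens, heads, (5, 6)))
print(prompt)  # i want to stop smoking during driving
```

The truncated text would then serve as the prompt for the language model; the generation step itself is not shown here.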
## How to use
# expected output: ["a gerund > does not normally follow the preposition > during > >. think of an expression using the conjunction >'while'instead of a preposition >."]
+```
+
+
+## Licensing Information
+
+This work is licensed under a [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License][cc-by-nc-sa].
+
+[![CC BY-NC-SA 4.0][cc-by-nc-sa-image]][cc-by-nc-sa]
+
+[cc-by-nc-sa]: http://creativecommons.org/licenses/by-nc-sa/4.0/
+
+[cc-by-nc-sa-image]: https://i.creativecommons.org/l/by-nc-sa/4.0/88x31.png