Update README.md
README.md
We used a collection of Natural Language Inference datasets as training data:
- [SNLI](https://nlp.stanford.edu/projects/snli/), automatically translated
- [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/), automatically translated
The whole dataset used is available [here](https://huggingface.co/datasets/hackathon-pln-es/nli-es).
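As a quick check, the corpus can be pulled directly from the Hub with the `datasets` library. This is a minimal sketch; the `train` split name is an assumption, so check the dataset card for the actual splits:

```python
from datasets import load_dataset

# Load the Spanish NLI corpus from the Hugging Face Hub.
# "train" is an assumed split name; see the dataset card for the actual splits.
nli_es = load_dataset("hackathon-pln-es/nli-es", split="train")
print(nli_es[0])
```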
Here is the trick we used to increase the amount of training data:
```python
for row in reader:
    if row['language'] == 'es':
        sent1 = row['sentence1'].strip()
        sent2 = row['sentence2'].strip()

        add_to_samples(sent1, sent2, row['gold_label'])
        add_to_samples(sent2, sent1, row['gold_label'])  # Also add the opposite
```
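For context, `reader` and `add_to_samples` are not defined in the snippet above. Below is a minimal sketch of how they could be set up, assuming a tab-separated dump of the corpus with `sentence1`, `sentence2`, `gold_label` and `language` columns; the file name and the pair-grouping structure are assumptions that mirror the usual sentence-transformers NLI training recipe:

```python
import csv

# Assumed grouping structure: pairs indexed by the first sentence and gold label,
# which is what add_to_samples() in the loop above relies on.
train_data = {}

def add_to_samples(sent1, sent2, label):
    if sent1 not in train_data:
        train_data[sent1] = {'contradiction': set(), 'entailment': set(), 'neutral': set()}
    train_data[sent1][label].add(sent2)

# Assumed reader: a tab-separated file with sentence1, sentence2, gold_label
# and language columns (hypothetical file name).
fIn = open('nli-es.tsv', encoding='utf8')
reader = csv.DictReader(fIn, delimiter='\t', quoting=csv.QUOTE_NONE)
```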
**DataLoader**: