Sakil committed on
Commit 36ca584 · 1 Parent(s): a56ddd6

Update README.md

Files changed (1)
  1. README.md +15 -0
README.md CHANGED
@@ -1,3 +1,18 @@
  ---
  license: apache-2.0
+ language:
+ - en
+ library_name: sentence-transformers
+ pipeline_tag: translation
  ---
+
+ # Dataset Collection:
+ * The English-French Translation Dataset is collected from Kaggle: [Dataset](https://www.kaggle.com/datasets/dhruvildave/en-fr-translation-dataset).
+ About Dataset:
+ French/English parallel texts for training translation models.
+ Over 22.5 million sentences in French and English. The dataset was created
+ by Chris Callison-Burch, who crawled millions of web pages and
+ then used a set of simple heuristics to map French URLs onto English URLs,
+ and assumed that these documents are translations of each other.
+ This is the main dataset of the 2015 Workshop on Statistical Machine Translation (WMT 2015)
+ and can be used for machine translation and language models. Refer to the paper here: [PDF](https://www.statmt.org/wmt15/pdf/WMT01.pdf)
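
The README text added in this commit mentions "simple heuristics" for mapping French URLs onto English ones but does not spell them out. The sketch below shows one plausible rule of that kind, assuming a page's language appears as an `fr` path segment that can be swapped for `en`. The function names `french_to_english_url` and `pair_candidates` and the `example.org` URLs are hypothetical illustrations, not part of the dataset's actual tooling.

```python
from urllib.parse import urlsplit, urlunsplit


def french_to_english_url(url):
    """Swap any path segment equal to 'fr' for 'en' (assumed heuristic)."""
    parts = urlsplit(url)
    segments = ["en" if seg == "fr" else seg for seg in parts.path.split("/")]
    return urlunsplit(parts._replace(path="/".join(segments)))


def pair_candidates(urls):
    """Return (french_url, english_url) pairs where both URLs were crawled."""
    known = set(urls)
    pairs = []
    for url in urls:
        candidate = french_to_english_url(url)
        # Keep the pair only if the rewritten URL differs and was also crawled.
        if candidate != url and candidate in known:
            pairs.append((url, candidate))
    return pairs


if __name__ == "__main__":
    crawled = [
        "https://example.org/fr/about.html",
        "https://example.org/en/about.html",
        "https://example.org/fr/contact.html",
    ]
    for fr_url, en_url in pair_candidates(crawled):
        print(fr_url, "<->", en_url)
```

Pages paired this way are only assumed to be translations of each other, as the description says, so a real pipeline would likely need further filtering before treating the text as parallel.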