MKaan committed on
Commit 68a94aa · 1 Parent(s): 31e3bfe

Update README.md

  1. README.md +7 -8
README.md CHANGED
@@ -67,14 +67,14 @@ The model takes procurement descriptions written in any of [104 languages](https
67
  | Transport services (excl. Waste transport). 💺
68
 
69
  ## Intended uses & limitations
70
- Input description should be written in any of [the 104 languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) that MBERT supports.
71
- The model is just evaluated in 22 languages. Thus there is no information about the performances in other languages.
72
- The domain is also restricted by the awarded procurement notice descriptions in European Union. Evaluating on whole document texts might change the performance.
73
 
74
  ## Training and evaluation data
75
- The whole data consists of 744,360 rows. Shuffled and split into train and validation sets by using 80%/20% manner.
76
- Each description represents a unique contract notice description awarded between 2011 and 2018.
77
- Both training and validation data have contract notice descriptions written in 22 European Languages. (Malta and Irish are extracted due to scarcity compared to whole data)
78
 
79
  ## Training procedure
80
  The training procedure has been completed on Google Cloud V3-8 TPUs. Thanks [Google](https://sites.research.google/trc/about/) for giving the access to Cloud TPUs
@@ -117,5 +117,4 @@ The following hyperparameters were used during training:
117
  | SV| 0.607| 3326|
118
  | DA| 0.603| 1925|
119
  | FR| 0.601| 33113|
120
- | ET| 0.572| 458||
121
-
 
67
  | Transport services (excl. Waste transport). 💺
68
 
69
  ## Intended uses & limitations
70
+ - Input description should be written in any of [the 104 languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) that MBERT supports.
71
+ - The model is just evaluated in 22 languages. Thus there is no information about the performances in other languages.
72
+ - The domain is also restricted by the awarded procurement notice descriptions in European Union. Evaluating on whole document texts might change the performance.
73
 
74
  ## Training and evaluation data
75
+ - The whole data consists of 744,360 rows. Shuffled and split into train and validation sets by using 80%/20% manner.
76
+ - Each description represents a unique contract notice description awarded between 2011 and 2018.
77
+ - Both training and validation data have contract notice descriptions written in 22 European Languages. (Malta and Irish are extracted due to scarcity compared to whole data)
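
The shuffle-and-split step described in the list above can be sketched as follows. This is a minimal illustration, not the author's actual preprocessing: the README does not say which tool or random seed was used, so the function name `shuffled_split`, the `seed` value, and the use of row indices instead of the real notice descriptions are all assumptions.

```python
import random


def shuffled_split(n_rows, train_percent=80, seed=0):
    """Shuffle row indices and split them into train and validation parts.

    `seed` is a hypothetical value chosen only to make the sketch
    reproducible; the README does not state one.
    """
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = n_rows * train_percent // 100
    return indices[:cut], indices[cut:]


# 744,360 is the row count stated in the README.
train_idx, val_idx = shuffled_split(744_360)
print(len(train_idx), len(val_idx))
```

With the stated row count, an 80%/20% split gives 595,488 training rows and 148,872 validation rows.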
78
 
79
  ## Training procedure
80
  The training procedure has been completed on Google Cloud V3-8 TPUs. Thanks [Google](https://sites.research.google/trc/about/) for giving the access to Cloud TPUs
 
117
  | SV| 0.607| 3326|
118
  | DA| 0.603| 1925|
119
  | FR| 0.601| 33113|
120
+ | ET| 0.572| 458||