MKaan committed on
Commit
eeca4eb
1 Parent(s): 4c9e8c0

Update README.md

  1. README.md +8 -3
README.md CHANGED
@@ -16,7 +16,7 @@ It achieves the following results on the evaluation set:
16
  - F1 Score: 0.686
17
 
18
  ## Model description
19
- The model takes procurement descriptions written in any of [104 languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) and classifies into 45 sector classes represented by [CPV(Common Procurement Vocabulary)](https://simap.ted.europa.eu/en_GB/web/simap/cpv) code descriptions.
20
 
21
  | Common Procurement Vocabulary |
22
  |:-----------------------------|
@@ -67,10 +67,15 @@ The model takes procurement descriptions written in any of [104 languages](https
67
  | Transport services (excl. Waste transport). 💺
68
 
69
  ## Intended uses & limitations
70
- More information needed
 
71
  ## Training and evaluation data
72
- More information needed
 
 
 
73
  ## Training procedure
 
74
 
75
  ### Training hyperparameters
76
  The following hyperparameters were used during training:
 
16
  - F1 Score: 0.686
17
 
18
  ## Model description
19
+ The model takes procurement descriptions written in any of [104 languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) and classifies them into 45 sector classes represented by [CPV (Common Procurement Vocabulary)](https://simap.ted.europa.eu/en_GB/web/simap/cpv) code descriptions, as listed below.
20
 
21
  | Common Procurement Vocabulary |
22
  |:-----------------------------|
 
67
  | Transport services (excl. Waste transport). 💺
68
 
69
  ## Intended uses & limitations
70
+ Input descriptions should be written in one of [the 104 languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) that mBERT supports.
71
+
72
  ## Training and evaluation data
73
+ The full dataset consists of 744,360 rows, which were shuffled and split into training and validation sets in an 80%/20% ratio.
74
+ Each row is a unique description from a contract notice awarded between 2011 and 2018.
75
+ Both the training and validation data contain contract notice descriptions written in 22 European languages. (Maltese and Irish were excluded because they are scarce relative to the rest of the data.)
76
+
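The shuffle-and-split step described in the added README text can be sketched as follows. This is a minimal illustration, not the author's actual preprocessing code: the function name `shuffle_and_split`, the fixed seed, and the rounding rule are all assumptions. Under this rounding, an 80%/20% split of 744,360 rows would give 595,488 training rows and 148,872 validation rows.

```python
import random


def shuffle_and_split(rows, train_fraction=0.8, seed=42):
    """Shuffle a copy of `rows` and split it into (train, validation).

    Hypothetical helper: the seed and rounding rule are illustrative
    assumptions, not taken from the original training code.
    """
    shuffled = list(rows)                    # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)    # deterministic shuffle for reproducibility
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


if __name__ == "__main__":
    # Stand-in data; the real rows would be contract notice descriptions.
    descriptions = [f"notice {i}" for i in range(1000)]
    train, val = shuffle_and_split(descriptions)
    print(len(train), len(val))
```

Using a seeded `random.Random` instance rather than the module-level generator keeps the split reproducible without affecting other random state in the program.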
77
  ## Training procedure
78
+ Training was carried out on Google Cloud TPU v3-8. Thanks to [Google](https://sites.research.google/trc/about/) for providing access to Cloud TPUs.
79
 
80
  ### Training hyperparameters
81
  The following hyperparameters were used during training: