Update README.md
README.md
CHANGED
@@ -16,7 +16,7 @@ It achieves the following results on the evaluation set:
 - F1 Score: 0.686

 ## Model description
-The model takes procurement descriptions written in any of [104 languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) and classifies into 45 sector classes represented by [CPV(Common Procurement Vocabulary)](https://simap.ted.europa.eu/en_GB/web/simap/cpv) code descriptions.
+The model takes procurement descriptions written in any of [104 languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) and classifies them into 45 sector classes represented by [CPV (Common Procurement Vocabulary)](https://simap.ted.europa.eu/en_GB/web/simap/cpv) code descriptions, as listed below.

 | Common Procurement Vocabulary |
 |:-----------------------------|
@@ -67,10 +67,15 @@ The model takes procurement descriptions written in any of [104 languages](https
 | Transport services (excl. Waste transport). 💺

 ## Intended uses & limitations
-
+Input descriptions should be written in any of [the 104 languages](https://github.com/google-research/bert/blob/master/multilingual.md#list-of-languages) that mBERT supports.
+
 ## Training and evaluation data
-
+The full dataset consists of 744,360 rows. It was shuffled and split into training and validation sets in an 80%/20% ratio.
+Each row is the description of a unique contract notice awarded between 2011 and 2018.
+Both the training and validation data contain contract notice descriptions written in 22 European languages. (Maltese and Irish are excluded because they are scarce compared to the whole data.)
+
 ## Training procedure
+Training was completed on Google Cloud v3-8 TPUs. Thanks to [Google](https://sites.research.google/trc/about/) for providing access to Cloud TPUs.

 ### Training hyperparameters
 The following hyperparameters were used during training:
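The 80%/20% shuffle-and-split described under "Training and evaluation data" could look roughly like the sketch below. This is only an illustration, not the author's actual preprocessing: the function name `shuffle_and_split`, the seed value, and the use of integer row indices in place of real contract notice descriptions are all assumptions, since the README states neither a seed nor the tooling used.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def shuffle_and_split(rows: Sequence[T], seed: int = 42) -> Tuple[List[T], List[T]]:
    """Shuffle rows and split them 80%/20% into (train, validation).

    Hypothetical helper; the seed is an assumption, not taken from the README.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding at the cut point.
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


# Row indices stand in for the 744,360 descriptions mentioned in the README.
TOTAL_ROWS = 744_360
train, validation = shuffle_and_split(range(TOTAL_ROWS))
print(len(train), len(validation))  # 595488 148872
```

With 744,360 rows, an 80%/20% split yields 595,488 training rows and 148,872 validation rows. Fixing the seed keeps the split reproducible across runs.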