Update README.md
README.md
CHANGED
@@ -22,6 +22,10 @@ Training
 --------
 
 The training dataset consists of 500k examples of comments in English and 500k comments in French (translated by Google Translate), each annotated with a toxicity severity gradient. The dataset used is provided by [Jigsaw](https://jigsaw.google.com/) as part of a Kaggle competition: [Jigsaw Unintended Bias in Toxicity Classification](https://www.kaggle.com/competitions/jigsaw-unintended-bias-in-toxicity-classification/data). Since the scores represent severity gradients, regression was preferred using the following loss function:
+$$loss=l_{\mathrm{obscene}}+l_{\mathrm{sexual\_explicit}}+l_{\mathrm{identity\_attack}}+l_{\mathrm{insult}}+l_{\mathrm{threat}}$$
+with
+$$l_i=\frac{1}{\vert\mathcal{O}\vert}\sum_{o\in\mathcal{O}}\vert\mathrm{score}_{i,o}-\sigma(\mathrm{logit}_{i,o})\vert$$
+where σ is the sigmoid function and O is the set of training observations.
 
 Benchmark
 ---------
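
For readers who want to check the formula added in this hunk, the loss can be sketched in plain Python. This is only an illustration under assumptions: the README does not show the training code, so the function names, the `LABELS` tuple and the dict-of-lists data layout below are hypothetical. The sketch computes, for each of the five labels, the mean absolute difference between the annotated score and the sigmoid of the model logit, then sums the five per-label terms.

```python
import math

# The five labels that appear in the loss formula of the README.
LABELS = ("obscene", "sexual_explicit", "identity_attack", "insult", "threat")


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def label_loss(scores, logits):
    """l_i: mean absolute error between scores and sigmoid(logits) for one label."""
    if len(scores) != len(logits) or len(scores) == 0:
        raise ValueError("scores and logits must be non-empty and of equal length")
    total = sum(abs(s - sigmoid(z)) for s, z in zip(scores, logits))
    return total / len(scores)


def total_loss(scores, logits):
    """Sum of the per-label losses.

    Assumed layout (hypothetical): both arguments map each label name to a
    list with one float per observation.
    """
    return sum(label_loss(scores[label], logits[label]) for label in LABELS)


if __name__ == "__main__":
    # Two observations per label; a logit of 0 gives sigmoid(0) = 0.5.
    scores = {label: [0.0, 1.0] for label in LABELS}
    logits = {label: [0.0, 0.0] for label in LABELS}
    print(total_loss(scores, logits))  # 5 labels * 0.5 = 2.5
```

In a real training loop the same computation would normally be done on tensors so that gradients flow through the sigmoid; the pure-Python version is only meant to make the arithmetic explicit.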