cmarkea/bloomz-560m-guardrail

@@ -32,7 +32,7 @@ Where sigma is the sigmoid function and O represents the set of learning observa
 Benchmark
 ---------
-As the scores range from 0 to 1, a performance measure such as MAE or RMSE may be challenging to interpret. Therefore, Pearson's inter-correlation was chosen as a measure. Pearson's inter-correlation is a measure ranging from -1 to 1, where 0 represents no correlation, -1 represents perfect negative correlation, and 1 represents perfect positive correlation. The goal is to quantitatively measure the correlation between the model's scores and the scores assigned by judges for 750 comments not seen during training.
 | Model                                                                         | Language | Obscene (x100)          | Sexual explicit (x100)        | Identity attack (x100)        | Insult (x100)        | Threat (x100)        | Mean |
 |-------------------------------------------------------------------------------|----------|:-----------------------:|-------------------------------|-------------------------------|----------------------|----------------------|------|
@@ -43,6 +43,15 @@ As the scores range from 0 to 1, a performance measure such as MAE or RMSE may b
 With a correlation of approximately 65 for the 560m model and approximately 80 for the 3b model, the output is highly correlated with the judges' scores.
 How to Use Bloomz-560m-guardrail
 --------------------------------

 Benchmark
 ---------
+Because the scores range from 0 to 1, an error measure such as RMSE can be hard to interpret. Pearson's correlation coefficient was therefore chosen as the measure. It ranges from -1 to 1, where 0 represents no correlation, -1 represents perfect negative correlation, and 1 represents perfect positive correlation. The goal is to quantify the correlation between the model's scores and the scores assigned by judges for 730 comments not seen during training.
 | Model                                                                         | Language | Obscene (x100)          | Sexual explicit (x100)        | Identity attack (x100)        | Insult (x100)        | Threat (x100)        | Mean |
 |-------------------------------------------------------------------------------|----------|:-----------------------:|-------------------------------|-------------------------------|----------------------|----------------------|------|
 With a correlation of approximately 65 for the 560m model and approximately 80 for the 3b model, the output is highly correlated with the judges' scores.
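As an illustration of how such a score could be obtained, the sketch below computes Pearson's correlation coefficient between two score vectors and multiplies it by 100, matching the "(x100)" scaling in the table headers. The function name `pearson_x100` and the sample scores are hypothetical and are not taken from the benchmark; this is a minimal sketch, not the evaluation code used for the model card.

```python
import numpy as np


def pearson_x100(model_scores, judge_scores):
    """Pearson correlation between two score vectors, scaled by 100."""
    x = np.asarray(model_scores, dtype=float)
    y = np.asarray(judge_scores, dtype=float)
    # Center both vectors on their means.
    xc = x - x.mean()
    yc = y - y.mean()
    # Covariance divided by the product of the standard deviations.
    r = (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))
    return 100.0 * r


# Hypothetical scores for five comments on a single label (illustration only).
model = [0.10, 0.80, 0.35, 0.05, 0.60]
judges = [0.00, 0.90, 0.40, 0.10, 0.50]
print(round(pearson_x100(model, judges), 1))
```

Under this reading, a value of 65 in the table would correspond to a correlation coefficient of 0.65.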
+The Mean Absolute Error (MAE) is reported next. It measures the average absolute gap between the model's scores and the judges' scores.
+| Model                                                                         | Language | Obscene          | Sexual explicit       | Identity attack      | Insult       | Threat     | Mean |
+|-------------------------------------------------------------------------------|----------|:----------------:|-----------------------|----------------------|--------------|------------|------|
+| [Bloomz-560m-guardrail](https://huggingface.co/cmarkea/bloomz-560m-guardrail) | French   | 0.06             | 0.03                  | 0.03                 | 0.13         | 0.04       | 0.06 |
+| [Bloomz-560m-guardrail](https://huggingface.co/cmarkea/bloomz-560m-guardrail) | English  | 0.06             | 0.03                  | 0.03                 | 0.14         | 0.04       | 0.06 |
+| [Bloomz-3b-guardrail](https://huggingface.co/cmarkea/bloomz-3b-guardrail)     | French   | 0.05             | 0.02                  | 0.02                 | 0.11         | 0.03       | 0.05 |
+| [Bloomz-3b-guardrail](https://huggingface.co/cmarkea/bloomz-3b-guardrail)     | English  | 0.05             | 0.03                  | 0.02                 | 0.12         | 0.03       | 0.05 |
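For readers who want to reproduce this kind of table on their own data, the sketch below computes a per-label MAE and its average across labels. It assumes the scores are arranged as one row per comment and one column per label, and that the "Mean" column is the unweighted average of the five label columns; neither assumption is confirmed by the model card. The names `LABELS` and `mae_per_label` and the sample scores are hypothetical.

```python
import numpy as np

# Hypothetical label order, following the table columns.
LABELS = ["obscene", "sexual_explicit", "identity_attack", "insult", "threat"]


def mae_per_label(model_scores, judge_scores):
    """Return (per-label MAE dict, unweighted mean of the per-label MAEs)."""
    pred = np.asarray(model_scores, dtype=float)  # shape: (n_comments, n_labels)
    ref = np.asarray(judge_scores, dtype=float)   # same shape as pred
    # Average the absolute error over comments, separately for each label.
    per_label = np.abs(pred - ref).mean(axis=0)
    return dict(zip(LABELS, per_label)), float(per_label.mean())


# Hypothetical scores for three comments (illustration only).
model = [
    [0.10, 0.00, 0.05, 0.40, 0.00],
    [0.70, 0.10, 0.00, 0.90, 0.20],
    [0.00, 0.00, 0.00, 0.10, 0.00],
]
judges = [
    [0.00, 0.00, 0.00, 0.50, 0.00],
    [0.80, 0.00, 0.00, 1.00, 0.10],
    [0.00, 0.00, 0.00, 0.00, 0.00],
]
per_label, overall = mae_per_label(model, judges)
for label, value in per_label.items():
    print(f"{label}: {value:.2f}")
print(f"mean: {overall:.2f}")
```

The table is consistent with the unweighted-average reading: the five values in the French Bloomz-560m-guardrail row average to 0.058, which rounds to the reported 0.06.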
 How to Use Bloomz-560m-guardrail
 --------------------------------