edobobo committed on
Commit 5d3cd6b · verified · 1 Parent(s): 722eeb1

Update README.md


added evaluation images

Files changed (1):
  README.md +7 -4
README.md CHANGED
@@ -130,11 +130,14 @@ Minerva-7B-base-v1.0 was trained using [llm-foundry 0.8.0](https://github.com/ri
 
 ## Model Evaluation
 
-We assessed our model using the [LM-Evaluation-Harness](https://github.com/EleutherAI/lm-evaluation-harness) library, which serves as a comprehensive framework for testing generative language models across a wide range of evaluation tasks.
+For Minerva's evaluation process, we utilized ITA-Bench, a new evaluation suite to test the capabilities of Italian-speaking models.
+ITA-Bench is a collection of 18 benchmarks that assess the performance of language models on various tasks, including scientific knowledge,
+commonsense reasoning, and mathematical problem-solving.
 
-All the reported benchmark data was already present in the LM-Evaluation-Harness suite.
-
-_Scores will be available at later stage._
+<div style={{ display: 'flex', justifyContent: 'space-around' }}>
+  <img src="https://huggingface.co/sapienzanlp/Minerva-7B-base-v1.0/resolve/main/Minerva%20LLMs%20Results%20Base%20Models.png" alt="Results on base models" style={{ width: '45%' }}></img>
+  <img src="https://huggingface.co/sapienzanlp/Minerva-7B-base-v1.0/resolve/main/Minerva%20LLMs%20Results%20All%20Base%20Models.png" alt="Results on base models" style={{ width: '45%' }}></img>
+</div>
 
 <!-- **Italian** Data: -->
 <!-- | Task | Accuracy |