edobobo committed on
Commit 454ab56
1 Parent(s): 8f68678

Update README.md

Files changed (1)
  1. README.md +7 -4
README.md CHANGED
@@ -166,11 +166,14 @@ For more details please check [our tech report](https://nlp.uniroma1.it/minerva/
 
 ## Model Evaluation
 
-We assessed our model using the [LM-Evaluation-Harness](https://github.com/EleutherAI/lm-evaluation-harness) library, which serves as a comprehensive framework for testing generative language models across a wide range of evaluation tasks.
+For Minerva's evaluation process, we used ITA-Bench, a new evaluation suite for testing the capabilities of Italian-speaking models.
+ITA-Bench is a collection of 18 benchmarks that assess the performance of language models on various tasks, including scientific knowledge,
+commonsense reasoning, and mathematical problem-solving.
 
-All the reported benchmark data was already present in the LM-Evaluation-Harness suite.
-
-_Scores will be available at later stage._
+<div style="display: flex; justify-content: space-around;">
+<img src="https://huggingface.co/sapienzanlp/Minerva-7B-base-v1.0/resolve/main/Minerva%20LLMs%20Results%20Base%20Models.png" alt="Results on base models" style="width: 45%;" />
+<img src="https://huggingface.co/sapienzanlp/Minerva-7B-base-v1.0/resolve/main/Minerva%20LLMs%20Results%20All%20Base%20Models.png" alt="Results on all base models" style="width: 45%;" />
+</div>
 
 <!-- **Italian** Data: -->
 <!-- | Task | Accuracy |