Update README.md
README.md
@@ -147,6 +147,18 @@ Four evaluation metrics were employed across all subsets: language quality, over
 | relevant_context | 71.3 | 69.1 | **65.5** | 89.5 |
 | summarizations | 73.8 | 81.6 | **80.3** | 86.9 |
 
+
+## Hard Benchmark Eval
+
+<img src="https://avemio.digital/wp-content/uploads/2025/01/GRAG-NEMO-ORPO.png" alt="GRAG Logo" width="400" style="margin-left: auto; margin-right: auto; display: block;"/>
+
+| Metric | [Vanilla-Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) | **[GRAG-NEMO-ORPO](https://huggingface.co/avemio/GRAG-NEMO-12B-ORPO-HESSIAN-AI)** | GPT-3.5-TURBO | GPT-4o | GPT-4o-mini |
+|-------------------------|-----------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------|----------------|---------|-------------|
+| **OVERALL SCORES (weighted):** | | | | | |
+| hard_reasoning_de | 43.6 | **49.7** | 37.9 | 62.9 | 58.4 |
+| hard_reasoning_en | 54.2 | **55.6** | 48.3 | 61.7 | 62.9 |
+
+
 ## Model Details
 
 ### Data