avemio-digital committed
Commit 4675057 (verified)
1 Parent(s): 91146eb

Update README.md

Files changed (1):
  1. README.md +29 -29

README.md CHANGED
@@ -1,16 +1,16 @@
 ---
 license: apache-2.0
 datasets:
- - avemio/German_RAG-CPT-HESSIAN-AI
- - avemio/German_RAG-SFT-ShareGPT-HESSIAN-AI
- - avemio/German_RAG-ORPO-ShareGPT-HESSIAN-AI
+ - avemio/German-RAG-CPT-HESSIAN-AI
+ - avemio/German-RAG-SFT-ShareGPT-HESSIAN-AI
+ - avemio/German-RAG-ORPO-ShareGPT-HESSIAN-AI
  - VAGOsolutions/SauerkrautLM-Fermented-GER-DPO
  - VAGOsolutions/SauerkrautLM-Fermented-Irrelevance-GER-DPO
 language:
 - en
 - de
 base_model:
- - avemio/German_RAG-NEMO-12B-SFT-HESSIAN-AI
+ - avemio/German-RAG-NEMO-12B-SFT-HESSIAN-AI
 pipeline_tag: question-answering
 tags:
 - German
@@ -22,25 +22,25 @@ tags:
 ---


- <img src="https://www.German_RAG.ai/wp-content/uploads/2024/12/German_RAG-ICON-TO-WORDLOGO-Animation_Loop-small-ezgif.com-video-to-gif-converter.gif" alt="German_RAG Logo" width="400" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
+ <img src="https://www.German-RAG.ai/wp-content/uploads/2024/12/German-RAG-ICON-TO-WORDLOGO-Animation_Loop-small-ezgif.com-video-to-gif-converter.gif" alt="German-RAG Logo" width="400" style="margin-left:'auto' margin-right:'auto' display:'block'"/>


- # German_RAG-NEMO-12B-ORPO-HESSIAN-AI
+ # German-RAG-NEMO-12B-ORPO-HESSIAN-AI

 <!-- Provide a quick summary of what the model is/does. -->

- **German_RAG** (**G**erman **R**etrieval **A**ugmented **G**eneration) models are designed for the German-speaking market, enabling innovation and AI solutions to drive German research collaboration in business-focused Generative AI by 2025
+ **German-RAG** (**G**erman **R**etrieval **A**ugmented **G**eneration) models are designed for the German-speaking market, enabling innovation and AI solutions to drive German research collaboration in business-focused Generative AI by 2025

- Our German_RAG-NEMP-ORPO model are trained on this **[German_RAG-ORPO](https://huggingface.co/datasets/avemio/German_RAG-ORPO-ShareGPT-HESSIAN-AI) dataset.**
+ Our German-RAG-NEMP-ORPO model are trained on this **[German-RAG-ORPO](https://huggingface.co/datasets/avemio/German-RAG-ORPO-ShareGPT-HESSIAN-AI) dataset.**

 ## Model Details

 The core models released in this batch are the following:
 | Size | Training Tokens |
 |------|--------|
- | [German_RAG-NEMO-CPT](https://huggingface.co/avemio/German_RAG-NEMO-12B-CPT-HESSIAN-AI) | 507.47 million |
- | [German_RAG-NEMO-SFT](https://huggingface.co/avemio/German_RAG-NEMO-12B-SFT-HESSIAN-AI) | 2.03 billion |
- | [German_RAG-NEMO-ORPO](https://huggingface.co/avemio/German_RAG-NEMO-12B-ORPO-HESSIAN-AI) | 2.0577 billion |
+ | [German-RAG-NEMO-CPT](https://huggingface.co/avemio/German-RAG-NEMO-12B-CPT-HESSIAN-AI) | 507.47 million |
+ | [German-RAG-NEMO-SFT](https://huggingface.co/avemio/German-RAG-NEMO-12B-SFT-HESSIAN-AI) | 2.03 billion |
+ | [German-RAG-NEMO-ORPO](https://huggingface.co/avemio/German-RAG-NEMO-12B-ORPO-HESSIAN-AI) | 2.0577 billion |
 ### Model Description

 <!-- Provide a longer summary of what this model is. -->
@@ -50,19 +50,19 @@ The core models released in this batch are the following:
 - **Model type:** a Transformer style autoregressive language model.
 - **Language(s) (NLP):** German, English
 - **License:** The code and model are released under Apache 2.0.
- - **Contact:** [German_RAG@avemio.digital](mailto:German_RAG@avemio.digital)
+ - **Contact:** [German-RAG@avemio.digital](mailto:German-RAG@avemio.digital)


 ### Model Sources

 <!-- Provide the basic links for the model. -->

- - **Training Study:** [Training Study](https://avemio.digital/wp-content/uploads/2025/01/German_RAG-TRAINING-STUDY-Advancing-German-Language-AI-with-hessian-AI.pdf)
+ - **Training Study:** [Training Study](https://avemio.digital/wp-content/uploads/2025/01/German-RAG-TRAINING-STUDY-Advancing-German-Language-AI-with-hessian-AI.pdf)
  - **Repositories:**
  - Training: [Colab-Notebook](https://colab.research.google.com/drive/18SH_aYLCnw1K7cRGOTTZ80y98V5Kquxb?usp=sharing)
  - Evaluation code:
- - [German_RAG-LLM-HARD-BENCHMARK](https://github.com/avemio-digital/German_RAG-LLM-HARD-BENCHMARK.git)
- - [German_RAG-LLM-EASY-BENCHMARK](https://github.com/avemio-digital/German_RAG-LLM-EASY-BENCHMARK.git)
+ - [German-RAG-LLM-HARD-BENCHMARK](https://github.com/avemio-digital/German-RAG-LLM-HARD-BENCHMARK.git)
+ - [German-RAG-LLM-EASY-BENCHMARK](https://github.com/avemio-digital/German-RAG-LLM-EASY-BENCHMARK.git)
  - **Technical blog post:**
 <!-- - **Press release:** TODO -->

@@ -76,7 +76,7 @@ Now, proceed as usual with HuggingFace:
 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer

- model_name = "avemio/German_RAG-NEMO-12B-ORPO-HESSIAN-AI"
+ model_name = "avemio/German-RAG-NEMO-12B-ORPO-HESSIAN-AI"

 model = AutoModelForCausalLM.from_pretrained(
     model_name,
@@ -125,7 +125,7 @@ response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
 We are providing a comprehensive Google Colab notebook to guide users through the process of fine-tuning our model, complete with detailed instructions, essential dependencies, and configurable settings.
 [Colab-Notebook](https://colab.research.google.com/drive/18SH_aYLCnw1K7cRGOTTZ80y98V5Kquxb?usp=sharing).

- ## German_RAG-LLM-EASY-BENCHMARK EVAL
+ ## German-RAG-LLM-EASY-BENCHMARK EVAL

 <!-- This section describes the evaluation protocols and provides the results. -->
 The evaluation was performed using seven subsets, focusing on extraction recall, question answering (QA) with multiple references, and time difference reasoning. Relevant context and summarization were treated as distinct subsets, each playing a crucial role in the evaluation process. For relevant context, the model's ability to identify and extract pertinent information from the source material was assessed. In contrast, the summarization subset evaluated the model's capability to generate concise and accurate summaries based on the relevant context.
@@ -138,7 +138,7 @@ Four evaluation metrics were employed across all subsets: language quality, over
 - **Overall score:** This metric combined the results from the previous three metrics, offering a comprehensive evaluation of the model's capabilities across all subsets.


- | Metric | [Vanila-Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) | [German_RAG-NEMO-SFT](https://huggingface.co/avemio/German_RAG-NEMO-12B-SFT-HESSIAN-AI) | **[German_RAG-NEMO-ORPO](https://huggingface.co/avemio/German_RAG-NEMO-12B-ORPO-HESSIAN-AI)** | GPT-3.5-TURBO |
+ | Metric | [Vanila-Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) | [German-RAG-NEMO-SFT](https://huggingface.co/avemio/German-RAG-NEMO-12B-SFT-HESSIAN-AI) | **[German-RAG-NEMO-ORPO](https://huggingface.co/avemio/German-RAG-NEMO-12B-ORPO-HESSIAN-AI)** | GPT-3.5-TURBO |
 |------------------------------------------|---------------------------------|---------------------------------|---------------------------------|----------------|
 | Average Language Quality | 85.88 | 89.61 | **89.1** | 91.86 |
 | **OVERALL SCORES (weighted):** | | | | |
@@ -150,11 +150,11 @@ Four evaluation metrics were employed across all subsets: language quality, over
 | summarizations | 73.8 | 81.6 | **80.3** | 86.9 |


- ## German_RAG-LLM-HARD-BENCHMARK EVAL
+ ## German-RAG-LLM-HARD-BENCHMARK EVAL

- <img src="https://avemio.digital/wp-content/uploads/2025/01/German_RAG-NEMO-ORPO.png" alt="German_RAG Logo" width="600" style="margin-left:'auto' margin-right:'auto' display:'block'"/>
+ <img src="https://avemio.digital/wp-content/uploads/2025/01/German-RAG-NEMO-ORPO.png" alt="German-RAG Logo" width="600" style="margin-left:'auto' margin-right:'auto' display:'block'"/>

- | Metric | [Vanila-Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) | **[German_RAG-NEMO-ORPO](https://huggingface.co/avemio/German_RAG-NEMO-12B-ORPO-HESSIAN-AI)** | GPT-3.5-TURBO | GPT-4o | GPT-4o-mini |
+ | Metric | [Vanila-Mistral-Nemo-Instruct-2407](https://huggingface.co/mistralai/Mistral-Nemo-Instruct-2407) | **[German-RAG-NEMO-ORPO](https://huggingface.co/avemio/German-RAG-NEMO-12B-ORPO-HESSIAN-AI)** | GPT-3.5-TURBO | GPT-4o | GPT-4o-mini |
 |-------------------------|---------------------------------|---------------------------------|----------------|---------|-------------|
 | **OVERALL SCORES (weighted):** | | | | | |
 | hard_reasoning_de | 43.6 | **49.7** | 37.9 | 62.9 | 58.4 |
@@ -164,7 +164,7 @@ Four evaluation metrics were employed across all subsets: language quality, over
 ## Model Details

 ### Data
- For training data details, please see the [German_RAG-ORPO-Dataset](https://huggingface.co/datasets/avemio/German_RAG-ORPO-ShareGPT-HESSIAN-AI) documentation.
+ For training data details, please see the [German-RAG-ORPO-Dataset](https://huggingface.co/datasets/avemio/German-RAG-ORPO-ShareGPT-HESSIAN-AI) documentation.

 The ORPO Tasks Dataset represents a specialized collection for fine-tuning language models with a focus on RAG-specific capabilities.

@@ -217,7 +217,7 @@ The subsets can be for this training step are derived from 3 different sources:
 ### Architecture


- | Parameter | German_RAG-NEMO-ORPO |
+ | Parameter | German-RAG-NEMO-ORPO |
 |-----------------------|--------------------|
 | **d_model** | 5120 |
 | **num heads** | 32 |
@@ -235,7 +235,7 @@ The subsets can be for this training step are derived from 3 different sources:
 ### Hyperparameters


- | Parameter | German_RAG-NEMO-ORPO |
+ | Parameter | German-RAG-NEMO-ORPO |
 |---------------------------|--------------------|
 | **warmup steps** | 50 |
 | **peak LR** | 5.0E-07 |
@@ -246,19 +246,19 @@ The subsets can be for this training step are derived from 3 different sources:

 ## Environmental Impact

- German_RAG-NEMO-ORPO, running on NVIDIA A100 with 80 GPUs for 4 days, has an approximate power consumption as follows:
+ German-RAG-NEMO-ORPO, running on NVIDIA A100 with 80 GPUs for 4 days, has an approximate power consumption as follows:

 It's important to note that the actual power consumption may vary depending on the specific workload and operational conditions. For accurate power consumption measurements, using dedicated power monitoring tools is recommended.

 | Model | GPU Type | Power Consumption From GPUs |
 |----------------|---------------------|-----------------------------|
- | German_RAG-NEMO-ORPO | A100 ([Hessian AI supercomputer](https://hessian.ai/de/)) | 0.01843 MWh |
+ | German-RAG-NEMO-ORPO | A100 ([Hessian AI supercomputer](https://hessian.ai/de/)) | 0.01843 MWh |
 ## Bias, Risks, and Limitations

 Like any base language model or fine-tuned model without safety filtering, it is relatively easy for a user to prompt these models to generate harmful and generally sensitive content.
 Such content can also be produced unintentionally, especially in the case of bias, so we recommend users consider the risks of applications of this technology.

- Otherwise, many facts from German_RAG-NEMO-ORPO or any LLM will often not be true, so they should be checked.
+ Otherwise, many facts from German-RAG-NEMO-ORPO or any LLM will often not be true, so they should be checked.



@@ -266,9 +266,9 @@ Otherwise, many facts from German_RAG-NEMO-ORPO or any LLM will often not be tru
 ## Model Card Contact


- For errors in this model card, please contact ([German_RAG@avemio.digital](mailto:German_RAG@avemio.digital)).
+ For errors in this model card, please contact ([German-RAG@avemio.digital](mailto:German-RAG@avemio.digital)).

- ## The German_RAG AI Team
+ ## The German-RAG AI Team
 [Marcel Rosiak](https://de.linkedin.com/in/marcel-rosiak)
 [Soumya Paul](https://de.linkedin.com/in/soumya-paul-1636a68a)
 [Siavash Mollaebrahim](https://de.linkedin.com/in/siavash-mollaebrahim-4084b5153?trk=people-guest_people_search-card)
 
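For reference, here is a minimal, self-contained sketch of the quickstart that the truncated ```python hunk above refers to, using the renamed model ID from the commit. Only the model name and the final batch_decode call come from the diff; the dtype, device placement, chat template usage, example prompt, and generation settings are illustrative assumptions rather than part of the model card.

```python
# Minimal usage sketch (assumptions noted inline); not the model card's exact snippet.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "avemio/German-RAG-NEMO-12B-ORPO-HESSIAN-AI"  # new ID from the diff's "+" line

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,  # assumed dtype; pick one your hardware supports
    device_map="auto",           # assumed; requires the `accelerate` package
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Illustrative chat-style prompt; see the model card / Colab notebook for the
# recommended prompt format and RAG-style context injection.
messages = [{"role": "user", "content": "Was ist Retrieval Augmented Generation?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

generated_ids = model.generate(input_ids, max_new_tokens=256)
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

The Colab notebook linked in the README remains the authoritative reference for the full prompt format and fine-tuning setup.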