JordiBayarriUPC committed on
Commit
0977aee
2 Parent(s): a9666ae 5888f4d

Merge branch 'main' of https://huggingface.co/HPAI-BSC/Qwen1.5-Aloe-Beta-72B add readme

  1. README.md +390 -0
README.md ADDED
@@ -0,0 +1,390 @@
1
+ ---
2
+ license: cc-by-4.0
3
+ datasets:
4
+ - HPAI-BSC/Aloe-Beta-General-Collection
5
+ - HPAI-BSC/chain-of-diagnosis
6
+ - HPAI-BSC/MedS-Ins
7
+ - HPAI-BSC/ultramedical
8
+ - HPAI-BSC/pubmedqa-cot-llama31
9
+ - HPAI-BSC/medqa-cot-llama31
10
+ - HPAI-BSC/medmcqa-cot-llama31
11
+ - HPAI-BSC/headqa-cot-llama31
12
+ - HPAI-BSC/MMLU-medical-cot-llama31
13
+ - HPAI-BSC/Polymed-QA
16
+ language:
17
+ - en
18
+ library_name: transformers
19
+ tags:
20
+ - biology
21
+ - medical
22
+ - healthcare
23
+ pipeline_tag: question-answering
24
+ ---
25
+ <p align="center">
26
+ <picture>
27
+ <source media="(prefers-color-scheme: dark)" srcset="https://cdn-uploads.huggingface.co/production/uploads/6620f941eba5274b5c12f83d/3_lyx8rP6VuhXN8YRaZDS.png">
28
+ <img alt="aloe_beta_7b" src="https://cdn-uploads.huggingface.co/production/uploads/6620f941eba5274b5c12f83d/3_lyx8rP6VuhXN8YRaZDS.png" width=50%>
29
+ </picture>
30
+ </p>
31
+ <h1 align="center">
32
+ Aloe: A Family of Fine-tuned Open Healthcare LLMs
33
+ </h1>
34
+
35
+ ---
36
+
37
+
38
+
39
+ Qwen2.5-Aloe-Beta-72B is an **open healthcare LLM** achieving **state-of-the-art performance** on several medical tasks. Aloe Beta is made available in four model sizes: [7B](https://huggingface.co/HPAI-BSC/Qwen2.5-Aloe-Beta-7B/), [8B](https://huggingface.co/HPAI-BSC/Llama3.1-Aloe-Beta-8B), [70B](https://huggingface.co/HPAI-BSC/Llama3.1-Aloe-Beta-70B), and [72B](https://huggingface.co/HPAI-BSC/Qwen2.5-Aloe-Beta-72B). All models are trained using the same recipe, on top of two different families of models: Llama3.1 and Qwen2.5.
40
+
41
+ Aloe is trained on 20 medical tasks, resulting in a robust and versatile healthcare model. Evaluations show Aloe models to be among the best in their class. When combined with a RAG system ([also released](https://github.com/HPAI-BSC/prompt_engine)), the 8B version gets close to the performance of closed models such as MedPalm-2 and GPT-4. With the same RAG system, Aloe-Beta-70B outperforms those private alternatives, producing state-of-the-art results.
42
+
43
+
44
+ # Aloe-Beta-72B
45
+
46
+
47
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62f7a16192950415b637e201/VUYw4IdANKGrH2VOedwH0.png)
48
+
49
+ **Aloe-Beta** is the latest iteration in the **Aloe family**, building and improving on the success of its predecessor, [Aloe-8B-Alpha](https://huggingface.co/HPAI-BSC/Llama3-Aloe-8B-Alpha).
50
+ Beta more than triples the training data used by Alpha, for a total of **1.8B tokens**, including a wider variety of medical tasks and instructions (e.g., text summarization, explanation, diagnosis, text classification, treatment recommendation, ...).
51
+
52
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62f7a16192950415b637e201/bCuV5kZUT9H9UECAOWDRc.png)
53
+
54
+ To mitigate catastrophic forgetting and enable the model to effectively learn new capabilities like **function calling**, we incorporated a diverse set of high-quality general-purpose data constituting 20% of the total training set. The curated data includes some of the highest-quality content available across a range of topics, including mathematics, programming, STEM, and very long instructions (> 8k tokens), to enrich the model's adaptability and comprehension across diverse domains.
55
+
56
+ Beta also boosts the alignment and safety stages with respect to Alpha. This includes a [medical preference dataset](https://huggingface.co/datasets/TsinghuaC3I/UltraMedical-Preference), as well as the red-teaming dataset (available soon).
57
+
58
+ Complete training details, model merging configurations, and all training data (including synthetically generated data) can be found below. This includes [the RAG system](https://github.com/HPAI-BSC/prompt_engine) that was developed to test Aloe Beta in a deployment setup. Aloe comes with a healthcare-specific risk assessment to facilitate the safe use and deployment of such systems.
59
+
60
+
61
+ ## Model Details
62
+
63
+ ### [](https://huggingface.co/templates/model-card-example#model-description)Model Description
64
+
65
+ - **Developed by:** [HPAI](https://hpai.bsc.es/)
66
+ - **Model type:** Causal decoder-only transformer language model
67
+ - **Language(s) (NLP):** English (capable but not formally evaluated on other languages)
68
+ - **License:** This model is based on [Qwen2.5-72B](https://huggingface.co/Qwen/Qwen2.5-72B) which is released with Apache 2.0 license. All our modifications are available with a [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/) license, making the Aloe Beta models **compatible with commercial use**.
69
+ - **Base model:** [Qwen2.5-72B](https://huggingface.co/Qwen/Qwen2.5-72B)
70
+ - **Paper:** (more coming soon)
71
+ - **RAG Repository:** https://github.com/HPAI-BSC/prompt_engine
72
+
73
+ ### [](https://huggingface.co/templates/model-card-example#model-sources-optional)Model Sources [optional]
74
+
75
+ ## Model Performance
76
+
77
+ Aloe Beta has been tested on the most popular healthcare QA datasets, with and without the Medprompt inference technique. Results show competitive performance, achieving SOTA among models of the same size.
78
+
79
+
80
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6620f941eba5274b5c12f83d/J-PoCeKPRTPFb8wtQCQ07.png)
81
+
82
+ The Beta model has been developed to excel in several different medical tasks. For this reason, we evaluated it on a broad range of them:
83
+
84
+
85
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6620f941eba5274b5c12f83d/3wj-aWXnzxR4XNLg9Ffii.png)
86
+
87
+
88
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6620f941eba5274b5c12f83d/W65-vEbPFH2kMl5Jav7hU.png)
89
+
90
+ We also compared the performance of the model in the general domain, using the OpenLLM Leaderboard benchmark. Aloe-Beta is competitive with the current SOTA general models on the most widely used general benchmarks and outperforms the medical models:
91
+
92
+
93
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6620f941eba5274b5c12f83d/qJAD38D8XRogP3vlgFf8z.png)
94
+
95
+ ## Uses
96
+
97
+ ### Direct Use
98
+
99
+ We encourage the use of Aloe for research purposes, as a stepping stone to build better foundational models for healthcare. In production, Aloe should always be used under the supervision of a human expert.
100
+
101
+ ### Out-of-Scope Use
102
+
103
+ These models are not to be used for clinical practice, medical diagnosis, or any other form of direct or indirect healthcare advice. Models are prone to error and can produce toxic content. The use of Aloe models for activities harmful to individuals, such as spam, fraud, or impersonation, is strictly prohibited. Minors should not be left alone to interact with Aloe without supervision.
104
+
105
+ ## Bias, Risks, and Limitations
106
+
107
+ Aloe can produce toxic content under the appropriate prompts, and it includes multiple undesirable biases. While significant efforts were conducted to mitigate this (see Alignment details below), model safety cannot be fully guaranteed. We avoid the use of all personal data in our training.
108
+
109
+ We identify at least three risk cases specific to healthcare LLMs:
110
+ - Healthcare professional impersonation, a fraudulent behaviour which currently generates billions of dollars in [profit](https://www.justice.gov/opa/pr/justice-department-charges-dozens-12-billion-health-care-fraud). A model such as Aloe could be used to increase the efficacy of such deceiving activities, making them more widespread. The main preventive actions are public literacy on the unreliability of digitised information and the importance of medical registration, and legislation enforcing AI-generated content disclaimers.
111
+ - Medical decision-making without professional supervision. While this is already an issue in modern societies (e.g., self-medication), a model such as Aloe, capable of producing high-quality conversational data, can facilitate self-delusion, particularly in the presence of sycophancy. By producing tailored responses, it can also be used to generate actionable answers. Public literacy on the dangers of self-diagnosis is one of the main defenses, together with the introduction of disclaimers and warnings on the models' outputs.
112
+ - Access to information on dangerous substances or procedures. While the literature on sensitive content can already be found in different sources (e.g., libraries, the internet, the dark web), LLMs can centralize such access, making it nearly impossible to control the flow of such information. Model alignment can help in that regard, but so far the effects remain insufficient, as jailbreaking methods still overcome it.
113
+
114
+
115
+ <!---
116
+ Table below shows the performance of Aloe at several AI safety tasks:
117
+
118
+ TO BE UPDATED
119
+
120
+ <img src="https://cdn-uploads.huggingface.co/production/uploads/62972c4979f193515da1d38e/T6Jblpf1kmTkM04K716rM.png" width="95%">
121
+
122
+
123
+ We analyzed the safety and robustness of the model using red teaming techniques. We designed a benchmark using different types of attacks and analyzed the performance of Aloe and some extra models, and we confirm that our model is aligned properly and successfully resisting most attacks:
124
+
125
+
126
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6620f941eba5274b5c12f83d/KS3yrHan1l1W0cYiXGG-G.png)
127
+
128
+
129
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6620f941eba5274b5c12f83d/SYC0qljpLGLmMgx0a623W.png)
130
+
131
+ -->
132
+
133
+ ## How to Get Started with the Model
134
+
135
+ Use the code below to get started with the model. You can run conversational inference using the Transformers pipeline abstraction, or by leveraging the Auto classes with the `generate()` function. Let's see examples for both.
136
+
137
+ #### Transformers pipeline
138
+
139
+ ```python
+ import transformers
+ import torch
+
+ model_id = "HPAI-BSC/Qwen2.5-Aloe-Beta-72B"
+
+ # Load the model in bfloat16 and let Accelerate place it on the available devices.
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model_id,
+     model_kwargs={"torch_dtype": torch.bfloat16},
+     device_map="auto",
+ )
+
+ messages = [
+     {"role": "system", "content": "You are an expert medical assistant named Aloe, developed by the High Performance Artificial Intelligence Group at Barcelona Supercomputing Center(BSC). You are to be a helpful, respectful, and honest assistant."},
+     {"role": "user", "content": "Hello."},
+ ]
+
+ # Render the chat messages into a single prompt string using the model's chat template.
+ prompt = pipeline.tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+
+ # Stop on the tokenizer's EOS token or on the ChatML end-of-turn token used by Qwen2.5.
+ terminators = [
+     pipeline.tokenizer.eos_token_id,
+     pipeline.tokenizer.convert_tokens_to_ids("<|im_end|>")
+ ]
+
+ outputs = pipeline(
+     prompt,
+     max_new_tokens=256,
+     eos_token_id=terminators,
+     do_sample=True,
+     temperature=0.6,
+     top_p=0.9,
+ )
+ # Print only the newly generated text, without the prompt.
+ print(outputs[0]["generated_text"][len(prompt):])
+ ```
178
+
179
+ #### Transformers AutoModelForCausalLM
180
+
181
+ ```python
+ from transformers import AutoTokenizer, AutoModelForCausalLM
+ import torch
+
+ model_id = "HPAI-BSC/Qwen2.5-Aloe-Beta-72B"
+
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+
+ messages = [
+     {"role": "system", "content": "You are an expert medical assistant named Aloe, developed by the High Performance Artificial Intelligence Group at Barcelona Supercomputing Center(BSC). You are to be a helpful, respectful, and honest assistant."},
+     {"role": "user", "content": "Hello"},
+ ]
+
+ # Tokenize the chat directly into input IDs and move them to the model's device.
+ input_ids = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt"
+ ).to(model.device)
+
+ # Stop on the tokenizer's EOS token or on the ChatML end-of-turn token used by Qwen2.5.
+ terminators = [
+     tokenizer.eos_token_id,
+     tokenizer.convert_tokens_to_ids("<|im_end|>")
+ ]
+
+ outputs = model.generate(
+     input_ids,
+     max_new_tokens=256,
+     eos_token_id=terminators,
+     do_sample=True,
+     temperature=0.6,
+     top_p=0.9,
+ )
+ # Decode only the newly generated tokens, skipping the prompt.
+ response = outputs[0][input_ids.shape[-1]:]
+ print(tokenizer.decode(response, skip_special_tokens=True))
+ ```
221
+
222
+ ## Training Details
223
+
224
+ ### Supervised fine-tuning
225
+ SFT on top of Qwen2.5-72B using axolotl (https://github.com/axolotl-ai-cloud/axolotl).
226
+
227
+ We used DeepSpeed's ZeRO-3 distributed training on the following hardware:
228
+
229
+ * 7B: 32x NVIDIA Hopper H100 64GB of the *Marenostrum 5*.
230
+ * 8B: 32x NVIDIA Hopper H100 64GB of the *Marenostrum 5*.
231
+ * 70B: 64x NVIDIA Hopper H100 64GB of the *Marenostrum 5*.
232
+ * 72B: 92x NVIDIA Hopper H100 64GB of the *Marenostrum 5*.
233
+
234
+
235
+ <!---
236
+ ^^^ TO BE COMPLETED AND DETAILED ^^^
237
+ -->
238
+
239
+
240
+
241
+ #### Training Data
242
+
243
+ The training set consists of around 1.8B tokens, covering three different types of data:
244
+
245
+ - Medical domain datasets. Includes data from 20 different medical tasks.
246
+ - [HPAI-BSC/Aloe-Beta-General-Collection](https://huggingface.co/datasets/HPAI-BSC/Aloe-Beta-General-Collection)
247
+ - [HPAI-BSC/chain-of-diagnosis](https://huggingface.co/datasets/HPAI-BSC/chain-of-diagnosis)
248
+ - [HPAI-BSC/MedS-Ins](https://huggingface.co/datasets/HPAI-BSC/MedS-Ins)
249
+ - [HPAI-BSC/ultramedical](https://huggingface.co/datasets/HPAI-BSC/ultramedical)
250
+ - Synthetic data. We expanded our training data by generating high-quality answers using Llama3.1-70B.
251
+ - [HPAI-BSC/pubmedqa-cot-llama31](https://huggingface.co/datasets/HPAI-BSC/pubmedqa-cot-llama31)
252
+ - [HPAI-BSC/medqa-cot-llama31](https://huggingface.co/datasets/HPAI-BSC/medqa-cot-llama31)
253
+ - [HPAI-BSC/medmcqa-cot-llama31](https://huggingface.co/datasets/HPAI-BSC/medmcqa-cot-llama31)
254
+ - [HPAI-BSC/headqa-cot-llama31](https://huggingface.co/datasets/HPAI-BSC/headqa-cot-llama31)
255
+ - [HPAI-BSC/MMLU-medical-cot-llama31](https://huggingface.co/datasets/HPAI-BSC/MMLU-medical-cot-llama31)
256
+ - [HPAI-BSC/Polymed-QA](https://huggingface.co/datasets/HPAI-BSC/Polymed-QA)
257
+ - Genstruct data (coming soon)
258
+ - General data. It includes maths, STEM, code, function calling, and instructions with a very long context.
259
+ - [HPAI-BSC/Aloe-Beta-General-Collection](https://huggingface.co/datasets/HPAI-BSC/Aloe-Beta-General-Collection)
260
+
261
+ #### Training parameters
262
+ - Epochs: 3
263
+ - Sequence length: 16384
264
+ - Optimizer: adamw_torch
265
+ - Learning rate: 1e-5
266
+ - Learning rate scheduler: cosine
267
+ - Warmup steps: 100
268
+ - Weight decay: 0
269
+ - Gradient checkpointing
270
+ - Zero 3
271
+ - Total batch size: 128
272
+ - Batch size per device: 1
273
+ - Gradient accumulation steps: 4
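
The batch settings listed above are linked by the usual data-parallel relation: total batch size = batch size per device × gradient accumulation steps × number of data-parallel workers. The sketch below only illustrates that arithmetic, assuming this standard relation applies; the helper name `implied_data_parallel_size` is hypothetical, and the card does not state the worker count used for each run.

```python
def implied_data_parallel_size(total_batch: int, per_device: int, grad_accum: int) -> int:
    """Return the number of data-parallel workers implied by the batch settings."""
    samples_per_worker_step = per_device * grad_accum
    if total_batch % samples_per_worker_step != 0:
        raise ValueError("total batch size must be divisible by per_device * grad_accum")
    return total_batch // samples_per_worker_step


# Values taken from the training parameters listed above.
sft_workers = implied_data_parallel_size(total_batch=128, per_device=1, grad_accum=4)
print(sft_workers)
```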
274
+
275
+ ### Model Merging
276
+ The trained model was merged with the Qwen2.5-72B-Instruct model using the DARE_TIES technique. [Mergekit](https://github.com/arcee-ai/mergekit) was used to conduct the merging.
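
As a rough intuition for the DARE half of DARE_TIES, the toy sketch below applies drop-and-rescale to the difference between a fine-tuned weight vector and its base: each delta entry is dropped with probability `drop_rate`, and the survivors are rescaled by `1 / (1 - drop_rate)`. This is not Mergekit's implementation and not the configuration used for Aloe; the weights, the drop rate, and the function name `dare_sparsify` are hypothetical, and the TIES sign-election step is omitted.

```python
import random


def dare_sparsify(delta, drop_rate, rng):
    """Randomly drop delta entries and rescale the kept ones by 1 / (1 - drop_rate)."""
    if not 0.0 <= drop_rate < 1.0:
        raise ValueError("drop_rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - drop_rate)
    return [d * keep_scale if rng.random() >= drop_rate else 0.0 for d in delta]


# Toy weights (hypothetical values, for illustration only).
base = [0.10, -0.20, 0.30, 0.40]
finetuned = [0.15, -0.25, 0.30, 0.50]

# Task vector: what fine-tuning changed relative to the base model.
delta = [f - b for f, b in zip(finetuned, base)]

# Sparsify the task vector, then add it back onto the base weights.
sparse_delta = dare_sparsify(delta, drop_rate=0.5, rng=random.Random(0))
merged = [b + d for b, d in zip(base, sparse_delta)]
print(merged)
```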
277
+
278
+ ### Model Alignment
279
+ The model is aligned using the Direct Preference Optimization (DPO) technique through a two-step process:
280
+
281
+ 1. General DPO Alignment: This step uses a dataset combining medical, general preference, and safety data. We used our dataset [HPAI-BSC/Aloe-Beta-DPO](https://huggingface.co/datasets/HPAI-BSC/Aloe-Beta-DPO). We split the dataset into five parts, and the model was trained iteratively for one epoch on each chunk. We used a learning rate of 2e-7.
282
+ 2. Red-Teaming Alignment: This step further fine-tunes the model to resist a variety of potential attacks, enhancing its robustness and security. Dataset will be shared soon. In this stage, we set the learning rate to 1e-7.
283
+
284
+ <!---
285
+ ^^^ LINKS TO DPO DATA (DPO added, missing the RT^^^
286
+ -->
287
+
288
+
289
+ We used the [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF) library. We aligned the model using 16x NVIDIA Hopper H100 64GB of the *Marenostrum 5*. Common hyperparameters:
290
+
291
+ - Sequence length: 4096
292
+ - Optimizer: Fused Adam
293
+ - Total batch size: 128
294
+ - Batch size per device: 1
295
+ - Gradient accumulation steps: 8
296
+ - Beta: 0.1
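
To make the role of the `Beta` hyperparameter concrete, the sketch below computes the standard DPO loss for a single preference pair from summed sequence log-probabilities. It is a minimal illustration of the published DPO objective, not OpenRLHF's code; the function name `dpo_loss` and the example log-probabilities are hypothetical.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair: -log(sigmoid(beta * margin))."""
    # Implicit rewards: how much more likely each answer is under the policy
    # than under the frozen reference model, scaled by beta.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # Numerically stable softplus(-margin), equal to -log(sigmoid(margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# Hypothetical log-probabilities: the policy prefers the chosen answer more
# strongly than the reference does, so the loss falls below log(2).
example_loss = dpo_loss(-5.0, -10.0, -10.0, -10.0, beta=0.1)
print(example_loss)
```

A larger `beta` penalizes deviations from the reference model more sharply, while a smaller one lets the policy move further for the same preference margin.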
297
+
298
+
299
+
300
+ ## Evaluation
301
+
302
+ ### Testing Data, Factors & Metrics
303
+
304
+ #### Testing Data
305
+
306
+
307
+ - [ACI-BENCH](https://github.com/wyim/aci-bench)
308
+ - [MTS-Dialog](https://github.com/abachaa/MTS-Dialog)
309
+ - [MedText](https://huggingface.co/datasets/BI55/MedText)
310
+ - [Medical Text classification](https://www.kaggle.com/datasets/chaitanyakck/medical-text/data)
311
+ - [OLAPH](https://github.com/dmis-lab/OLAPH)
312
+ - CareQA Open
313
+ - [MedDialog](https://huggingface.co/datasets/bigbio/meddialog)
314
+ - [MEDIQA QA](https://huggingface.co/datasets/bigbio/mediqa_qa)
315
+ - [Meddialog Qsumm](https://huggingface.co/datasets/lighteval/med_dialog)
316
+ - [Biored](https://huggingface.co/datasets/YufeiHFUT/BioRED_all_info)
317
+ - [MIMIC-III](https://huggingface.co/datasets/dmacres/mimiciii-hospitalcourse-meta)
318
+ - [Medical Prescription](https://huggingface.co/datasets/devlocalhost/prescription-full)
319
+ - [MedQA (USMLE)](https://huggingface.co/datasets/bigbio/med_qa)
320
+ - [MedMCQA](https://huggingface.co/datasets/medmcqa)
321
+ - [PubMedQA](https://huggingface.co/datasets/bigbio/pubmed_qa)
322
+ - [MMLU-Medical](https://huggingface.co/datasets/lukaemon/mmlu)
323
+ - [MedQA-4-Option](https://huggingface.co/datasets/GBaker/MedQA-USMLE-4-options)
324
+ - [CareQA](https://huggingface.co/datasets/HPAI-BSC/CareQA)
325
+ - [Open LLM Leaderboard 2](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
326
+
327
+ <!---
328
+ ^^^ CAREQA Open link MISSING ^^^
329
+ -->
330
+
331
+ #### Metrics
332
+
333
+ - Accuracy: suited to the evaluation of multiple-choice question-answering tasks.
334
+ - ROUGE-1: refers to the overlap of unigrams between the system output and the gold standard.
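
The two metrics above can be sketched in a few lines. This is a simplified illustration, assuming lowercase whitespace tokenization and reporting the F1 variant of ROUGE-1; established implementations apply their own tokenization and optional stemming, so their scores can differ. The function names and example strings are hypothetical and are not taken from the evaluation code used for Aloe.

```python
from collections import Counter


def accuracy(predictions, references):
    """Fraction of predictions that exactly match the reference answers."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


def rouge1_f1(candidate, reference):
    """Unigram-overlap F1 between a system output and a gold-standard text."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Clipped overlap: each unigram counts at most as often as it appears in both.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


mcqa_accuracy = accuracy(["A", "B", "C", "D"], ["A", "B", "D", "D"])
summary_score = rouge1_f1("the patient has fever", "the patient has a fever")
print(mcqa_accuracy, summary_score)
```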
335
+
336
+
337
+ <!---
338
+ ^^^ MORE METRICS MISSING ^^^
339
+ -->
340
+
341
+ #### Summary
342
+
343
+ To compare Aloe with the most competitive open models (both general purpose and healthcare-specific) we use popular healthcare datasets (PubMedQA, MedMCQA, MedQA and MMLU for six medical tasks only), together with the new and highly reliable CareQA. However, while MCQA benchmarks provide valuable insights into a model's ability to handle structured queries, they fall short of representing the full range of challenges faced in medical practice. Building upon this idea, Aloe-Beta represents the next step in the evolution of the Aloe Family, designed to broaden the scope beyond the multiple-choice question-answering tasks that define Aloe-Alpha.
344
+
345
+ Benchmark results indicate the training conducted on Aloe has boosted its performance, achieving results comparable with SOTA models like Llama3-OpenBioLLM, Llama3-Med42, MedPalm-2 and GPT-4. Llama3.1-Aloe-Beta-70B also outperforms the other existing medical models in the OpenLLM Leaderboard and in the evaluation of other medical tasks like Medical Factuality and Medical Treatment recommendations, among others. All these results make Llama3.1-Aloe-Beta-70B one of the best existing models for healthcare.
346
+
347
+
348
+ Benchmark results indicate the training conducted on Qwen2.5-Aloe-Beta-72B has boosted its performance, outperforming all the existing public and private models in the medical MCQA benchmarks. In addition, the model excels in the evaluation of other medical tasks like Medical Factuality and Medical Treatment recommendations, among others.
349
+
350
+ With the help of prompting techniques, the performance of Aloe is significantly improved. Medprompting in particular provides a 4% increase in reported accuracy, after which Qwen2.5-Aloe-Beta-72B outperforms all the existing models that do not use RAG evaluation.
351
+
352
+
353
+ ## Environmental Impact
354
+
355
+ - **Hardware Type:** 32xH100
356
+ - **Hours used (8B):** 544 GPU hours
357
+ - **Hours used (70B):** 4500 GPU hours
358
+ - **Hardware Provider:** Barcelona Supercomputing Center (BSC)
359
+ - **Compute Region:** Spain
360
+ - **Carbon Emitted:** 34.1 kg of CO2
361
+
362
+ <!---
363
+ ^^^ ARE CARBON EMISSIONS FOR BOTH? ^^^
364
+ -->
365
+
366
+
367
+ ## Authors
368
+ Aloe Beta has been developed by the [High Performance Artificial Intelligence](https://hpai.bsc.es/) research group, from the [Barcelona Supercomputing Center - BSC](https://www.bsc.es/). Main authors are [Jordi Bayarri Planas](https://huggingface.co/JordiBayarri), [Ashwin Kumar Gururajan](https://huggingface.co/G-AshwinKumar) and [Dario Garcia-Gasulla](https://huggingface.co/dariog). Red teaming efforts were led by Adrian Tormos.
369
+
370
371
+
372
+ ## Citations
373
+
374
+
375
+ <!---
376
+ Add the prompt engine paper below
377
+ -->
378
+
379
+ If you use this repository in a published work, please cite the corresponding papers as source:
380
+
381
+ ```
382
+ @misc{gururajan2024aloe,
383
+ title={Aloe: A Family of Fine-tuned Open Healthcare LLMs},
384
+ author={Ashwin Kumar Gururajan and Enrique Lopez-Cuena and Jordi Bayarri-Planas and Adrian Tormos and Daniel Hinjos and Pablo Bernabeu-Perez and Anna Arias-Duart and Pablo Agustin Martin-Torres and Lucia Urcelay-Ganzabal and Marta Gonzalez-Mallo and Sergio Alvarez-Napagao and Eduard Ayguadé-Parra and Ulises Cortés and Dario Garcia-Gasulla},
385
+ year={2024},
386
+ eprint={2405.01886},
387
+ archivePrefix={arXiv},
388
+ primaryClass={cs.CL}
389
+ }
390
+ ```