JordiBayarri committed
Commit 9bbb197 (verified)
Parent(s): 24f1b1e

Update README.md

Files changed (1): README.md (+3, -4)
README.md CHANGED

@@ -36,7 +36,7 @@ Aloe: A Family of Fine-tuned Open Healthcare LLMs
 
 
 
-Qwen2.5-Aloe-Beta-7B is an **open healthcare LLM** achieving **state-of-the-art performance** on several medical tasks. Aloe Beta is made available in four model sizes: [7B](https://huggingface.co/HPAI-BSC/Qwen2.5-Aloe-Beta-7B/), [8B](https://huggingface.co/HPAI-BSC/Llama3.1-Aloe-Beta-8B), [70B](https://huggingface.co/HPAI-BSC/Llama3.1-Aloe-Beta-70B), and [72B](https://huggingface.co/HPAI-BSC/Qwen2.5-Aloe-Beta-72B). All models are trained using the same recipe, in top of two different family of models: Llama3.1 and Qwen2.5.
+Qwen2.5-Aloe-Beta-7B is an **open healthcare LLM** achieving **state-of-the-art performance** on several medical tasks. Aloe Beta is made available in four model sizes: [7B](https://huggingface.co/HPAI-BSC/Qwen2.5-Aloe-Beta-7B/), [8B](https://huggingface.co/HPAI-BSC/Llama3.1-Aloe-Beta-8B), [70B](https://huggingface.co/HPAI-BSC/Llama3.1-Aloe-Beta-70B), and [72B](https://huggingface.co/HPAI-BSC/Qwen2.5-Aloe-Beta-72B). All models are trained using the same recipe, on top of two different families of models: Llama3.1 and Qwen2.5.
 
 Aloe is trained on 20 medical tasks, resulting in a robust and versatile healthcare model. Evaluations show Aloe models to be among the best in their class. When combined with a RAG system ([also released](https://github.com/HPAI-BSC/prompt_engine)) the 8B version gets close to the performance of closed models like MedPalm-2, GPT4. With the same RAG system, Aloe-Beta-70B outperforms those private alternatives, producing state-of-the-art results.
 
@@ -90,8 +90,7 @@ The Beta model has been developed to excel in several different medical tasks. F
 We also compared the performance of the model in the general domain, using the OpenLLM Leaderboard benchmark. Aloe-Beta gets competitive results with the current SOTA general models in the most used general benchmarks and outperforms the medical models:
 
 
-
-![image/png](https://cdn-uploads.huggingface.co/production/uploads/6620f941eba5274b5c12f83d/imK19fzyMUvIJaAbSVnGE.png)
+![image/png](https://cdn-uploads.huggingface.co/production/uploads/6620f941eba5274b5c12f83d/qJAD38D8XRogP3vlgFf8z.png)
 
 ## Uses
 
@@ -346,7 +345,7 @@ To compare Aloe with the most competitive open models (both general purpose and
 
 Benchmark results indicate the training conducted on Aloe has boosted its performance above all other open models within the same model size. Both Qwen2.5-Aloe-Beta-7B and Llama3.1-Aloe-Beta-8B also outperforms other medical models like Llama3-OpenBioLLM and Llama3-Med42. All these results make Aloe-Beta the best healthcare LLM of its size.
 
-With the help of prompting techniques the performance of Qwen2.5-Aloe-Beta-7B is significantly improved. Medprompting in particular provides a 7% increase in reported accuracy, after which Qwen2.5-Aloe-7B-Beta only lags behind much bigger models like Llama-3.1-70B-Instruct or MedPalm-2. This improvement is mostly consistent across the OpenLLM Leaderboard and the other medical tasks.
+With the help of prompting techniques the performance of Qwen2.5-Aloe-Beta-7B is significantly improved. Medprompting in particular provides a 9% increase in reported accuracy, after which Qwen2.5-Aloe-7B-Beta only lags behind much bigger models like Llama-3.1-70B-Instruct or MedPalm-2. This improvement is mostly consistent across the OpenLLM Leaderboard and the other medical tasks.
 
 ## Environmental Impact
 
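For reference, a minimal sketch of loading the 7B checkpoint linked in the diff for local inference, assuming the standard Hugging Face `transformers` text-generation API; the prompt and generation settings below are illustrative assumptions, not taken from the repository or this commit.

```python
# Minimal sketch (not part of this commit): load the README's 7B checkpoint and
# run a single chat-style generation with the standard transformers API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HPAI-BSC/Qwen2.5-Aloe-Beta-7B"  # repo linked in the diff above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 fits the available hardware
    device_map="auto",
)

# Example prompt is an illustrative assumption, not from the repository.
messages = [{"role": "user", "content": "List common contraindications for ibuprofen."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```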
351