---
language:
- en
library_name: transformers
---
# JiviMed-8B_v1: A Cutting-Edge Biomedical Language Model

JiviMed-8B is a language model tailored specifically for the biomedical sector. Developed by Jivi AI, it incorporates the latest advancements to deliver strong performance across a wide range of biomedical applications.

*Tailored for Medicine*: JiviMed-8B is meticulously designed to cater to the specialized language and knowledge requirements of the medical and life sciences industries. It has been fine-tuned using an extensive collection of high-quality biomedical data, enhancing its ability to accurately comprehend and generate domain-specific text.

*Unmatched Performance*: With 8 billion parameters, JiviMed-8B outperforms other open-source biomedical language models of similar size and surpasses larger proprietary and open-source models such as GPT-3.5, Meditron-70B, and Gemini 1.0 on several biomedical benchmarks.

*Enhanced Training Methodologies*: JiviMed-8B builds upon the Meta-Llama-3-8B base model, combining a specially curated, diverse medical dataset with an ORPO fine-tuning strategy (a fine-tuning sketch follows the hyperparameter details below). Key elements of our training process include:

1. Intensive Data Preparation: Over 100,000 data points were meticulously curated to ensure the model is well-versed in the nuances of biomedical language.
2. Hyperparameter Tuning: Hyperparameters were carefully optimized to improve learning efficiency while avoiding catastrophic forgetting, maintaining robust performance across tasks.

JiviMed-8B redefines what's possible in biomedical language modeling, setting new standards for accuracy, versatility, and performance in the medical domain.
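
A minimal inference sketch using the `transformers` library is shown below. The repository id is a placeholder assumption (substitute the actual Hugging Face Hub id for JiviMed-8B_v1), and the prompt and generation settings are illustrative only:

```python
# Minimal inference sketch; the repository id below is a placeholder, not the confirmed Hub id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jivi-ai/JiviMed-8B_v1"  # placeholder: replace with the actual Hub repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = "What are the first-line treatments for type 2 diabetes?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```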


## Model Comparison

| Model Name                                         | Average | MedMCQA | MedQA | MMLU Anatomy | MMLU Clinical Knowledge | MMLU College Biology | MMLU College Medicine | MMLU Medical Genetics | MMLU Professional Medicine | PubMedQA |
|----------------------------------------------------|---------|---------|-------|--------------|------------------------|----------------------|-----------------------|------------------------|------------------------------|----------|
| **Jivi_medium_med_v1**                             | 75.53   | 60.1    | 60.04 | 77.04        | 82.26                  | 86.81                | 73.41                 | 86                     | 80.08                        | 72.6     |
| Flan-PaLM                                          | 74.7    | 57.6    | 67.6  | 63.7         | 80.4                   | 88.9                 | 76.3                  | 75                     | 83.8                         | 79       |
| winninghealth/WiNGPT2-Llama-3-8B-Base              | 72.1    | 55.65   | 67.87 | 69.63        | 75.09                  | 78.47                | 65.9                  | 84                     | 78.68                        | 73.6     |
| meta-llama/Meta-Llama-3-8B                         | 69.9    | 57.47   | 59.7  | 68.89        | 74.72                  | 78.47                | 61.85                 | 83                     | 70.22                        | 74.8     |
| meta-llama/Meta-Llama-3-8B                         | 69.81   | 57.69   | 60.02 | 68.89        | 74.72                  | 78.47                | 60.12                 | 83                     | 70.22                        | 75.2     |
| unsloth/gemma-7b                                   | 64.18   | 48.96   | 47.21 | 59.26        | 69.81                  | 79.86                | 60.12                 | 70                     | 66.18                        | 76.2     |
| mistralai/Mistral-7B-v0.1                          | 62.85   | 48.2    | 50.82 | 55.56        | 68.68                  | 68.06                | 59.54                 | 71                     | 68.38                        | 75.4     |
| BioMistral/BioMistral-7B-Zephyr-Beta-SLeRP         | 61.52   | 46.52   | 50.2  | 55.56        | 63.02                  | 65.28                | 61.27                 | 72                     | 63.24                        | 76.6     |
| BioMistral/BioMistral-7B-SLERP                     | 59.58   | 44.13   | 47.29 | 51.85        | 66.42                  | 65.28                | 58.96                 | 69                     | 55.88                        | 77.4     |
| BioMistral/BioMistral-7B-DARE                      | 59.45   | 44.66   | 47.37 | 53.33        | 66.42                  | 62.5                 | 58.96                 | 68                     | 56.25                        | 77.6     |
| OpenModels4all/gemma-1-7b-it                       | 58.37   | 44.56   | 45.01 | 52.59        | 62.64                  | 68.75                | 57.23                 | 67                     | 55.15                        | 72.4     |
| medalpaca/medalpaca-7b                             | 58.03   | 37.51   | 41.71 | 57.04        | 57.36                  | 65.28                | 54.34                 | 69                     | 67.28                        | 72.8     |
| BioMistral/BioMistral-7B                           | 56.36   | 41.48   | 46.11 | 51.11        | 63.77                  | 61.11                | 53.76                 | 66                     | 52.94                        | 71       |


![model_accuracy](https://cdn-uploads.huggingface.co/production/uploads/65d31285220242a508a30523/sBHSX5Z0n0V1jTUpAxzX8.png)


<details>

<summary>Hyperparameters</summary>

PEFT (LoRA)
* lora_r: 64
* lora_alpha: 128
* lora_dropout: 0.05
* lora_target_linear: true

Target modules
* q_proj
* v_proj
* k_proj
* o_proj
* gate_proj
* down_proj
* up_proj
</details>
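
The settings above map directly onto a PEFT `LoraConfig` combined with TRL's `ORPOTrainer`. The sketch below shows how such a run could be wired up; only the LoRA values and target modules come from this card, while the dataset path, learning rate, batch sizes, and other training arguments are illustrative assumptions rather than the exact recipe used:

```python
# Illustrative ORPO + LoRA fine-tuning sketch. LoRA hyperparameters mirror the card;
# the dataset and training schedule are assumptions for demonstration only.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import ORPOConfig, ORPOTrainer

base_id = "meta-llama/Meta-Llama-3-8B"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# LoRA settings from the "Hyperparameters" section above.
peft_config = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj", "gate_proj", "down_proj", "up_proj"],
    task_type="CAUSAL_LM",
)

# ORPO expects preference data with "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("json", data_files="medical_preferences.jsonl", split="train")  # placeholder path

args = ORPOConfig(
    output_dir="jivimed-orpo",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,
    learning_rate=8e-6,   # illustrative value
    num_train_epochs=1,
    beta=0.1,             # ORPO preference-weighting coefficient (TRL default)
    bf16=True,
)

trainer = ORPOTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```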