Update README.md

README.md
Model: 4-bit Mistral-7B-Instruct-v0.2 finetuned with QLoRA on multiple medical datasets.
<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->

The model is finetuned on medical data and is intended for research. However, it should not be used as a substitute for professional medical advice, diagnosis, or treatment.
## Bias, Risks, and Limitations

<!-- This section is meant to convey both technical and sociotechnical limitations. -->

The model's predictions are based on the information available in the finetuned medical dataset. It may not generalize well to all medical conditions or diverse patient populations.

Sensitivity to variations in input data and potential biases present in the training data may impact the model's performance.

### Recommendations
Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model.

Use the code below to get started with the model.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("adriata/med_mistral")
# Move the model to the GPU so it matches the device of the encoded inputs.
model = AutoModelForCausalLM.from_pretrained("adriata/med_mistral").to("cuda")

prompt_template = """<s>[INST] {prompt} [/INST]"""

prompt = "What is influenza?"

model_inputs = tokenizer.encode(prompt_template.format(prompt=prompt),
                                return_tensors="pt").to("cuda")

generated_ids = model.generate(model_inputs, max_new_tokens=512, do_sample=True)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```
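The `prompt_template` above follows the Mistral-instruct chat format: the BOS token, then the user turn wrapped in `[INST] … [/INST]`. As a minimal sketch of just the formatting step (the `build_prompt` helper is illustrative, not part of the model card):

```python
# Mistral-instruct format: BOS token, then the user turn in [INST] ... [/INST].
prompt_template = "<s>[INST] {prompt} [/INST]"

def build_prompt(question: str) -> str:
    """Wrap a raw question in the instruction template (illustrative helper)."""
    return prompt_template.format(prompt=question)

print(build_prompt("What is influenza?"))
# -> <s>[INST] What is influenza? [/INST]
```

Prompts that skip this wrapping will still generate text, but instruction-tuned Mistral models generally respond better when the `[INST]` markers are present.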
|
79 |
|
80 |
## Training Details
|
81 |
+
~13h - 20k examples x 1 epoch
|
82 |
+
GPU: OVH - 1 × NVIDIA TESLA V100S (32 GiB RAM)
|
83 |
|
84 |
### Training Data
|
85 |
|
86 |
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
|
87 |
+
Training data included 20k examples randomly selected from datasets:
|
88 |
- pubmed
|
89 |
- bigbio/czi_drsm
|
90 |
- bigbio/bc5cdr
|