pseudolab
/

K23_MiniMed

Summarization

Adapters

TensorBoard

Safetensors

English

medical

Model card Files Files and versions Metrics Training metrics Community

Tonic commited on Nov 1, 2023

Commit

3701a4c

1 Parent(s): ec009a9

Update README.md

Browse files

Files changed (1) hide show

README.md +190 -112

README.md CHANGED Viewed

@@ -7,69 +7,52 @@ language:
 library_name: adapter-transformers
 ---
-# Model Card for {{ model_id | default("Model ID", true) }}
-<!-- Provide a quick summary of what the model is/does. -->
-{{ model_summary | default("", true) }}
 ## Model Details
 ### Model Description
-<!-- Provide a longer summary of what this model is. -->
-{{ model_description | default("", true) }}
-- **Developed by:** {{ developers | default("[More Information Needed]", true)}}
-- **Funded by [optional]:** {{ funded_by | default("[More Information Needed]", true)}}
-- **Shared by [optional]:** {{ shared_by | default("[More Information Needed]", true)}}
-- **Model type:** {{ model_type | default("[More Information Needed]", true)}}
-- **Language(s) (NLP):** {{ language | default("[More Information Needed]", true)}}
-- **License:** {{ license | default("[More Information Needed]", true)}}
-- **Finetuned from model [optional]:** {{ finetuned_from | default("[More Information Needed]", true)}}
 ### Model Sources [optional]
-<!-- Provide the basic links for the model. -->
-- **Repository:** {{ repo | default("[More Information Needed]", true)}}
-- **Paper [optional]:** {{ paper | default("[More Information Needed]", true)}}
 - **Demo [optional]:** {{ demo | default("[More Information Needed]", true)}}
 ## Uses
-<!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
 ### Direct Use
-<!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-{{ direct_use | default("[More Information Needed]", true)}}
 ### Downstream Use [optional]
-<!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-{{ downstream_use | default("[More Information Needed]", true)}}
-### Out-of-Scope Use
-<!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-{{ out_of_scope_use | default("[More Information Needed]", true)}}
-## Bias, Risks, and Limitations
-<!-- This section is meant to convey both technical and sociotechnical limitations. -->
-{{ bias_risks_limitations | default("[More Information Needed]", true)}}
 ### Recommendations
-<!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-{{ bias_recommendations | default("Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.", true)}}
 ## How to Get Started with the Model
@@ -79,68 +62,50 @@ Use the code below to get started with the model.
 ## Training Details
-### Training Data
-<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-{{ training_data | default("[More Information Needed]", true)}}
-### Training Procedure
-<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-#### Preprocessing [optional]
-{{ preprocessing | default("[More Information Needed]", true)}}
-#### Training Hyperparameters
-- **Training regime:** {{ training_regime | default("[More Information Needed]", true)}} <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
-#### Speeds, Sizes, Times [optional]
-<!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-{{ speeds_sizes_times | default("[More Information Needed]", true)}}
-## Evaluation
-<!-- This section describes the evaluation protocols and provides the results. -->
-### Testing Data, Factors & Metrics
-#### Testing Data
-<!-- This should link to a Dataset Card if possible. -->
-{{ testing_data | default("[More Information Needed]", true)}}
-#### Factors
-<!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-{{ testing_factors | default("[More Information Needed]", true)}}
-#### Metrics
-<!-- These are the evaluation metrics being used, ideally with a description of why. -->
-{{ testing_metrics | default("[More Information Needed]", true)}}
 ### Results
-{{ results | default("[More Information Needed]", true)}}
-#### Summary
-{{ results_summary | default("", true) }}
-## Model Examination [optional]
-<!-- Relevant interpretability work for the model goes here -->
-{{ model_examination | default("[More Information Needed]", true)}}
 ## Environmental Impact
@@ -154,50 +119,163 @@ Carbon emissions can be estimated using the [Machine Learning Impact calculator]
 - **Compute Region:** {{ cloud_region | default("[More Information Needed]", true)}}
 - **Carbon Emitted:** {{ co2_emitted | default("[More Information Needed]", true)}}
-## Technical Specifications [optional]
 ### Model Architecture and Objective
-{{ model_specs | default("[More Information Needed]", true)}}
 ### Compute Infrastructure
-{{ compute_infrastructure | default("[More Information Needed]", true)}}
 #### Hardware
-{{ hardware | default("[More Information Needed]", true)}}
 #### Software
-{{ software | default("[More Information Needed]", true)}}
-## Citation [optional]
-<!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-**BibTeX:**
-{{ citation_bibtex | default("[More Information Needed]", true)}}
-**APA:**
-{{ citation_apa | default("[More Information Needed]", true)}}
-## Glossary [optional]
-<!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-{{ glossary | default("[More Information Needed]", true)}}
-## More Information [optional]
-{{ more_information | default("[More Information Needed]", true)}}
 ## Model Card Authors [optional]
-{{ model_card_authors | default("[More Information Needed]", true)}}
 ## Model Card Contact
-{{ model_card_contact | default("[More Information Needed]", true)}}

 library_name: adapter-transformers
 ---
+# Model Card for K23 MiniMed
+This is a Mistral 7b Beta Medical Fine Tune with a short number of steps , inspired by [Wonhyeong Seo](https://www.huggingface.co/wseo) great mentorship during Krew x Huggingface 2023 hackathon.
 ## Model Details
 ### Model Description
+- **Developed by:** [Tonic](https://huggingface.co/Tonic)
+- **Funded by [optional]:** [Tonic](https://huggingface.co/Tonic)
+- **Shared by [optional]:** K23-Krew-Hackathon
+- **Model type:** Mistral 7B-Beta Medical Fine Tune
+- **Language(s) (NLP):** English
+- **License:** MIT
+- **Finetuned from model [optional]:** [Zephyr 7B-Beta](https://huggingface.co/HuggingFaceH4/zephyr-7b-beta)
 ### Model Sources [optional]
+- **Repository:** [github](https://github.com/Josephrp/AI-challenge-hackathon/blob/master/mistral7b-beta_finetune.ipynb)
 - **Demo [optional]:** {{ demo | default("[More Information Needed]", true)}}
 ## Uses
+Use this model for conversational applications for medical question and answering **for educational purposes only** !
 ### Direct Use
+Make a gradio chatbot app to ask medical questions and get answers conversationaly.
 ### Downstream Use [optional]
+This model is **for educational use only** .
+Further fine tunes and uses would include :
+- public health & sanitation
+- personal health & sanitation
+- medical Q & A
 ### Recommendations
+- always evaluate this model before use
+- always benchmark this model before use
+- always evaluate bias before use
+- do not use as is, fine tune further
 ## How to Get Started with the Model
 ## Training Details
+| Step | Training Loss |
+|------|--------------|
+| 50   | 0.993800     |
+| 100  | 0.620600     |
+| 150  | 0.547100     |
+| 200  | 0.524100     |
+| 250  | 0.520500     |
+| 300  | 0.559800     |
+| 350  | 0.535500     |
+| 400  | 0.505400     |
+### Training Data
+```json
+{trainable params: 21260288 || all params: 3773331456 || trainable%: 0.5634354746703705}
+```
+### Training Procedure
+#### Preprocessing [optional]
+Lora32bits
+#### Speeds, Sizes, Times [optional]
+```json
+ metrics={'train_runtime': 1700.1608, 'train_samples_per_second': 1.882, 'train_steps_per_second': 0.235, 'total_flos': 9.585300996096e+16, 'train_loss': 0.6008514881134033, 'epoch': 0.2})
+```
 ### Results
+```json
+TrainOutput
+global_step=400, training_loss=0.6008514881134033
+```
+#### Summary
 ## Environmental Impact
 - **Compute Region:** {{ cloud_region | default("[More Information Needed]", true)}}
 - **Carbon Emitted:** {{ co2_emitted | default("[More Information Needed]", true)}}
+## Technical Specifications
 ### Model Architecture and Objective
+```python
+PeftModelForCausalLM(
+  (base_model): LoraModel(
+    (model): MistralForCausalLM(
+      (model): MistralModel(
+        (embed_tokens): Embedding(32000, 4096)
+        (layers): ModuleList(
+          (0-31): 32 x MistralDecoderLayer(
+            (self_attn): MistralAttention(
+              (q_proj): Linear4bit(
+                (lora_dropout): ModuleDict(
+                  (default): Dropout(p=0.05, inplace=False)
+                )
+                (lora_A): ModuleDict(
+                  (default): Linear(in_features=4096, out_features=8, bias=False)
+                )
+                (lora_B): ModuleDict(
+                  (default): Linear(in_features=8, out_features=4096, bias=False)
+                )
+                (lora_embedding_A): ParameterDict()
+                (lora_embedding_B): ParameterDict()
+                (base_layer): Linear4bit(in_features=4096, out_features=4096, bias=False)
+              )
+              (k_proj): Linear4bit(
+                (lora_dropout): ModuleDict(
+                  (default): Dropout(p=0.05, inplace=False)
+                )
+                (lora_A): ModuleDict(
+                  (default): Linear(in_features=4096, out_features=8, bias=False)
+                )
+                (lora_B): ModuleDict(
+                  (default): Linear(in_features=8, out_features=1024, bias=False)
+                )
+                (lora_embedding_A): ParameterDict()
+                (lora_embedding_B): ParameterDict()
+                (base_layer): Linear4bit(in_features=4096, out_features=1024, bias=False)
+              )
+              (v_proj): Linear4bit(
+                (lora_dropout): ModuleDict(
+                  (default): Dropout(p=0.05, inplace=False)
+                )
+                (lora_A): ModuleDict(
+                  (default): Linear(in_features=4096, out_features=8, bias=False)
+                )
+                (lora_B): ModuleDict(
+                  (default): Linear(in_features=8, out_features=1024, bias=False)
+                )
+                (lora_embedding_A): ParameterDict()
+                (lora_embedding_B): ParameterDict()
+                (base_layer): Linear4bit(in_features=4096, out_features=1024, bias=False)
+              )
+              (o_proj): Linear4bit(
+                (lora_dropout): ModuleDict(
+                  (default): Dropout(p=0.05, inplace=False)
+                )
+                (lora_A): ModuleDict(
+                  (default): Linear(in_features=4096, out_features=8, bias=False)
+                )
+                (lora_B): ModuleDict(
+                  (default): Linear(in_features=8, out_features=4096, bias=False)
+                )
+                (lora_embedding_A): ParameterDict()
+                (lora_embedding_B): ParameterDict()
+                (base_layer): Linear4bit(in_features=4096, out_features=4096, bias=False)
+              )
+              (rotary_emb): MistralRotaryEmbedding()
+            )
+            (mlp): MistralMLP(
+              (gate_proj): Linear4bit(
+                (lora_dropout): ModuleDict(
+                  (default): Dropout(p=0.05, inplace=False)
+                )
+                (lora_A): ModuleDict(
+                  (default): Linear(in_features=4096, out_features=8, bias=False)
+                )
+                (lora_B): ModuleDict(
+                  (default): Linear(in_features=8, out_features=14336, bias=False)
+                )
+                (lora_embedding_A): ParameterDict()
+                (lora_embedding_B): ParameterDict()
+                (base_layer): Linear4bit(in_features=4096, out_features=14336, bias=False)
+              )
+              (up_proj): Linear4bit(
+                (lora_dropout): ModuleDict(
+                  (default): Dropout(p=0.05, inplace=False)
+                )
+                (lora_A): ModuleDict(
+                  (default): Linear(in_features=4096, out_features=8, bias=False)
+                )
+                (lora_B): ModuleDict(
+                  (default): Linear(in_features=8, out_features=14336, bias=False)
+                )
+                (lora_embedding_A): ParameterDict()
+                (lora_embedding_B): ParameterDict()
+                (base_layer): Linear4bit(in_features=4096, out_features=14336, bias=False)
+              )
+              (down_proj): Linear4bit(
+                (lora_dropout): ModuleDict(
+                  (default): Dropout(p=0.05, inplace=False)
+                )
+                (lora_A): ModuleDict(
+                  (default): Linear(in_features=14336, out_features=8, bias=False)
+                )
+                (lora_B): ModuleDict(
+                  (default): Linear(in_features=8, out_features=4096, bias=False)
+                )
+                (lora_embedding_A): ParameterDict()
+                (lora_embedding_B): ParameterDict()
+                (base_layer): Linear4bit(in_features=14336, out_features=4096, bias=False)
+              )
+              (act_fn): SiLUActivation()
+            )
+            (input_layernorm): MistralRMSNorm()
+            (post_attention_layernorm): MistralRMSNorm()
+          )
+        )
+        (norm): MistralRMSNorm()
+      )
+      (lm_head): Linear(
+        in_features=4096, out_features=32000, bias=False
+        (lora_dropout): ModuleDict(
+          (default): Dropout(p=0.05, inplace=False)
+        )
+        (lora_A): ModuleDict(
+          (default): Linear(in_features=4096, out_features=8, bias=False)
+        )
+        (lora_B): ModuleDict(
+          (default): Linear(in_features=8, out_features=32000, bias=False)
+        )
+        (lora_embedding_A): ParameterDict()
+        (lora_embedding_B): ParameterDict()
+      )
+    )
+  )
+)
+```
 ### Compute Infrastructure
 #### Hardware
+A100
 #### Software
+peft , torch, bitsandbytes, python, huggingface
 ## Model Card Authors [optional]
+[Tonic](https://huggingface.co/Tonic)
 ## Model Card Contact
+[Tonic](https://huggingface.co/Tonic)