prithivMLmods commited on
Commit
329e77d
·
verified ·
1 Parent(s): 853f720

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +77 -0
README.md CHANGED
@@ -22,6 +22,8 @@ tags:
22
 
23
  ### Qwen-UMLS-7B-Instruct [ Unified Medical Language System ]
24
 
 
 
25
  | **File Name** | **Size** | **Description** | **Upload Status** |
26
  |-----------------------------------------|----------------|-------------------------------------------------|--------------------|
27
  | `.gitattributes` | 1.57 kB | File to specify LFS rules for large file tracking. | Uploaded |
@@ -40,5 +42,80 @@ tags:
40
  | `tokenizer_config.json` | 7.73 kB | Configuration file for the tokenizer. | Uploaded |
41
  | `vocab.json` | 2.78 MB | Vocabulary file for tokenization. | Uploaded |
42
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
43
  ---
44
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
22
 
23
  ### Qwen-UMLS-7B-Instruct [ Unified Medical Language System ]
24
 
25
+ The **Qwen-UMLS-7B-Instruct** model is a specialized, instruction-tuned language model designed for medical and healthcare-related tasks. It is fine-tuned on the **Qwen2.5-7B-Instruct** base model using the **UMLS (Unified Medical Language System)** dataset, making it an invaluable tool for medical professionals, researchers, and developers building healthcare applications.
26
+
27
  | **File Name** | **Size** | **Description** | **Upload Status** |
28
  |-----------------------------------------|----------------|-------------------------------------------------|--------------------|
29
  | `.gitattributes` | 1.57 kB | File to specify LFS rules for large file tracking. | Uploaded |
 
42
  | `tokenizer_config.json` | 7.73 kB | Configuration file for the tokenizer. | Uploaded |
43
  | `vocab.json` | 2.78 MB | Vocabulary file for tokenization. | Uploaded |
44
 
45
+ ### **Key Features:**
46
+
47
+ 1. **Medical Expertise:**
48
+ - Trained on the UMLS dataset, ensuring deep domain knowledge in medical terminology, diagnostics, and treatment plans.
49
+
50
+ 2. **Instruction-Following:**
51
+ - Designed to handle complex queries with clarity and precision, suitable for diagnostic support, patient education, and research.
52
+
53
+ 3. **High-Parameter Model:**
54
+ - Leverages 7 billion parameters to deliver detailed, contextually accurate responses.
55
+
56
+ ---
57
+
58
+ ### **Training Details:**
59
+
60
+ - **Base Model:** [Qwen2.5-7B-Instruct](#)
61
+ - **Dataset:** [avaliev/UMLS](#)
62
+ - Comprehensive dataset of medical terminologies, relationships, and use cases with 99.1k samples.
63
+ ---
64
+ ### **Capabilities:**
65
+
66
+ 1. **Clinical Text Analysis:**
67
+ - Interpret medical notes, prescriptions, and research articles.
68
+
69
+ 2. **Question-Answering:**
70
+ - Answer medical queries, provide explanations for symptoms, and suggest treatments based on user prompts.
71
+
72
+ 3. **Educational Support:**
73
+ - Assist in learning medical terminologies and understanding complex concepts.
74
+
75
+ 4. **Healthcare Applications:**
76
+ - Integrate into clinical decision-support systems or patient care applications.
77
+ ---
78
+ ### **Usage Instructions:**
79
+
80
+ 1. **Setup:**
81
+ Download all files and ensure compatibility with the Hugging Face Transformers library.
82
+
83
+ 2. **Loading the Model:**
84
+ ```python
85
+ from transformers import AutoModelForCausalLM, AutoTokenizer
86
+
87
+ model_name = "prithivMLmods/Qwen-UMLS-7B-Instruct"
88
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
89
+ model = AutoModelForCausalLM.from_pretrained(model_name)
90
+ ```
91
+
92
+ 3. **Generate Medical Text:**
93
+ ```python
94
+ input_text = "What are the symptoms and treatments for diabetes?"
95
+ inputs = tokenizer(input_text, return_tensors="pt")
96
+ outputs = model.generate(**inputs, max_length=200, temperature=0.7)
97
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
98
+ ```
99
+
100
+ 4. **Customizing Outputs:**
101
+ Modify `generation_config.json` to optimize output style:
102
+ - `temperature` for creativity vs. determinism.
103
+ - `max_length` for concise or extended responses.
104
+
105
  ---
106
 
107
+ ### **Applications:**
108
+
109
+ 1. **Clinical Support:**
110
+ - Assist healthcare providers with quick, accurate information retrieval.
111
+
112
+ 2. **Patient Education:**
113
+ - Provide patients with understandable explanations of medical conditions.
114
+
115
+ 3. **Medical Research:**
116
+ - Summarize or analyze complex medical research papers.
117
+
118
+ 4. **AI-Driven Diagnostics:**
119
+ - Integrate with diagnostic systems for preliminary assessments.
120
+
121
+ ---