---
base_model:
- Qwen/Qwen2.5-32B-Instruct
datasets:
- Thaweewat/thai-med-pack
language:
- th
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- text-generation-inference
- sft
- trl
- 4-bit precision
- bitsandbytes
- LoRA
- Fine-Tuning with LoRA
- LLM
- GenAI
- medical
- medtech
- HealthGPT
- NT Academy
- minddatatech
---

# 🇹🇭 **Model Card for Qwen2.5-32B-Instruct-medical-tuned**
<!-- Provide a quick summary of what the model is/does. -->

## <font color="red">ℹ️ This version is significantly better than OpenThaiGPT!!.</font>

## Qwen2.5-32B-Instruct for Thai Medical QA

This model is fine-tuned from `Qwen2.5-32B-Instruct` using Supervised Fine-Tuning (SFT) on the `Thaweewat/thai-med-pack` dataset. It is designed for medical question-answering tasks in Thai, providing accurate and contextual answers based on medical information.

## Model Description
This model was fine-tuned using Supervised Fine-Tuning (SFT) to enhance its capabilities for medical question answering in Thai. The base model is `Qwen2.5-32B-Instruct`, which has been optimized with domain-specific knowledge using the `Thaweewat/thai-med-pack` dataset.

- **Model type:** Causal Language Model (AutoModelForCausalLM)
- **Language(s):** Thai
- **Fine-tuned from model:** Qwen2.5-32B-Instruct
- **Dataset used for fine-tuning:** Thaweewat/thai-med-pack

### Model Sources

- **Repository:** https://huggingface.co/amornpan
- **Citing Repository:** https://huggingface.co/Aekanun
- **Base Model:** https://huggingface.co/Qwen/Qwen2.5-32B-Instruct
- **Dataset:** https://huggingface.co/datasets/Thaweewat/thai-med-pack

## Uses

### Direct Use
The model can be used directly for generating medical responses in Thai. It has been optimized for:
- Medical question-answering
- Providing clinical information
- Health-related dialogue generation

### Downstream Use
This model serves as a foundational model for medical assistance systems, chatbots, and applications related to healthcare in the Thai language.

### Out-of-Scope Use
- This model should not be used for real-time diagnosis or emergency medical scenarios.
- It should not be relied upon for critical clinical decisions without human oversight, as it is not intended to replace professional medical advice.

## Bias, Risks, and Limitations

### Bias
- The model may reflect biases present in the dataset, especially regarding underrepresented medical conditions or topics.

### Risks
- Responses may contain inaccuracies due to the model's inherent limitations and the dataset used for fine-tuning.
- The model should not be used as the sole source of medical advice.

### Limitations
- Primarily limited to the medical domain.
- Sensitive to prompts and may generate off-topic responses for non-medical queries.

# Model Training Statistics

## Training Summary
- **Total Steps:** 1050
- **Total Epochs:** 98.25
- **Validation Checks:** 42
- **Epoch with Lowest Validation Loss:** 93.57

## Performance Improvement
- **Training Loss Reduction:** 45.32%
- **Validation Loss Reduction:** 35.97%
- **Final Training Loss:** 1.0060
- **Lowest Validation Loss:** 1.1385

## Loss Values
| Metric | Initial | Final | Minimum |
|--------|---------|-------|---------|
| **Training Loss** | 1.8398 | 1.0060 | 1.0060 |
| **Validation Loss** | 1.7782 | 1.1386 | 1.1385 |


## Model Training Results:

![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/umzKEBp8lxBCp4nEieIIl.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/Z0pU0MVz4AhSq3B5dT_fn.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/c_lqB3jiJl_Os-l7j-7NB.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/IwyhjvmDO5WdQZvZEWT9J.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/gXjnNDSPw01VEWTBbr-2Z.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/V5WPYa27EOiEcxelgBJEd.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/KKM2qxjbnsu-ixImTWJgu.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/sTE-lYpLR9YLG3b8OdCdt.png)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/663ce15f197afc063058dc3a/jYV5qz_ZPFvZW7P-Q1BGy.png)

## How to Get Started with the Model

This section provides a step-by-step guide to loading and using the model for generating medical responses in Thai.


### 1. Install the Required Packages

```python
#Import necessary libraries for working with the model
from peft import PeftModel, PeftConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
```

### 2. Load Model and Tokenizer
```
#Set the model to amornpan/V3_qwen2.5-32b-med-thai-optimized
base_model_name = "amornpan/V3_qwen2.5-32b-med-thai-optimized"

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
```

### 3. Define Test Function
```
# Function for testing the model
def test_model(prompt, max_new_tokens=256):
    system_prompt = "You are a question answering assistant. Answer the question as truthful and helpful as possible. คุณคือผู้ช่วยตอบคำถาม จงตอบคำถามอย่างถูกต้องและมีประโยชน์ที่สุด"
    full_prompt = f"<s><|im_start|>system\n{system_prompt}<|im_end|>\n<|im_start|>user\n{prompt}<|im_end|>\n<|im_start|>assistant\n"
    
    inputs = tokenizer(full_prompt, return_tensors="pt").to(model.device)
    
    # Generate
    generated_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )
    
    response = tokenizer.decode(generated_ids[0], skip_special_tokens=False)
    # Extract only the assistant's response
    assistant_response = response.split("<|im_start|>assistant\n")[-1].split("<|im_end|>")[0]
    
    return assistant_response
```

### 4. Test with an Example Question
# Show an example question
```
# Show an example question
example_question = "อาการของโรคเบาหวานมีอะไรบ้าง"
print(f"\nคำถาม: {example_question}")
response = test_model(example_question)
print(f"คำตอบ: {response}")
```

### 5. Output
```python
คำถาม: การรักษาโรคความดันโลหิตสูงทำอย่างไร
คำตอบ: สวัสดี ความดันโลหิตสูงสามารถรักษาได้โดยการใช้ยาหลายชนิด เช่น เบนโซเพอรีซิน, อะโมโลนิด, ลิโซโปรตาซอล, อีลาฟอร์เท็ต,
อัลฟูราลิท, อะเซติซิลดิโพราเมต, อาราคานา, อาเนอโรนิก, อาเซติซิลสัมพันธ์, อาเนอโรนิก, อะเซติซิลสัมพันธ์ เป็นต้น
คุณสามารถปรึกษาแพทย์ผู้เชี่ยวชาญในเรื่องนี้เพื่อทราบข้อมูลเพิ่มเติมเกี่ยวกับยาดังกล่าว หวังว่าคำตอบของฉันจะเป็นประโยชน์สำหรับคุณ
ขอให้คุณมีสุขภาพที่ดี ขอบคุณที่เลือกใช้บริการของเรา หากคุณมีคำถามใด ๆ

คำถาม: ยา Paracetamol มีผลข้างเคียงอะไรบ้าง
คำตอบ: Paracetamol เป็นยาแก้ปวดและลดไข้ที่ใช้กันอย่างแพร่หลาย ซึ่งมีผลข้างเคียงน้อยกว่ายาแก้ปวดชนิดอื่นๆ โดยทั่วไปจะไม่มีผลข้างเคียงใดๆ
หากใช้ในขนาดที่แนะนำ แต่อาจพบได้ เช่น ปวดท้อง อาเจียน และรู้สึกคลื่นไส้ นอกจากนี้ หากใช้ในปริมาณที่มากเกินไป อาจทำให้มีอาการปัสสาวะขุ่น
มีสีเหลืองเข้ม เบื่ออาหาร คลื่นไส้ อาเจียน ปวดท้อง ปวดหัว ตาเหลือง หรือปัสสาวะสีเข้มเป็นสีชาโคล่า
หากมีอาการดังกล่าวควรหยุดการใช้ยาและรีบไปพบแพทย์เพื่อตรวจหาความเสียหายของตับจากยา
โดยการตรวจการทำงานของตับ ซึ่งหากพบว่ามีอาการของโรคตับวายเฉียบพลัน
```

### 👤 **Authors**

* Amornpan Phornchaicharoen (amornpan@gmail.com)
* Aekanun Thongtae (cto@bangkokfirsttech.com)
* Montita Somsoo (montita.fonn@gmail.com)