---
language:
- en
- ko
tags:
- qwen
- lora
- rag
- instruction-tuning
- email
- qwen-2.5
- peft
- question-answering
library_name: peft
pipeline_tag: text-generation
license: mit
---

# Qwen-RAG-LoRA

This repository contains LoRA weights for [Qwen/Qwen2.5-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct), fine-tuned for email-based question answering. The adapter is trained to answer both English and Korean queries about email content.

## Model Description

- **Base Model:** Qwen/Qwen2.5-7B-Instruct
- **Training Type:** LoRA (Low-Rank Adaptation)
- **Checkpoint:** checkpoint-600
- **Languages:** English and Korean
- **Task:** Email-based Question Answering
- **Domain:** Email Content

## Training Details

### LoRA Configuration
```python
from peft import LoraConfig

# Rank-8 adapters on all attention and MLP projections of Qwen2.5
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM"
)
```
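
If vLLM is not available, the adapter should also load with `transformers` and `peft`. The following is a minimal sketch (not an official snippet from the training code), using the repository id from the usage example below:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-7B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")

# Attach the LoRA adapter weights from this repository
model = PeftModel.from_pretrained(base_model, "doubleyyh/qwen-rag-lora")
model.eval()
```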

## Usage with vLLM

```python
from huggingface_hub import snapshot_download
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Initialize LLM with LoRA support
llm = LLM(
    model="Qwen/Qwen2.5-7B-Instruct",
    tensor_parallel_size=2,
    enable_lora=True
)

sampling_params = SamplingParams(temperature=0.0, max_tokens=50)

# Download the adapter weights and create a LoRA request
# (LoRARequest expects a local path, so fetch the repo from the Hub first)
lora_path = snapshot_download(repo_id="doubleyyh/qwen-rag-lora")
lora_request = LoRARequest("rag_adapter", 1, lora_path)

# Example prompt
prompt = """Using the context provided below, answer the question concisely. Respond in Korean if the question is in Korean, and in English if the question is in English.

Context: subject: Meeting Schedule Update
from: [['John Smith', '[email protected]']]
to: [['Team', '[email protected]']]
text_body: The project review meeting is rescheduled to 3 PM tomorrow.

Question: When is the meeting rescheduled to?

Answer: """

# Generate with LoRA
outputs = llm.generate([prompt], sampling_params, lora_request=lora_request)
print(outputs[0].outputs[0].text)
```
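
The prompt serializes email fields as plain `key: value` lines. A small helper like the following (hypothetical; `build_prompt` is not part of this repo) keeps that format consistent across queries:

```python
def build_prompt(email: dict, question: str) -> str:
    """Serialize an email dict into the context block used in the prompt above."""
    context = "\n".join(f"{key}: {value}" for key, value in email.items())
    return (
        "Using the context provided below, answer the question concisely. "
        "Respond in Korean if the question is in Korean, and in English if "
        "the question is in English.\n\n"
        f"Context: {context}\n\n"
        f"Question: {question}\n\n"
        "Answer: "
    )

email = {
    "subject": "Meeting Schedule Update",
    "from": [["John Smith", "[email protected]"]],
    "to": [["Team", "[email protected]"]],
    "text_body": "The project review meeting is rescheduled to 3 PM tomorrow.",
}
prompt = build_prompt(email, "When is the meeting rescheduled to?")
```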

## Example Input/Output

```
# English Query
Q: When is the project review scheduled?
A: The project review meeting is rescheduled to 3 PM tomorrow.

# Korean Query
Q: ํ”„๋กœ์ ํŠธ ๋ฏธํŒ…์ด ์–ธ์ œ๋กœ ๋ณ€๊ฒฝ๋˜์—ˆ๋‚˜์š”?
A: ๋‚ด์ผ ์˜คํ›„ 3์‹œ๋กœ ๋ณ€๊ฒฝ๋˜์—ˆ์Šต๋‹ˆ๋‹ค.
```

## Limitations

- The adapter is trained specifically for email-related queries and may not generalize to other domains
- Performance may vary between English and Korean
- Results are best when the email content follows the structured format shown above (subject, from, to, text_body)
- The adapter inherits the capabilities and limitations of the base Qwen2.5 model

## Citation

```bibtex
@misc{qwen-rag-lora,
  author = {doubleyyh},
  title = {Qwen-RAG-LoRA: Fine-tuned LoRA Weights for Email QA},
  year = {2024},
  publisher = {Hugging Face}
}
```

## License

These LoRA weights are released under the MIT license (as declared in the model card metadata above); use of the base model remains subject to the Qwen2.5-7B-Instruct license.