Create README.md
#2
by
ZijieLei
- opened
README.md
ADDED
@@ -0,0 +1,190 @@
---
datasets:
- ulab-ai/FusionBench
---
# Fusing LLM Capabilities with Routing Data

<p align="center">
<a href="https://ulab-uiuc.github.io/FusionFactory/">
<img alt="Project Page" src="https://img.shields.io/badge/Project-Page-blue">
</a>
<a href="http://arxiv.org/abs/2507.10540">
<img alt="arXiv" src="https://img.shields.io/badge/arXiv-2507.10540-red?logo=arxiv">
</a>
<!-- <a href="xxx">
<img alt="Twitter" src="https://img.shields.io/badge/Twitter-black?logo=X">
</a> -->
<a href="https://github.com/ulab-uiuc/FusionFactory/blob/master/LICENSE">
<img alt="License" src="https://img.shields.io/badge/LICENSE-MIT-green">
</a>
<br>
<a href="https://github.com/ulab-uiuc/FusionFactory">
<img alt="Stars" src="https://img.shields.io/github/stars/ulab-uiuc/FusionFactory">
</a>
<a href="https://github.com/ulab-uiuc/FusionFactory">
<img alt="Forks" src="https://img.shields.io/github/forks/ulab-uiuc/FusionFactory">
</a>
<a href="https://github.com/ulab-uiuc/FusionFactory">
<img alt="Issues" src="https://img.shields.io/github/issues/ulab-uiuc/FusionFactory">
</a>
</p>

<p align="center">
<a href="https://ulab-uiuc.github.io/FusionFactory/">🌐 Project Page</a> |
<a href="http://arxiv.org/abs/2507.10540">📜 arXiv</a> |
<a href="https://huggingface.co/datasets/ulab-ai/FusionBench">📂 Dataset</a> |
<a href="https://huggingface.co/ulab-ai/FusionFactory">🤖 Model</a> |
<a href="https://huggingface.co/spaces/ulab-ai/RoutePilot">🖥️ Demo</a>
</p>

<div align="center">
<img src="./figures/fusion.jpg" width="700" alt="FusionBench">
<p><b>Overview of LLM capability fusion via FusionFactory with three representative levels: Query-level, Thought-level, and Model-level.</b></p>
</div>

## News

**[2025.06]** 🌟 **FusionFactory** was released.

## 🛠️ Environment Setup

```bash
conda create -n fusionfactory python=3.9
conda activate fusionfactory
pip install pandas
pip install datasets
pip install tqdm
pip install transformers
pip install sentence_transformers
pip install torch
pip install numpy
```

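As a quick sanity check after installation, the sketch below reports which of the packages listed above cannot be found in the active environment. It is not part of the repository; the helper name `missing_packages` is made up for illustration, and it only tests that each module can be located, not that its version is compatible.

```python
from importlib import util

# Module names for the packages installed above.
REQUIRED = ["pandas", "datasets", "tqdm", "transformers",
            "sentence_transformers", "torch", "numpy"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```
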
## 🎯 Data Process

Run the following command to start data collection:

```bash
# --split: train or test
# --case_num: 500 for train, 50 for partial test
# Sample LLM description file: ./data_process/LLM_Descriptions.json
python data_process/data_combine.py \
--split train \
--case_num 500 \
--round 5 \
--llm_description_path [YOUR_LLM_PATH] \
--csv_save_path [YOUR_SAVE_PATH] \
--api_base [YOUR_API_BASE] \
--api_key [YOUR_API_KEY]
```

See the README in the [`data_process`](data_process/README.md) directory for detailed argument descriptions.

To add quality scores to the collected data using an LLM judge:

```bash
python data_process/add_llm_judge.py
```

This evaluates each response and adds quality scores to the dataset, which can then be used for training and evaluation. See [`data_process/README.md`](data_process/README.md) for more details.

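If you prefer to launch data collection from Python, for example to keep the API key out of your shell history, the following sketch assembles the same `data_combine.py` invocation as an argument list. Only the flags come from the command above; the function name `build_data_combine_cmd`, the output file name, and the environment variable names `FUSION_API_BASE` and `FUSION_API_KEY` are hypothetical.

```python
import os
import sys


def build_data_combine_cmd(split, case_num, rounds, llm_description_path,
                           csv_save_path, api_base, api_key):
    """Assemble the data_combine.py invocation shown above as an argv list."""
    if split not in ("train", "test"):
        raise ValueError("split must be 'train' or 'test'")
    return [
        sys.executable, "data_process/data_combine.py",
        "--split", split,
        "--case_num", str(case_num),
        "--round", str(rounds),
        "--llm_description_path", llm_description_path,
        "--csv_save_path", csv_save_path,
        "--api_base", api_base,
        "--api_key", api_key,
    ]


if __name__ == "__main__":
    # FUSION_API_BASE / FUSION_API_KEY are hypothetical variable names.
    cmd = build_data_combine_cmd(
        split="test",
        case_num=50,
        rounds=5,
        llm_description_path="./data_process/LLM_Descriptions.json",
        csv_save_path="test_partial.csv",  # hypothetical output path
        api_base=os.environ.get("FUSION_API_BASE", ""),
        api_key=os.environ.get("FUSION_API_KEY", ""),
    )
    # Print everything except the final element (the key) to avoid leaking it.
    print(" ".join(cmd[:-1]), "<redacted>")
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```
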
## 📊 Experiments

### Query-level Fusion

First, run the data preprocessing script to prepare the dataset:

```bash
# Preprocess the dataset and generate training/testing files
python query_level/data_processing.py
```

For details on data preprocessing and model training, see the README in the [`query_level`](query_level/README.md) directory.

### Thought-level Fusion

First, run the data preprocessing script to prepare the thought prompts:

```bash
# Preprocess the dataset and generate training/testing files
python query_level/data_processing.py
```

Alternatively, generate thought-enhanced queries directly from the Hugging Face datasets:

```bash
python thought_level/get_thought_prompt.py
```

For details on data preprocessing and model training, see the README in the [`thought_level`](thought_level/README.md) directory.

### Model-level Fusion

See [LLaMA-Factory](https://github.com/hiyouga/LLaMA-Factory) for detailed instructions on fine-tuning with model-level fusion data. First clone the LLaMA-Factory repository into the FusionBench directory, then run the following commands to generate SFT data for model-level fusion:

```bash
# setting: perf, judge, hybrid, baseline
python model_level/sft_data_gen.py --setting perf --k 5 --save_path [YOUR_PATH] --csv_path_with_judge [YOUR_PATH]

python model_level/sft_test_gen.py --save_path [YOUR_PATH] --csv_path [YOUR_PATH]
```

After completing the essential configuration described in the [LLaMA-Factory documentation](https://llamafactory.readthedocs.io/en/latest/), start SFT and inference with the following commands:

```bash
# SFT
FORCE_TORCHRUN=1 CUDA_VISIBLE_DEVICES=2,3,4,5 llamafactory-cli train examples/train_lora/[YOUR_YAML].yaml

# Inference
CUDA_VISIBLE_DEVICES=2,3,4,5 python scripts/vllm_infer.py --model_name_or_path meta-llama/Llama-3.1-8B-Instruct --adapter_name_or_path saves/llama3.1-8b/lora/[YOUR_PATH] --dataset router_test --cutoff_len 2048
```

See the README in the [`model_level`](model_level/README.md) directory for detailed instructions.

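To give a rough intuition for the `--setting` and `--k` options, the sketch below shows one plausible reading: for each query, keep the `k` responses with the highest score under the chosen criterion. This is an illustration only and is not taken from `model_level/sft_data_gen.py`. The row keys (`query`, `response`, `performance`, `llm_judge`), the equal-weight `hybrid` score, and the function name are all assumptions, and the `baseline` setting is not modelled.

```python
def top_k_responses(rows, setting="perf", k=5):
    """Pick the k highest-scoring responses per query.

    rows: list of dicts with hypothetical keys 'query', 'response',
    'performance' and 'llm_judge' (both scores assumed to lie in [0, 1]).
    """
    scorers = {
        "perf": lambda r: r["performance"],
        "judge": lambda r: r["llm_judge"],
        # Assumption: 'hybrid' is modelled here as an equal-weight average.
        "hybrid": lambda r: 0.5 * r["performance"] + 0.5 * r["llm_judge"],
    }
    if setting not in scorers:
        raise ValueError(f"unsupported setting: {setting}")
    score = scorers[setting]

    # Group candidate responses by the query they answer.
    by_query = {}
    for row in rows:
        by_query.setdefault(row["query"], []).append(row)

    # Rank each group by the chosen score and keep the first k.
    selected = []
    for candidates in by_query.values():
        ranked = sorted(candidates, key=score, reverse=True)
        selected.extend(ranked[:k])
    return selected
```

The selected rows could then be written out as instruction/response pairs for SFT; the actual output format expected by LLaMA-Factory is described in its documentation.
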
## 📈 Evaluation

FusionBench provides an evaluation framework for assessing model performance across the following task types:

- Mathematical Reasoning (GSM8K, MATH)
- Code Generation (MBPP, HumanEval)
- Commonsense Reasoning (CommonsenseQA, OpenBookQA, ARC Challenge, HellaSwag)
- World Knowledge (Natural Questions, TriviaQA)
- Reading Comprehension (SQuAD, BoolQ)
- Popular Benchmarks (MMLU, GPQA)

To evaluate your model's performance:

```bash
python eval/response_eval.py
```

For details on the evaluation framework, supported metrics, and usage, see the [Evaluation Documentation](eval/README.md).

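When summarizing results, it can help to group per-benchmark scores by the task categories listed above. The sketch below encodes that list as a lookup table and averages whatever scores are available in each category. The function name and the unweighted mean are assumptions for illustration; `eval/response_eval.py` may aggregate results differently.

```python
from statistics import mean

# Task categories and benchmarks as listed in this README.
TASK_CATEGORIES = {
    "Mathematical Reasoning": ["GSM8K", "MATH"],
    "Code Generation": ["MBPP", "HumanEval"],
    "Commonsense Reasoning": ["CommonsenseQA", "OpenBookQA", "ARC Challenge", "HellaSwag"],
    "World Knowledge": ["Natural Questions", "TriviaQA"],
    "Reading Comprehension": ["SQuAD", "BoolQ"],
    "Popular Benchmarks": ["MMLU", "GPQA"],
}


def category_averages(benchmark_scores):
    """Average per-benchmark scores within each category; skip unscored benchmarks."""
    averages = {}
    for category, benchmarks in TASK_CATEGORIES.items():
        scores = [benchmark_scores[b] for b in benchmarks if b in benchmark_scores]
        if scores:
            averages[category] = mean(scores)
    return averages
```

Categories with no scored benchmark are omitted from the result rather than reported as zero, so a partial evaluation run does not look like a failure.
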
## Citation

```bibtex
@article{FusionFactory,
title={Fusing LLM Capabilities with Routing Data},
author={Tao Feng and Haozhen Zhang and Zijie Lei and Pengrui Han and Mostofa Patwary and Mohammad Shoeybi and Bryan Catanzaro and Jiaxuan You},
journal={arXiv preprint arXiv:2507.10540},
year={2025}
}
```