---
license: apache-2.0
---
# Make Some Noise (MSN) Framework
Implementation of the EMNLP 2024 paper [Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training](https://arxiv.org/pdf/2406.17404).
[[Github]](https://github.com/wyxstriker/MakeSomeNoiseInference)

## Requirements
- Environment: We adopt the same environment as [Spec-Bench](https://github.com/hemingkx/Spec-Bench) to facilitate a fair and consistent evaluation.
- Prepared Models: For ease of testing, we release the weights of both the general-purpose model [[Llama3-8B-MSN](https://huggingface.co/DecoderImmortal/Llama3-8B-MSN)] and the code-specific model [[DeepSeek-Coder-7B-MSN](https://huggingface.co/DecoderImmortal/DeepSeek-Coder-7B-MSN)] trained with MSN, as discussed in the paper; a minimal loading sketch follows below.

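As a convenience, here is a minimal sketch of loading the released weights with the standard Hugging Face `transformers` API. It is an illustration rather than something taken from this repo's scripts; the dtype and device choices are assumptions you may want to adjust.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Either released checkpoint works here (model ids from the links above).
model_id = "DecoderImmortal/Llama3-8B-MSN"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16  # assumed dtype; adjust as needed
).cuda()
```
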
## A minimal implementation of MSN

The MSN framework can be easily integrated into the data preprocessing stage of any training script. The entire noise-addition process is as follows:

```python
import random

# L denotes the noise length hyperparameter, which is typically set to 5.
L = 5
# Each *_ids field holds a list of token ids; the strings below just
# describe what each field contains.
dataset = [
    {"source_ids": "Query prompt.",
     "input_ids": "Concatenation of the query and response.",
     "output_ids": "Copy of input_ids as the label for the LM task."}
]
for sample in dataset:
    source_ids, input_ids = sample["source_ids"], sample["input_ids"]
    start_idx = random.randrange(len(source_ids), len(input_ids) - L)
    for mask_i in range(start_idx, start_idx + L):
        # Noise is added only to the input portion corresponding to the response.
        input_ids[mask_i] = random.choice(input_ids[:mask_i])
```
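
For concreteness, here is a self-contained toy run of the same noise injection over integer token ids. The ids, lengths, and seed are made up purely for illustration:

```python
import random

random.seed(0)
L = 5  # noise length

source_ids = [101, 37, 52, 18]                  # query tokens (made up)
response_ids = [64, 23, 88, 41, 7, 95, 12, 30]  # response tokens (made up)
input_ids = source_ids + response_ids

# Pick a window inside the response span and overwrite it with
# random earlier tokens from the same sequence.
start_idx = random.randrange(len(source_ids), len(input_ids) - L)
for mask_i in range(start_idx, start_idx + L):
    input_ids[mask_i] = random.choice(input_ids[:mask_i])

print(input_ids)  # same length as before, with L noisy positions
```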

## TR-Jacobi

<div align="center">
<img src="./pic/tr-jacobi.png" width="50%"/>
</div>

We demonstrate how to use TR-Jacobi to accelerate the MSN-trained model in ```src/inference_msn.py```.

```python
# Jacobi decoding
spec_res_ids, new_tokens, forward_steps, accept_list = noise_forward(input_ids.cuda(), model, tokenizer, args.max_new_tokens)

print("msn output")
print(tokenizer.decode(spec_res_ids[0]))
# MTA: mean tokens accepted per forward pass
print("#MTA")
print(new_tokens / forward_steps)
print("Accepted Length List")
print(accept_list)

# msn output
# <|begin_of_text|><|start_header_id|>system<|end_header_id|>
# Give me some advices about how to write an academic paper?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
# 1. Start by researching your topic and gathering relevant information. Make sure to take notes and organize your research in a way that makes sense.
# ...
# 8. Submit your paper. Make sure to follow any submission guidelines and make sure to submit your paper on time.<|eot_id|><|eot_id|>.

# #MTA
# 2.2

# Accepted Length List
# [1, 2, 1, 1, 3, 1, 2, 2, 3, 1, 2, 2, 2, 2, 2, 1, 3, 1, 3, 1, 2, 1, 3, 2, 2, 2, 1, 2, 1, 2, 3, 2, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 1, 5, 1, 3, 1, 5, 2, 1, 3, 2, 2, 2, 3, 2, 5, 1, 3, 2, 3, 2, 3, 2, 1, 4, 3, 1, 2, 2, 3, 6, 1, 2, 2, 2, 3, 2, 2, 3, 3, 2, 3, 2, 2, 2, 1, 2, 2, 2, 3, 3, 3, 1, 4, 2, 1, 2, 2, 2]
```

Run ```sh run_case.sh``` to walk through the execution of a single test sample.
The interface of ```noise_forward``` is kept consistent with Spec-Bench.
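
For readers new to Jacobi-style decoding, the sketch below illustrates the plain greedy verify-and-accept loop that TR-Jacobi builds on. It is a conceptual toy, not this repo's implementation: `next_tokens` stands in for one parallel model forward, the draft is redrawn randomly each step (real Jacobi decoding reuses the model's own predictions, and TR-Jacobi additionally uses tree-structured drafts and retrieval), and all ids are made up. MSN's noisy training makes the model robust to such noisy drafts, which is what raises the accepted length in practice.

```python
import random

random.seed(0)
VOCAB = list(range(100))

def next_tokens(ids):
    """Stand-in for one parallel model forward: returns the greedy
    next-token prediction at every position (a toy Markov rule here)."""
    return [(t * 7 + 3) % 100 for t in ids]

def jacobi_step(seq, draft):
    """Verify a draft block with a single forward pass and return the
    accepted tokens (at least one token is always accepted)."""
    preds = next_tokens(seq + draft)
    accepted = [preds[len(seq) - 1]]  # the first new token is always correct
    for i, d in enumerate(draft):
        if d != accepted[-1]:
            break  # draft diverges from the verified sequence; stop here
        accepted.append(preds[len(seq) + i])
    return accepted

seq, steps, new_tokens = [1, 2, 3], 0, 0
while new_tokens < 16:
    draft = [random.choice(VOCAB) for _ in range(4)]  # noisy guess of the future
    acc = jacobi_step(seq, draft)
    seq, steps, new_tokens = seq + acc, steps + 1, new_tokens + len(acc)

print(new_tokens / steps)  # the #MTA analogue: tokens accepted per forward
```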

## Citation
If you find this work useful for your research, please cite our paper:

```
@inproceedings{wang-etal-2024-make,
    title = "Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training",
    author = "Wang, Yixuan and
      Luo, Xianzhen and
      Wei, Fuxuan and
      Liu, Yijun and
      Zhu, Qingfu and
      Zhang, Xuanyu and
      Yang, Qing and
      Xu, Dongliang and
      Che, Wanxiang",
    editor = "Al-Onaizan, Yaser and
      Bansal, Mohit and
      Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.718/",
    doi = "10.18653/v1/2024.emnlp-main.718",
    pages = "12914--12926",
}
```