---
license: apache-2.0
---
# Make Some Noise (MSN) Framework

Implementation of the EMNLP 2024 paper [Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training](https://arxiv.org/pdf/2406.17404).

[[GitHub]](https://github.com/wyxstriker/MakeSomeNoiseInference)

## Requirements

- Environment: We adopt the same environment as [Spec-Bench](https://github.com/hemingkx/Spec-Bench) to facilitate a fair and consistent evaluation.
- Prepared Models: For convenience of testing, we release the weights of both the general-purpose model [[Llama3-8B-MSN](https://huggingface.co/DecoderImmortal/Llama3-8B-MSN)] and the code-specific model [[DeepSeek-Coder-7B-MSN](https://huggingface.co/DecoderImmortal/DeepSeek-Coder-7B-MSN)] trained with MSN as described in the paper.

## A minimal implementation of MSN

The MSN framework can be easily integrated into the data preprocessing stage of any training script. The entire noise-addition process is as follows:

```python
import random

L = 5  # L denotes the noise length hyperparameter, typically set to 5.
dataset = [
    {"source_ids": [...],   # token ids of the query prompt
     "input_ids": [...],    # token ids of the query concatenated with the response
     "output_ids": [...]},  # copy of input_ids, used as the label for the LM task
]
for sample in dataset:
    source_ids, input_ids = sample["source_ids"], sample["input_ids"]
    start_idx = random.randrange(len(source_ids), len(input_ids) - L)
    for mask_i in range(start_idx, start_idx + L):
        # Noise is added only to the portion of input_ids corresponding to the response.
        input_ids[mask_i] = random.choice(input_ids[:mask_i])
```
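To make the preprocessing step concrete, here is a self-contained run of the same noising procedure on a toy token-id sequence. The token ids and the `add_noise` helper are illustrative only, not part of the released code:

```python
import random

def add_noise(source_ids, input_ids, L=5):
    """Replace L consecutive response tokens with tokens sampled
    uniformly from the preceding context, as in MSN preprocessing."""
    noisy = list(input_ids)
    start_idx = random.randrange(len(source_ids), len(noisy) - L)
    for mask_i in range(start_idx, start_idx + L):
        # Only the response portion (after the query) is perturbed.
        noisy[mask_i] = random.choice(noisy[:mask_i])
    return noisy

source_ids = [101, 7592, 2088, 102]  # toy query token ids
response_ids = [2023, 2003, 1037, 7953, 6251, 2007, 2116, 19204, 102]
input_ids = source_ids + response_ids

noisy = add_noise(source_ids, input_ids, L=5)
assert noisy[:len(source_ids)] == source_ids            # the query is never noised
assert sum(a != b for a, b in zip(noisy, input_ids)) <= 5  # at most L positions change
```

Note that `output_ids` is copied before noise is injected, so the labels stay clean and the model learns to predict the correct tokens from a partially noised context.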

## TR-Jacobi

<div align="center">
<img src="./pic/tr-jacobi.png" width="50%"/>
</div>

We demonstrate how to use TR-Jacobi to accelerate an MSN-trained model in `src/inference_msn.py`.

```python
# Jacobi decoding with the MSN-trained model
spec_res_ids, new_tokens, forward_steps, accept_list = noise_forward(
    input_ids.cuda(), model, tokenizer, args.max_new_tokens
)

print("msn output")
print(tokenizer.decode(spec_res_ids[0]))
print("#MTA")  # mean tokens accepted per forward step
print(new_tokens / forward_steps)
print("Accepted Length List")
print(accept_list)

# msn output
# <|begin_of_text|><|start_header_id|>system<|end_header_id|>
# Give me some advices about how to write an academic paper?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
# 1. Start by researching your topic and gathering relevant information. Make sure to take notes and organize your research in a way that makes sense.
# ...
# 8. Submit your paper. Make sure to follow any submission guidelines and make sure to submit your paper on time.<|eot_id|><|eot_id|>.

# #MTA
# 2.2

# Accepted Length List
# [1, 2, 1, 1, 3, 1, 2, 2, 3, 1, 2, 2, 2, 2, 2, 1, 3, 1, 3, 1, 2, 1, 3, 2, 2, 2, 1, 2, 1, 2, 3, 2, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 1, 5, 1, 3, 1, 5, 2, 1, 3, 2, 2, 2, 3, 2, 5, 1, 3, 2, 3, 2, 3, 2, 1, 4, 3, 1, 2, 2, 3, 6, 1, 2, 2, 2, 3, 2, 2, 3, 3, 2, 3, 2, 2, 2, 1, 2, 2, 2, 3, 3, 3, 1, 4, 2, 1, 2, 2, 2]
```
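The #MTA figure printed above is simply `new_tokens / forward_steps`. Assuming the accepted-length list records one entry per forward step, the same number can be recomputed from the list alone (values copied from the sample run above):

```python
# Accepted lengths per forward step, copied from the sample run above.
accept_list = [1, 2, 1, 1, 3, 1, 2, 2, 3, 1, 2, 2, 2, 2, 2, 1, 3, 1, 3, 1, 2, 1, 3, 2, 2, 2, 1, 2, 1, 2, 3, 2, 3, 3, 2, 2, 2, 2, 2, 2, 2, 2, 1, 5, 1, 3, 1, 5, 2, 1, 3, 2, 2, 2, 3, 2, 5, 1, 3, 2, 3, 2, 3, 2, 1, 4, 3, 1, 2, 2, 3, 6, 1, 2, 2, 2, 3, 2, 2, 3, 3, 2, 3, 2, 2, 2, 1, 2, 2, 2, 3, 3, 3, 1, 4, 2, 1, 2, 2, 2]

mta = sum(accept_list) / len(accept_list)
print(f"#MTA recomputed: {mta:.1f}")  # ~2.2, matching the run above
```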

Run `sh run_case.sh` to see the execution process of a test sample.
The interface design of `noise_forward` is kept consistent with Spec-Bench.
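For intuition on the Jacobi part of TR-Jacobi: greedy decoding is a fixed point of a parallel update in which every draft position is recomputed from the tokens currently to its left. The sketch below illustrates that idea with a deterministic toy next-token function standing in for the model's parallel forward pass; everything here is illustrative, and the real strategy (including the MSN-specific parts) lives in `noise_forward`:

```python
def jacobi_decode(prefix, next_token, n_new, max_iters=100):
    """Toy Jacobi decoding: update all draft positions in parallel
    until the draft stops changing (i.e., reaches a fixed point)."""
    draft = [0] * n_new  # arbitrary initial guess for the n_new tokens
    for _ in range(max_iters):
        # One "forward pass": every position is recomputed in parallel
        # from the prefix plus the current draft tokens to its left.
        new_draft = [next_token(prefix + draft[:i]) for i in range(n_new)]
        if new_draft == draft:  # fixed point reached
            break
        draft = new_draft
    return draft

# Deterministic toy "model": next token is the sequence sum mod 7.
def toy_next_token(seq):
    return sum(seq) % 7

out = jacobi_decode([3, 1], toy_next_token, n_new=4)

# The fixed point equals ordinary sequential greedy decoding.
seq = [3, 1]
for _ in range(4):
    seq.append(toy_next_token(seq))
assert out == seq[2:]
```

Each iteration fixes at least one more leading position, so the loop needs at most `n_new` passes in the worst case; acceleration comes from the iterations where several positions stabilize at once, which is what the accepted-length list above measures.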

## Citation

If you find this work useful for your research, please cite our paper:

```bibtex
@inproceedings{wang-etal-2024-make,
    title = "Make Some Noise: Unlocking Language Model Parallel Inference Capability through Noisy Training",
    author = "Wang, Yixuan and
      Luo, Xianzhen and
      Wei, Fuxuan and
      Liu, Yijun and
      Zhu, Qingfu and
      Zhang, Xuanyu and
      Yang, Qing and
      Xu, Dongliang and
      Che, Wanxiang",
    editor = "Al-Onaizan, Yaser and
      Bansal, Mohit and
      Chen, Yun-Nung",
    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.emnlp-main.718/",
    doi = "10.18653/v1/2024.emnlp-main.718",
    pages = "12914--12926",
}
```