How to generate multiple tokens at once?
#14 by Wiselnn - opened
Nice work! I reviewed the repository and noticed that the model is set up to output a single token by default, and that generate.py does not include the multi-token output logic described in the paper. What should I do if I want to validate the effectiveness of multi-token output? Thanks for your help!
This code is missing the self-speculative sampling part. Could you add it?
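For anyone who wants to experiment in the meantime, below is a minimal sketch of greedy self-speculative decoding with multiple prediction heads. It is not the repository's implementation. `next_token_head` and `extra_head` are hypothetical toy stand-ins for the model's heads, and every name here is an assumption made for illustration. The idea it shows: the extra heads draft several future tokens in one step, and the ordinary next-token head then verifies them, keeping only the longest prefix it agrees with.

```python
from typing import List, Tuple

VOCAB_SIZE = 50  # toy vocabulary size (assumption)


def next_token_head(prefix: List[int]) -> int:
    """Toy stand-in for the model's ordinary next-token head (greedy argmax)."""
    return (31 * sum(prefix) + 7 * len(prefix) + 3) % VOCAB_SIZE


def extra_head(prefix: List[int], k: int) -> int:
    """Toy stand-in for the k-th extra head, guessing the token k positions
    after the next one. It is deliberately wrong at some positions."""
    rollout = list(prefix)
    for _ in range(k + 1):
        rollout.append(next_token_head(rollout))
    guess = rollout[-1]
    if (len(prefix) + k) % 3 == 0:  # inject occasional draft errors
        guess = (guess + 1) % VOCAB_SIZE
    return guess


def greedy_decode(prompt: List[int], max_new: int) -> List[int]:
    """Baseline: one token per step using only the next-token head."""
    seq = list(prompt)
    for _ in range(max_new):
        seq.append(next_token_head(seq))
    return seq


def self_speculative_decode(
    prompt: List[int], max_new: int, n_heads: int = 4
) -> Tuple[List[int], int]:
    """Draft n_heads tokens per step, then keep the prefix that the
    next-token head confirms. Returns the sequence and the step count."""
    seq = list(prompt)
    target_len = len(prompt) + max_new
    steps = 0
    while len(seq) < target_len:
        steps += 1
        # Draft: head 0 is the next-token head, so its token is always valid.
        draft = [next_token_head(seq)]
        draft += [extra_head(seq, k) for k in range(1, n_heads)]
        # Verify: a real model would check all drafts in one batched forward
        # pass; here the next-token head is called position by position.
        accepted = [draft[0]]
        for i in range(1, n_heads):
            if next_token_head(seq + accepted) != draft[i]:
                break  # first mismatch: discard the rest of the draft
            accepted.append(draft[i])
        seq.extend(accepted)
    return seq[:target_len], steps


if __name__ == "__main__":
    prompt = [1, 2, 3]
    baseline = greedy_decode(prompt, max_new=12)
    speculative, steps = self_speculative_decode(prompt, max_new=12)
    print("outputs match:", speculative == baseline)
    print("decoding steps:", steps, "vs", 12, "for plain greedy")
```

Because every drafted token is checked against the next-token head, the output is identical to plain greedy decoding; any speedup comes only from accepting more than one token per step. To validate multi-token output on the real model, one would replace the two toy heads with the model's actual heads and compare the accepted-tokens-per-step rate.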