Fix: add <eos> token at the end of chat template

#12
by adhi29 - opened

When we try to generate training data for instruction fine-tuning of the CodeGemma instruct model using the "apply_chat_template" function, the Jinja template does not add the <eos> token at the end of the rendered conversation. As a result, the model either never learns, or actively unlearns, when to emit the end-of-sequence token.
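A minimal sketch of how to reproduce this, assuming the "google/codegemma-7b-it" checkpoint (substitute whichever CodeGemma instruct repo you are fine-tuning):

```python
from transformers import AutoTokenizer

# Repo id is an assumption; use the CodeGemma instruct checkpoint you actually fine-tune.
tokenizer = AutoTokenizer.from_pretrained("google/codegemma-7b-it")

conversation = [
    {"role": "user", "content": "Write a function that reverses a string."},
    {"role": "assistant", "content": "def reverse(s):\n    return s[::-1]"},
]

# Render the conversation as training text: no generation prompt appended.
text = tokenizer.apply_chat_template(
    conversation,
    tokenize=False,
    add_generation_prompt=False,
)

# With the original template this prints False: the rendered text ends with
# "<end_of_turn>\n" but never with <eos>, so SFT examples never contain it.
print(text.endswith(tokenizer.eos_token))
```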

The expected behavior of the chat template is to produce training-ready text, i.e. text terminated with <eos>, when "add_generation_prompt" = False and "continue_final_message" = False. That is not the case here. The updated tokenizer_config.json in this pull request fixes that problem.
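For illustration, the kind of change involved looks roughly like the sketch below: a Gemma-style template that appends the eos_token when no generation prompt is requested. This is an assumption-laden sketch, not the exact Jinja shipped in this PR's tokenizer_config.json.

```python
# Continues from the snippet above (reuses the same `tokenizer` object).
chat_template = (
    "{{ bos_token }}"
    "{% for message in messages %}"
    "{% set role = 'model' if message['role'] == 'assistant' else message['role'] %}"
    "{{ '<start_of_turn>' + role + '\\n' }}"
    "{{ message['content'] | trim }}"
    "{{ '<end_of_turn>\\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}"
    "{{ '<start_of_turn>model\\n' }}"
    "{% else %}"
    "{{ eos_token }}"  # the fix: training examples now end with <eos>
    "{% endif %}"
)

# Override the template in memory; with this, the same apply_chat_template
# call as above (add_generation_prompt=False) ends the rendered text with <eos>.
tokenizer.chat_template = chat_template
```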

adhi29 changed pull request title from "Upload tokenizer_config.json" to "Fix: add <eos> token at the end of chat template"
Ready to merge
This branch is ready to get merged automatically.