What is that instruction template?

by SerialKicked - opened 29 days ago

29 days ago

What is that instruction template? It makes very little sense. Your model has ChatML being fully tokenized but you don't even use it, instead you use non tokenized markers. It has only 4096 context length AND you're wasting half on it on the instruction template? I don't get it.

amanrangapur

Ai2 org 23 days ago

Hey @SerialKicked , the current template comes from Tulu 1 and doesn’t use custom chat tokens to avoid modifying the tokenizer during training, which keeps things simpler. While the template is lightweight (~5 tokens per turn), we’re open to exploring optimizations, including custom chat tokens, in the future.

natolambert changed discussion status to closed 16 days ago

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment