Recommended Usage
#24 opened by yarnsp
It's unclear whether the Recommended Usage section applies to the distilled
models. Also, is the templating syntax the same as R1's, or does it follow
the base model each distilled model was built on? That is, do we use the
llama3 and qwen2.5 templates, or the deepseek-r1 templates?