dmariko
/
SmolLM-1.7B-Instruct-dpo-16k
TensorBoard
Safetensors
English
llama
trl
dpo
Generated from Trainer
License:
cc-by-nc-4.0
Commit History
Update README.md
284fcaa
verified
dmariko
committed on
Sep 12, 2024
Upload tokenizer
5cd1246
verified
dmariko
committed on
Sep 12, 2024
Upload LlamaForCausalLM
00e8ac3
verified
dmariko
committed on
Sep 12, 2024
SmolLM-1.7B-Instruct-dpo-16k
7511c1d
verified
dmariko
committed on
Sep 12, 2024
initial commit
871483c
verified
dmariko
committed on
Sep 12, 2024