Llama-3.2-3B-DPO-Math - an RLHF-And-Friends Collection

Llama-3.2-3B-DPO-Math

Updated Nov 8

RLHF-And-Friends/Llama-3.2-3B-Instruct-DPO-Math

Text Generation • Updated Nov 8 • 409

RLHF-And-Friends/Llama-3.2-3B-Instruct-BnB-4bit-DPO-Math-SF

Text Generation • Updated Nov 8 • 7

RLHF-And-Friends/Llama-3.2-3B-Instruct-BnB-4bit-DPO-Math

Updated Nov 8 • 3