Active filters: dpo
sfulay/zephyr-7b-dpo-full-gpt_consistent-reward-scale-1-rpo-gamma-05
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-coding-group-dpo • Text Generation • 8
mradermacher/zephyr-7b-hh-dpo-i1-GGUF • 246
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-dpo-2 • Text Generation • 6
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-coding-dpo-2 • Text Generation • 8
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-math-coding-dpo-2 • Text Generation • 6
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-logic-dpo-2 • Text Generation • 4
NicholasCorrado/tinyllama-1.1b-chat-v1.0-ui-dpo-2 • Text Generation • 8
tsavage68/Na_L3_1000steps_1e6rate_03beta_cSFTDPO • Text Generation • 6
NicholasCorrado/tinyllama-1.1b-chat-v1.0-rlced-conifer-3-1-dpo • Text Generation • 4
NicholasCorrado/tulu-2-7b-rlced-conifer-dpo • Text Generation • 7
tsavage68/Na_L3_1000steps_1e6rate_01beta_cSFTDPO • Text Generation • 5
NanQiangHF/llama3.1_8b_dpo_bwgenerator
tsavage68/Na_L3_150steps_1e6rate_01beta_cSFTDPO • Text Generation • 4
tsavage68/Na_L3_100steps_1e6rate_03beta_cSFTDPO • Text Generation • 4
NicholasCorrado/zephyr-7b-uf-rlced-conifer-dpo-2e • Text Generation • 9
tsavage68/Na_L3_1000steps_1e6rate_05beta_cSFTDPO • Text Generation • 6
tsavage68/Na_L3_100steps_1e6rate_05beta_cSFTDPO • Text Generation • 4
CultriX/Lama-DPOlphin-8B-Q3_K_S-GGUF • Text Generation • 11 • 1
CultriX/Lama-DPOlphin-8B-Q4_K_S-GGUF • Text Generation • 11 • 1
QuantFactory/Fireball-3.1-8B-ORPO-GGUF • Text Generation • 36 • 2
mradermacher/Lama-DPOlphin-8B-GGUF • 247 • 1
tsavage68/Na_L3_1000steps_1e7rate_01beta_cSFTDPO • Text Generation • 4
tsavage68/Na_L3_1000steps_1e7rate_03beta_cSFTDPO • Text Generation • 4
tsavage68/Na_L3_350steps_1e7rate_01beta_cSFTDPO • Text Generation • 7
tsavage68/Na_L3_250steps_1e7rate_03beta_cSFTDPO • Text Generation • 4
tsavage68/Na_L3_1000steps_1e7rate_05beta_cSFTDPO • Text Generation • 4
tsavage68/Na_L3_350steps_1e7rate_05beta_cSFTDPO • Text Generation • 4
mradermacher/Lama-DPOlphin-8B-i1-GGUF • 274 • 1
tsavage68/Na_M2_1000steps_1e7rate_01beta_cSFTDPO • Text Generation • 8