RLHF-And-Friends
community
AI & ML interests
None defined yet.
Collections
4
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-a0
Text Generation • Updated • 22
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-a1
Text Generation • Updated • 17
RLHF-And-Friends/FedPPO-Isolated-Pythia-70M-a0
Text Generation • Updated • 16
RLHF-And-Friends/FedPPO-Isolated-Pythia-70M-a1
Text Generation • Updated • 22
Models
30
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-test-a1
Text Generation • Updated
RLHF-And-Friends/FedPPO-Collaborative-Pythia-70M-test-a0
Text Generation • Updated
RLHF-And-Friends/FedPPO-LLama-3.2-1B-Instruct-A0
Updated
RLHF-And-Friends/Llama-3.2-1B-Instruct-PPO-ultrachat_200k-LoRA-8
Updated
RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward-2r
Updated
RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward-LoRA8r
Updated
RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward-4r
Updated
RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward-16r
Updated
RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward-8r
Updated
RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward
Updated
Datasets
None public yet