This is a pre-trained model fine-tuned with Reinforcement Learning on the DIALOCONAN dataset, using facebook/roberta-hate-speech-dynabench-r4-target as the reward model.
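As a rough illustration of that setup, here is a minimal sketch of a single PPO step, assuming TRL's classic `PPOTrainer` API (trl < 0.12). The base checkpoint name, the placeholder dialogue prompt, and all hyperparameters are assumptions for illustration, not the exact training recipe.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from trl import AutoModelForCausalLMWithValueHead, PPOConfig, PPOTrainer

base = "togethercomputer/RedPajama-INCITE-Chat-3B-v1"  # assumed base checkpoint
tok = AutoTokenizer.from_pretrained(base)
tok.pad_token = tok.eos_token  # GPT-NeoX tokenizers ship without a pad token
policy = AutoModelForCausalLMWithValueHead.from_pretrained(base)

# Reward model: P("nothate") from the hate-speech classifier is the scalar reward.
rm_name = "facebook/roberta-hate-speech-dynabench-r4-target"
rm_tok = AutoTokenizer.from_pretrained(rm_name)
rm = AutoModelForSequenceClassification.from_pretrained(rm_name)

def reward(text: str) -> torch.Tensor:
    """Score a generated continuation; higher means less hateful."""
    with torch.no_grad():
        logits = rm(**rm_tok(text, truncation=True, return_tensors="pt")).logits
    return logits.softmax(-1)[0, rm.config.label2id["nothate"]]

ppo = PPOTrainer(PPOConfig(batch_size=1, mini_batch_size=1), policy, tokenizer=tok)

# One PPO step on a placeholder DIALOCONAN-style dialogue turn (hypothetical prompt).
query = tok.encode("HS: <hate speech turn>\nCN:", return_tensors="pt")[0]
response = ppo.generate(query, return_prompt=False, max_new_tokens=48)[0]
ppo.step([query], [response], [reward(tok.decode(response))])
```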
Toxicity results on the allenai/real-toxicity-prompts dataset using custom prompts (see RewardLM for details).
| Toxicity Level | RedPajama-INCITE-Chat-3B |
|---|---|
| Pre-Trained | 0.217 |
| Fine-Tuned | 0.129 |
| RL | 0.160 |
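The reported scores come from RewardLM's own evaluation pipeline, so the snippet below is only a sketch of what such a measurement looks like: sample prompts from allenai/real-toxicity-prompts, generate continuations, and average the classifier's hate probability as a toxicity proxy. The checkpoint name, sample size, and generation settings are illustrative assumptions.

```python
from datasets import load_dataset
from transformers import pipeline

prompts = load_dataset("allenai/real-toxicity-prompts", split="train")
generator = pipeline("text-generation", model="togethercomputer/RedPajama-INCITE-Chat-3B-v1")  # assumed checkpoint
scorer = pipeline("text-classification", model="facebook/roberta-hate-speech-dynabench-r4-target")

scores = []
for row in prompts.select(range(100)):  # small sample, purely for illustration
    text = row["prompt"]["text"]
    out = generator(text, max_new_tokens=32, do_sample=True)[0]["generated_text"]
    continuation = out[len(text):]  # score only the model's continuation
    pred = scorer(continuation)[0]
    # P("hate") is the toxicity proxy; take the complement of "nothate" otherwise.
    scores.append(pred["score"] if pred["label"] == "hate" else 1.0 - pred["score"])

print(f"mean toxicity over sample: {sum(scores) / len(scores):.3f}")
```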