Release of Training Data?

#15
by RylanSchaeffer - opened

The Model Card reads: "We use a mix of prompts that come from the Anthropic dataset and redteaming examples that we have collected in house, in a separate process from our production redteaming. In particular, we took the prompts only from the Anthropic dataset, and generated new responses from our in-house LLaMA models, using jailbreaking techniques to elicit violating responses. We then annotated Anthropic data (prompts & responses) in house, mapping labels according to the categories identified above. Overall we have ~13K training examples."

Is there a plan to release this dataset? I'd like to use it for a research project.
