None defined yet.
Evaluate whether a user prompt is on-topic for a given system prompt
Multimodal search & retrieval-based biodiversity recognition
Evaluate system prompt leakage in LLM output