Compact Taxonomy with Hierarchical Levels
A large-scale LLM safety evaluation with hierarchical taxonomy, including 6 domain-level categories, 16 task-level categories, and 65 detailed classes.
Enhanced Difficulty and Complexity
We propose the enhancement procedure to enhance the difficulty and complexity, including attack-enhanced subset, defense-enhanced subset, and multiple-choice questions subset.
Reliable and Seamless Evaluator
For QA pairs, we propose the MD-Judge, a fine-tuned LLM designed to assess question-answer pairs. And for MCQs, we propose the MCQ-Judge, which employs in-context learning and regex parsing to identify unsafe choices from the options provided.
Joint-Purpose Utility
Comprehensive evaluation for LLMs, attacking methods and defense methods.
Leaderboard
