gregH committed on
Commit b928cb0 · verified · 1 Parent(s): 8b5d257

Update index.html

Files changed (1)
  1. index.html +6 -1
index.html CHANGED
@@ -171,7 +171,7 @@ We provide more details about the running flow of Gradient Cuff in the paper.
171
 
172
  <h2 id="demonstration">Demonstration</h2>
173
  <p>We evaluated Gradient Cuff as well as 4 baselines (Perplexity Filter, SmoothLLM, Erase-and-Check, and Self-Reminder)
174
- against 6 different jailbreak attacks (<a href=“#tabs-1">GCG</a>, AutoDAN, PAIR, TAP, Base64, and LRL) and benign user queries on 2 LLMs (LLaMA-2-7B-Chat and
175
  Vicuna-7B-V1.5). We below demonstrate the average refusal rate across these 6 malicious user query datasets as the Average Malicious Refusal
176
  Rate and the refusal rate on benign user queries as the Benign Refusal Rate. The defending performance against different jailbreak types is
177
  shown in the provided bar chart.
@@ -223,6 +223,11 @@ We provide more details about the running flow of Gradient Cuff in the paper.
223
  </div>
224
  We summarized some key points of the mentioned jailbreak attacks or defenses in the below tables.
225
  <div id="tabs">
 
 
 
 
 
226
  <div id="tabs-1">
227
  <p>Proin elit arcu, rutrum commodo, vehicula tempus, commodo a, risus. Curabitur nec arcu. Donec sollicitudin mi sit amet mauris. Nam elementum quam ullamcorper ante. Etiam aliquet massa et lorem. Mauris dapibus lacus auctor risus. Aenean tempor ullamcorper leo. Vivamus sed magna quis ligula eleifend adipiscing. Duis orci. Aliquam sodales tortor vitae ipsum. Aliquam nulla. Duis aliquam molestie erat. Ut et mauris vel pede varius sollicitudin. Sed ut dolor nec orci tincidunt interdum. Phasellus ipsum. Nunc tristique tempus lectus.</p>
228
  </div>
 
171
 
172
  <h2 id="demonstration">Demonstration</h2>
173
  <p>We evaluated Gradient Cuff as well as 4 baselines (Perplexity Filter, SmoothLLM, Erase-and-Check, and Self-Reminder)
174
+ against 6 different jailbreak attacks (GCG, AutoDAN, PAIR, TAP, Base64, and LRL) and benign user queries on 2 LLMs (LLaMA-2-7B-Chat and
175
  Vicuna-7B-V1.5). We below demonstrate the average refusal rate across these 6 malicious user query datasets as the Average Malicious Refusal
176
  Rate and the refusal rate on benign user queries as the Benign Refusal Rate. The defending performance against different jailbreak types is
177
  shown in the provided bar chart.
 
223
  </div>
224
  We summarized some key points of the mentioned jailbreak attacks or defenses in the below tables.
225
  <div id="tabs">
226
+ <ul>
227
+ <li><a href="#tabs-1">Nunc tincidunt</a></li>
228
+ <li><a href="#tabs-2">Proin dolor</a></li>
229
+ <li><a href="#tabs-3">Aenean lacinia</a></li>
230
+ </ul>
231
  <div id="tabs-1">
232
  <p>Proin elit arcu, rutrum commodo, vehicula tempus, commodo a, risus. Curabitur nec arcu. Donec sollicitudin mi sit amet mauris. Nam elementum quam ullamcorper ante. Etiam aliquet massa et lorem. Mauris dapibus lacus auctor risus. Aenean tempor ullamcorper leo. Vivamus sed magna quis ligula eleifend adipiscing. Duis orci. Aliquam sodales tortor vitae ipsum. Aliquam nulla. Duis aliquam molestie erat. Ut et mauris vel pede varius sollicitudin. Sed ut dolor nec orci tincidunt interdum. Phasellus ipsum. Nunc tristique tempus lectus.</p>
233
  </div>