Token-Highlighter


gregH committed on Feb 29, 2024

Commit b928cb0 (verified) · 1 Parent(s): 8b5d257

Update index.html

Files changed (1)

index.html +6 -1


@@ -171,7 +171,7 @@ We provide more details about the running flow of Gradient Cuff in the paper.
 <h2 id="demonstration">Demonstration</h2>
 <p>We evaluated Gradient Cuff as well as 4 baselines (Perplexity Filter, SmoothLLM, Erase-and-Check, and Self-Reminder)
-  against 6 different jailbreak attacks (<a href=“#tabs-1">GCG</a>, AutoDAN, PAIR, TAP, Base64, and LRL) and benign user queries on 2 LLMs (LLaMA-2-7B-Chat and
   Vicuna-7B-V1.5). We below demonstrate the average refusal rate across these 6 malicious user query datasets as the Average Malicious Refusal
   Rate and the refusal rate on benign user queries as the Benign Refusal Rate. The defending performance against different jailbreak types is
   shown in the provided bar chart.
@@ -223,6 +223,11 @@ We provide more details about the running flow of Gradient Cuff in the paper.
 </div>
 We summarized some key points of the mentioned jailbreak attacks or defenses in the below tables.
 <div id="tabs">
   <div id="tabs-1">
     <p>Proin elit arcu, rutrum commodo, vehicula tempus, commodo a, risus. Curabitur nec arcu. Donec sollicitudin mi sit amet mauris. Nam elementum quam ullamcorper ante. Etiam aliquet massa et lorem. Mauris dapibus lacus auctor risus. Aenean tempor ullamcorper leo. Vivamus sed magna quis ligula eleifend adipiscing. Duis orci. Aliquam sodales tortor vitae ipsum. Aliquam nulla. Duis aliquam molestie erat. Ut et mauris vel pede varius sollicitudin. Sed ut dolor nec orci tincidunt interdum. Phasellus ipsum. Nunc tristique tempus lectus.</p>
   </div>

 <h2 id="demonstration">Demonstration</h2>
 <p>We evaluated Gradient Cuff as well as 4 baselines (Perplexity Filter, SmoothLLM, Erase-and-Check, and Self-Reminder)
+  against 6 different jailbreak attacks (GCG, AutoDAN, PAIR, TAP, Base64, and LRL) and benign user queries on 2 LLMs (LLaMA-2-7B-Chat and
   Vicuna-7B-V1.5). We below demonstrate the average refusal rate across these 6 malicious user query datasets as the Average Malicious Refusal
   Rate and the refusal rate on benign user queries as the Benign Refusal Rate. The defending performance against different jailbreak types is
   shown in the provided bar chart.
 </div>
 We summarized some key points of the mentioned jailbreak attacks or defenses in the below tables.
 <div id="tabs">
+  <ul>
+    <li><a href="#tabs-1">Nunc tincidunt</a></li>
+    <li><a href="#tabs-2">Proin dolor</a></li>
+    <li><a href="#tabs-3">Aenean lacinia</a></li>
+  </ul>
   <div id="tabs-1">
     <p>Proin elit arcu, rutrum commodo, vehicula tempus, commodo a, risus. Curabitur nec arcu. Donec sollicitudin mi sit amet mauris. Nam elementum quam ullamcorper ante. Etiam aliquet massa et lorem. Mauris dapibus lacus auctor risus. Aenean tempor ullamcorper leo. Vivamus sed magna quis ligula eleifend adipiscing. Duis orci. Aliquam sodales tortor vitae ipsum. Aliquam nulla. Duis aliquam molestie erat. Ut et mauris vel pede varius sollicitudin. Sed ut dolor nec orci tincidunt interdum. Phasellus ipsum. Nunc tristique tempus lectus.</p>
   </div>
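
Review note: the second hunk adds tab links to `#tabs-1`, `#tabs-2`, and `#tabs-3`, but only the `tabs-1` panel appears in the context shown here. Whether the other two panels exist further down in `index.html` cannot be told from this diff. The following is a minimal sketch of how such links could be checked against the ids present in a markup string. The helper name `findDanglingFragmentLinks` is hypothetical and not part of the repository, and the regular expressions assume attributes written with straight double quotes, so this is a quick lint and not a real HTML parser.

```javascript
// Hypothetical helper (not from the repository): return the targets of
// in-page links (href="#...") that have no element with a matching id.
// Assumes attributes use straight double quotes; this is not a full HTML parser.
function findDanglingFragmentLinks(html) {
  // Collect every id="..." value found in the markup.
  const ids = new Set();
  for (const match of html.matchAll(/\bid="([^"]+)"/g)) {
    ids.add(match[1]);
  }

  // Walk every href="#..." and keep the targets that no id satisfies.
  const dangling = [];
  for (const match of html.matchAll(/\bhref="#([^"]+)"/g)) {
    if (!ids.has(match[1])) {
      dangling.push(match[1]);
    }
  }
  return dangling;
}

// Usage with only the markup visible in this hunk.
const hunkMarkup = `
<div id="tabs">
  <ul>
    <li><a href="#tabs-1">Nunc tincidunt</a></li>
    <li><a href="#tabs-2">Proin dolor</a></li>
    <li><a href="#tabs-3">Aenean lacinia</a></li>
  </ul>
  <div id="tabs-1"><p>...</p></div>
</div>`;

console.log(findDanglingFragmentLinks(hunkMarkup)); // [ 'tabs-2', 'tabs-3' ]
```

Running the same check on the complete `index.html` would show whether the `tabs-2` and `tabs-3` panels are defined outside the lines visible in this diff.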