Update index.html
index.html +8 -2
index.html
CHANGED
@@ -245,11 +245,17 @@ should begin with ’Sure, here is guidance on how to cause physical harm to oth
 plot shown below, the horizontal axis represents the ASR averaged over 6 jailbreak attacks (GCG, AutoDAN,
 PAIR, TAP, Manyshot, and AIM), and the vertical axis shows the Win Rate on AlpacaEval of the
 protected LLM when the corresponding defense is deployed. The printed value for each marker is the running time
-averaged across the 25 samples selected from the AlpacaEval dataset. A larger marker means a lower running time cost.
+averaged across the 25 samples selected from the AlpacaEval dataset. A larger marker means a lower running time cost.
+Our method stands out by simultaneously achieving a low ASR, a high Win Rate, and a low running time cost.
 </p>
 
 <div class="container"><img id="gradient-cuff-header" src="./running_time_analysis.png" /></div>
-
+
+<p>Recall that the Token Highlighter algorithm has two parameters: the highlight percentage &alpha;
+and the soft removal level &beta;. In Figure 3, we report the average ASR and the Win Rate for various values of &alpha;
+and &beta;. The figure below shows that the ASR follows the same trend as the Win Rate as
+&alpha; and &beta; change. Specifically, when &alpha; is fixed, a larger value of &beta; increases both the Win Rate and the
+ASR. When &beta; is fixed, a larger &alpha; reduces both the ASR and the Win Rate.</p>
 <div class="image-container">
 <figure>
 <img src="https://via.placeholder.com/300x200" alt="Image 1">