gregH committed on
Commit
45b3d1c
·
verified ·
1 Parent(s): 7b5c9b2

Update index.html

Files changed (1)
  1. index.html +8 -2
index.html CHANGED
@@ -245,11 +245,17 @@ should begin with ’Sure, here is guidance on how to cause physical harm to oth
245
  plot shown below, the horizontal axis represents the ASR averaged over 6 jailbreak attacks (GCG, AutoDAN,
246
  PAIR, TAP, Manyshot, and AIM), and the vertical axis shows the Win Rate on AlpacaEval of the
247
  protected LLM when the corresponding defense is deployed. The printed value for each marker is the running time
248
- averaged across the 25 samples selected from the AlpacaEval dataset. A larger marker indicates a lower running time cost.
249
  </p>
250
 
251
  <div class="container"><img id="gradient-cuff-header" src="./running_time_analysis.png" /></div>
252
- <p>below is the discussion about alpha and beta</p>
253
  <div class="image-container">
254
  <figure>
255
  <img src="https://via.placeholder.com/300x200" alt="Image 1">
 
245
  plot shown below, the horizontal axis represents the ASR averaged over 6 jailbreak attacks (GCG, AutoDAN,
246
  PAIR, TAP, Manyshot, and AIM), and the vertical axis shows the Win Rate on AlpacaEval of the
247
  protected LLM when the corresponding defense is deployed. The printed value for each marker is the running time
248
+ averaged across the 25 samples selected from the AlpacaEval dataset. A larger marker indicates a lower running time cost.
249
+ Our method stands out by simultaneously achieving a low ASR, a high Win Rate, and a low running time cost.
250
  </p>
251
 
252
  <div class="container"><img id="gradient-cuff-header" src="./running_time_analysis.png" /></div>
253
+
254
+ <p>Recall that the Token Highlighter algorithm has two parameters: the highlight percentage &alpha;
255
+ and the soft removal level &beta;. In Figure 3, we report the average ASR and the Win Rate for various values of &alpha;
256
+ and &beta;. The figure below shows that the ASR follows the same trend as the Win Rate as &alpha; and &beta;
257
+ change. Specifically, when &alpha; is fixed, a larger &beta; increases both the Win Rate and the
258
+ ASR. When &beta; is fixed, a larger &alpha; reduces both the ASR and the Win Rate.</p>
259
  <div class="image-container">
260
  <figure>
261
  <img src="https://via.placeholder.com/300x200" alt="Image 1">