Update index.html
index.html +20 -10
index.html
CHANGED
@@ -29,18 +29,23 @@
|
|
29 |
<p><sup>1</sup> Huawei Noah's Ark Lab,
|
30 |
<sup>2</sup> University of Liverpool,
|
31 |
<sup>3</sup> King's College London</p>
|
32 |
-
<p><a
|
33 |
-
<a
|
34 |
-
<a
|
35 |
-
<a
|
36 |
-
<a
|
37 |
-
<a
|
38 |
</div>
|
39 |
|
40 |
-
<figure class="image text-center">
|
41 |
-
<img alt="APA activation" src="https://huggingface.co/spaces/konsa15/AGLU/resolve/main/assets/unified_activations_combined.jpg">
|
42 |
-
<figcaption> Figure 1: APA unifies most activation functions under the same formula.</figcaption>
|
43 |
-
</figure>
|
44 |
|
45 |
<h3>Abstract</h3>
|
46 |
|
@@ -50,10 +55,15 @@
|
|
50 |
<p>The Adaptive Parametric Activation APA is defined as: <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>A</mi><mi>P</mi><mi>A</mi><mo stretchy="false">(</mo><mi>z</mi><mo separator="true">,</mo><mi>λ</mi><mo separator="true">,</mo><mi>κ</mi><mo stretchy="false">)</mo><mo>=</mo><mo stretchy="false">(</mo><mi>λ</mi><mi>e</mi><mi>x</mi><mi>p</mi><mo stretchy="false">(</mo><mo>−</mo><mi>κ</mi><mi>z</mi><mo stretchy="false">)</mo><mo>+</mo><mn>1</mn><msup><mo stretchy="false">)</mo><mfrac><mn>1</mn><mrow><mo>−</mo><mi>λ</mi></mrow></mfrac></msup></mrow>APA(z,λ,κ) = (λ exp(−κz) + 1) ^{\frac{1}{−λ}}</math></span><span aria-hidden="true" class="katex-html"><span class="base"><span style="height:1em;vertical-align:-0.25em;" class="strut"></span><span class="mord mathnormal">A</span><span style="margin-right:0.13889em;" class="mord mathnormal">P</span><span class="mord mathnormal">A</span><span class="mopen">(</span><span style="margin-right:0.04398em;" class="mord mathnormal">z</span><span class="mpunct">,</span><span style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">λ</span><span class="mpunct">,</span><span style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">κ</span><span class="mclose">)</span><span style="margin-right:0.2778em;" class="mspace"></span><span class="mrel">=</span><span style="margin-right:0.2778em;" class="mspace"></span></span><span class="base"><span style="height:1em;vertical-align:-0.25em;" class="strut"></span><span class="mopen">(</span><span class="mord mathnormal">λ</span><span class="mord mathnormal">e</span><span class="mord mathnormal">x</span><span class="mord mathnormal">p</span><span class="mopen">(</span><span class="mord">−</span><span class="mord mathnormal">κ</span><span style="margin-right:0.04398em;" class="mord mathnormal">z</span><span class="mclose">)</span><span style="margin-right:0.2222em;" 
class="mspace"></span><span class="mbin">+</span><span style="margin-right:0.2222em;" class="mspace"></span></span><span class="base"><span style="height:1.2312em;vertical-align:-0.25em;" class="strut"></span><span class="mord">1</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span style="height:0.9812em;" class="vlist"><span style="top:-3.3902em;margin-right:0.05em;"><span style="height:3em;" class="pstrut"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mopen nulldelimiter sizing reset-size3 size6"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span style="height:0.8443em;" class="vlist"><span style="top:-2.656em;"><span style="height:3em;" class="pstrut"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mtight">−</span><span class="mord mathnormal mtight">λ</span></span></span></span><span style="top:-3.2255em;"><span style="height:3em;" class="pstrut"></span><span style="border-bottom-width:0.049em;" class="frac-line mtight"></span></span><span style="top:-3.384em;"><span style="height:3em;" class="pstrut"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span style="height:0.4035em;" class="vlist"><span></span></span></span></span></span><span class="mclose nulldelimiter sizing reset-size3 size6"></span></span></span></span></span></span></span></span></span></span></span></span></span>. APA unifies most activation functions under the same formula, as shown in Figure 1.</p>
|
51 |
<p>APA can be used inside the intermediate layers via the Adaptive Generalised Linear Unit (AGLU): <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>A</mi><mi>G</mi><mi>L</mi><mi>U</mi><mo stretchy="false">(</mo><mi>z</mi><mo separator="true">,</mo><mi>λ</mi><mo separator="true">,</mo><mi>κ</mi><mo stretchy="false">)</mo><mo>=</mo><mi>z</mi><mi>A</mi><mi>P</mi><mi>A</mi><mo stretchy="false">(</mo><mi>z</mi><mo separator="true">,</mo><mi>λ</mi><mo separator="true">,</mo><mi>κ</mi><mo stretchy="false">)</mo></mrow>AGLU(z,λ,κ) = z APA(z,λ,κ)</math></span><span aria-hidden="true" class="katex-html"><span class="base"><span style="height:1em;vertical-align:-0.25em;" class="strut"></span><span class="mord mathnormal">A</span><span class="mord mathnormal">G</span><span style="margin-right:0.10903em;" class="mord mathnormal">LU</span><span class="mopen">(</span><span style="margin-right:0.04398em;" class="mord mathnormal">z</span><span class="mpunct">,</span><span style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">λ</span><span class="mpunct">,</span><span style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">κ</span><span class="mclose">)</span><span style="margin-right:0.2778em;" class="mspace"></span><span class="mrel">=</span><span style="margin-right:0.2778em;" class="mspace"></span></span><span class="base"><span style="height:1em;vertical-align:-0.25em;" class="strut"></span><span style="margin-right:0.04398em;" class="mord mathnormal">z</span><span class="mord mathnormal">A</span><span style="margin-right:0.13889em;" class="mord mathnormal">P</span><span class="mord mathnormal">A</span><span class="mopen">(</span><span style="margin-right:0.04398em;" class="mord mathnormal">z</span><span class="mpunct">,</span><span style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">λ</span><span class="mpunct">,</span><span 
style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">κ</span><span class="mclose">)</span></span></span></span>.
|
52 |
The derivatives of AGLU with respect to κ (top), λ (middle) and z (bottom) are shown in Figure 2:</p>
|
53 |
<figure class="image text-center">
|
54 |
<img alt="AGLU derivatives" src="https://huggingface.co/spaces/konsa15/AGLU/resolve/main/assets/derivative_visualisations.jpg">
|
55 |
<figcaption> Figure 2: The derivatives of AGLU with respect to κ (top), λ (middle) and z (bottom).</figcaption>
|
56 |
</figure>
|
57 |
|
58 |
|
59 |
<h3> Simple Code implementation </h3>
|
|
|
29 |
<p><sup>1</sup> Huawei Noah's Ark Lab,
|
30 |
<sup>2</sup> University of Liverpool,
|
31 |
<sup>3</sup> King's College London</p>
|
32 |
+
<p><a href="https://link.springer.com/chapter/10.1007/978-3-031-72949-2_26"><img alt="Static Badge" src="https://img.shields.io/badge/ECCV_2024-APA-blue"></a>
|
33 |
+
<a href="https://arxiv.org/pdf/2407.08567"><img alt="Static Badge" src="https://img.shields.io/badge/arxiv-2407.08567-blue"></a>
|
34 |
+
<a href="https://paperswithcode.com/sota/long-tail-learning-on-places-lt?p=adaptive-parametric-activation"><img alt="PWC" src="https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/adaptive-parametric-activation/long-tail-learning-on-places-lt"></a>
|
35 |
+
<a href="https://paperswithcode.com/sota/instance-segmentation-on-lvis-v1-0-val?p=adaptive-parametric-activation"><img alt="PWC" src="https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/adaptive-parametric-activation/instance-segmentation-on-lvis-v1-0-val"></a>
|
36 |
+
<a href="https://paperswithcode.com/sota/long-tail-learning-on-imagenet-lt?p=adaptive-parametric-activation"><img alt="PWC" src="https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/adaptive-parametric-activation/long-tail-learning-on-imagenet-lt"></a>
|
37 |
+
<a href="https://paperswithcode.com/sota/long-tail-learning-on-inaturalist-2018?p=adaptive-parametric-activation"><img alt="PWC" src="https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/adaptive-parametric-activation/long-tail-learning-on-inaturalist-2018"></a></p>
|
38 |
+
</div>
|
39 |
+
|
40 |
+
<div class="row">
|
41 |
+
<div class="col-sm">
|
42 |
+
<figure class="image text-center">
|
43 |
+
<img alt="APA activation" src="https://huggingface.co/spaces/konsa15/AGLU/resolve/main/assets/unified_activations_combined.jpg">
|
44 |
+
<figcaption> Figure 1: APA unifies most activation functions under the same formula.</figcaption>
|
45 |
+
</figure>
|
46 |
+
</div>
|
47 |
</div>
|
48 |
|
49 |
|
50 |
<h3>Abstract</h3>
|
51 |
|
|
|
55 |
<p>The Adaptive Parametric Activation APA is defined as: <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>A</mi><mi>P</mi><mi>A</mi><mo stretchy="false">(</mo><mi>z</mi><mo separator="true">,</mo><mi>λ</mi><mo separator="true">,</mo><mi>κ</mi><mo stretchy="false">)</mo><mo>=</mo><mo stretchy="false">(</mo><mi>λ</mi><mi>e</mi><mi>x</mi><mi>p</mi><mo stretchy="false">(</mo><mo>−</mo><mi>κ</mi><mi>z</mi><mo stretchy="false">)</mo><mo>+</mo><mn>1</mn><msup><mo stretchy="false">)</mo><mfrac><mn>1</mn><mrow><mo>−</mo><mi>λ</mi></mrow></mfrac></msup></mrow>APA(z,λ,κ) = (λ exp(−κz) + 1) ^{\frac{1}{−λ}}</math></span><span aria-hidden="true" class="katex-html"><span class="base"><span style="height:1em;vertical-align:-0.25em;" class="strut"></span><span class="mord mathnormal">A</span><span style="margin-right:0.13889em;" class="mord mathnormal">P</span><span class="mord mathnormal">A</span><span class="mopen">(</span><span style="margin-right:0.04398em;" class="mord mathnormal">z</span><span class="mpunct">,</span><span style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">λ</span><span class="mpunct">,</span><span style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">κ</span><span class="mclose">)</span><span style="margin-right:0.2778em;" class="mspace"></span><span class="mrel">=</span><span style="margin-right:0.2778em;" class="mspace"></span></span><span class="base"><span style="height:1em;vertical-align:-0.25em;" class="strut"></span><span class="mopen">(</span><span class="mord mathnormal">λ</span><span class="mord mathnormal">e</span><span class="mord mathnormal">x</span><span class="mord mathnormal">p</span><span class="mopen">(</span><span class="mord">−</span><span class="mord mathnormal">κ</span><span style="margin-right:0.04398em;" class="mord mathnormal">z</span><span class="mclose">)</span><span style="margin-right:0.2222em;" 
class="mspace"></span><span class="mbin">+</span><span style="margin-right:0.2222em;" class="mspace"></span></span><span class="base"><span style="height:1.2312em;vertical-align:-0.25em;" class="strut"></span><span class="mord">1</span><span class="mclose"><span class="mclose">)</span><span class="msupsub"><span class="vlist-t"><span class="vlist-r"><span style="height:0.9812em;" class="vlist"><span style="top:-3.3902em;margin-right:0.05em;"><span style="height:3em;" class="pstrut"></span><span class="sizing reset-size6 size3 mtight"><span class="mord mtight"><span class="mord mtight"><span class="mopen nulldelimiter sizing reset-size3 size6"></span><span class="mfrac"><span class="vlist-t vlist-t2"><span class="vlist-r"><span style="height:0.8443em;" class="vlist"><span style="top:-2.656em;"><span style="height:3em;" class="pstrut"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mtight">−</span><span class="mord mathnormal mtight">λ</span></span></span></span><span style="top:-3.2255em;"><span style="height:3em;" class="pstrut"></span><span style="border-bottom-width:0.049em;" class="frac-line mtight"></span></span><span style="top:-3.384em;"><span style="height:3em;" class="pstrut"></span><span class="sizing reset-size3 size1 mtight"><span class="mord mtight"><span class="mord mtight">1</span></span></span></span></span><span class="vlist-s"></span></span><span class="vlist-r"><span style="height:0.4035em;" class="vlist"><span></span></span></span></span></span><span class="mclose nulldelimiter sizing reset-size3 size6"></span></span></span></span></span></span></span></span></span></span></span></span></span>. APA unifies most activation functions under the same formula, as shown in Figure 1.</p>
|
56 |
<p>APA can be used inside the intermediate layers via the Adaptive Generalised Linear Unit (AGLU): <span class="katex"><span class="katex-mathml"><math xmlns="http://www.w3.org/1998/Math/MathML"><mrow><mi>A</mi><mi>G</mi><mi>L</mi><mi>U</mi><mo stretchy="false">(</mo><mi>z</mi><mo separator="true">,</mo><mi>λ</mi><mo separator="true">,</mo><mi>κ</mi><mo stretchy="false">)</mo><mo>=</mo><mi>z</mi><mi>A</mi><mi>P</mi><mi>A</mi><mo stretchy="false">(</mo><mi>z</mi><mo separator="true">,</mo><mi>λ</mi><mo separator="true">,</mo><mi>κ</mi><mo stretchy="false">)</mo></mrow>AGLU(z,λ,κ) = z APA(z,λ,κ)</math></span><span aria-hidden="true" class="katex-html"><span class="base"><span style="height:1em;vertical-align:-0.25em;" class="strut"></span><span class="mord mathnormal">A</span><span class="mord mathnormal">G</span><span style="margin-right:0.10903em;" class="mord mathnormal">LU</span><span class="mopen">(</span><span style="margin-right:0.04398em;" class="mord mathnormal">z</span><span class="mpunct">,</span><span style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">λ</span><span class="mpunct">,</span><span style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">κ</span><span class="mclose">)</span><span style="margin-right:0.2778em;" class="mspace"></span><span class="mrel">=</span><span style="margin-right:0.2778em;" class="mspace"></span></span><span class="base"><span style="height:1em;vertical-align:-0.25em;" class="strut"></span><span style="margin-right:0.04398em;" class="mord mathnormal">z</span><span class="mord mathnormal">A</span><span style="margin-right:0.13889em;" class="mord mathnormal">P</span><span class="mord mathnormal">A</span><span class="mopen">(</span><span style="margin-right:0.04398em;" class="mord mathnormal">z</span><span class="mpunct">,</span><span style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">λ</span><span class="mpunct">,</span><span 
style="margin-right:0.1667em;" class="mspace"></span><span class="mord mathnormal">κ</span><span class="mclose">)</span></span></span></span>.
|
57 |
The derivatives of AGLU with respect to κ (top), λ (middle) and z (bottom) are shown in Figure 2:</p>
|
58 |
+
|
59 |
+
<div class="row">
|
60 |
+
<div class="col-sm">
|
61 |
<figure class="image text-center">
|
62 |
<img alt="AGLU derivatives" src="https://huggingface.co/spaces/konsa15/AGLU/resolve/main/assets/derivative_visualisations.jpg">
|
63 |
<figcaption> Figure 2: The derivatives of AGLU with respect to κ (top), λ (middle) and z (bottom).</figcaption>
|
64 |
</figure>
|
65 |
+
</div>
|
66 |
+
</div>
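<p>As a rough numerical check of the APA and AGLU formulas, the scalar sketch below evaluates them with the Python standard library only. The function names <code>apa</code>, <code>aglu</code> and the restriction to λ &gt; 0 are assumptions made for illustration, not the authors' implementation. Two special cases follow directly from the formula: λ = 1, κ = 1 gives the sigmoid (so AGLU becomes z·sigmoid(z)), and λ → 0 approaches the Gumbel form exp(−exp(−κz)).</p>

```python
import math


def apa(z: float, lam: float, kappa: float) -> float:
    """APA(z, lam, kappa) = (lam * exp(-kappa * z) + 1) ** (-1 / lam).

    Illustrative scalar version; only lam > 0 is handled here (assumption).
    """
    if lam <= 0:
        raise ValueError("this sketch only covers lam > 0")
    # exp(-log1p(x) / lam) equals (1 + x) ** (-1 / lam) and stays accurate
    # when lam is tiny. Note: math.exp overflows for very negative kappa * z;
    # a production version would need a more careful formulation.
    x = lam * math.exp(-kappa * z)
    return math.exp(-math.log1p(x) / lam)


def aglu(z: float, lam: float, kappa: float) -> float:
    """AGLU(z, lam, kappa) = z * APA(z, lam, kappa)."""
    return z * apa(z, lam, kappa)


if __name__ == "__main__":
    z = 0.5
    sigmoid = 1.0 / (1.0 + math.exp(-z))
    gumbel = math.exp(-math.exp(-z))
    # lam = 1, kappa = 1: APA coincides with the sigmoid.
    print(apa(z, 1.0, 1.0), sigmoid)
    # lam close to 0: APA approaches the Gumbel form.
    print(apa(z, 1e-6, 1.0), gumbel)
    # AGLU with lam = 1, kappa = 1 equals z * sigmoid(z).
    print(aglu(z, 1.0, 1.0), z * sigmoid)
```

<p>Each printed pair should agree closely. In a network, λ and κ would be learnable parameters applied element-wise to tensors, which this scalar sketch does not attempt to model.</p>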
|
67 |
|
68 |
|
69 |
<h3> Simple Code implementation </h3>
|