Commit 1f93c1a
caesar-one committed
Parent(s): dfb91d7

Small improvements.
README.md
CHANGED
@@ -15,23 +15,23 @@ license: apache-2.0
 Italian leaderboard
 
 ## Leaderboard
-| Model Name | Year | Publisher | Num.
-
-| [DanteLLM](https://huggingface.co/rstless-research/DanteLLM-7B-Instruct-Italian-v0.1-GGUF) | 2023 | RSTLess (Sapienza University of Rome) | 7B
-| [OpenDanteLLM](https://huggingface.co/rstless-research/) | 2023 | RSTLess (Sapienza University of Rome) | 7B
-| [Mistral v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) | 2023 | Mistral AI | 7B
-| [LLaMAntino](https://huggingface.co/swap-uniba/LLaMAntino-2-7b-hf-ITA) | 2024 | Bari University | 7B
-| [Fauno2](https://huggingface.co/andreabac3/Fauno2-LLaMa2-7B) | 2023 | RSTLess (Sapienza University of Rome) | 7B
-| [Fauno1](https://huggingface.co/andreabac3/Fauno2-LLaMa2-7B) | 2023 | RSTLess (Sapienza University of Rome) | 7B
-| [Camoscio](https://huggingface.co/teelinsan/camoscio-7b-llama) | 2023 | Gladia (Sapienza University of Rome) | 7B
-| [LLaMA2](https://huggingface.co/meta-llama/Llama-2-7b) | 2022 | Meta | 7B
-| [BloomZ](https://huggingface.co/bigscience/bloomz-7b1) | 2022 | BigScience | 7B
-| [iT5](https://huggingface.co/gsarti/it5-large) | 2022 | Groningen University | 738M
-| [GePpeTto](https://huggingface.co/LorenzoDeMattei/GePpeTto) | 2020 | Pisa/Groningen University, FBK, Aptus.AI | 117M
-| [mT5](https://huggingface.co/google/mt5-large) | 2020 | Google | 3.7B
-| [Minerva 3B](https://huggingface.co/sapienzanlp/Minerva-3B-base-v1.0) | 2024 | SapienzaNLP (Sapienza University of Rome) | 3B
-| [Minerva 1B](https://huggingface.co/sapienzanlp/Minerva-1B-base-v1.0) | 2024 | SapienzaNLP (Sapienza University of Rome) | 1B
-| [Minerva 350M](https://huggingface.co/sapienzanlp/Minerva-350M-base-v1.0) | 2024 | SapienzaNLP (Sapienza University of Rome) | 350M
+| Model Name | Year | Publisher | Num. Params | Lang. | Avg. | Avg. (Zero-shot) | Avg. (N-shot) | MMLU (0-shot) | MMLU (5-shot) | ARC Challenge (0-shot) | ARC Challenge (25-shot) | HellaSwag (0-shot) | HellaSwag (10-shot) | TruthfulQA (0-shot) |
+|--------------------------------------------------------------------------------------------|------|-------------------------------------------|-------------|--------------|-------|------------------|---------------|---------------|---------------|------------------------|-------------------------|--------------------|---------------------|-------------------------|
+| [DanteLLM](https://huggingface.co/rstless-research/DanteLLM-7B-Instruct-Italian-v0.1-GGUF) | 2023 | RSTLess (Sapienza University of Rome) | 7B | Italian FT | 47.52 | 47.34 | 47.69 | 47.05 | 48.27 | 41.89 | 47.01 | 47.99 | 47.79 | 52.41 |
+| [OpenDanteLLM](https://huggingface.co/rstless-research/) | 2023 | RSTLess (Sapienza University of Rome) | 7B | Italian FT | 45.97 | 45.13 | 46.80 | 44.25 | 46.89 | 41.72 | 46.76 | 46.49 | 46.75 | 48.06 |
+| [Mistral v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) | 2023 | Mistral AI | 7B | English | 44.29 | 45.15 | 43.43 | 44.66 | 45.84 | 37.46 | 41.47 | 43.48 | 42.99 | 54.99 |
+| [LLaMAntino](https://huggingface.co/swap-uniba/LLaMAntino-2-7b-hf-ITA) | 2024 | Bari University | 7B | Italian FT | 41.66 | 40.86 | 42.46 | 33.89 | 38.74 | 38.22 | 41.72 | 46.30 | 46.91 | 45.03 |
+| [Fauno2](https://huggingface.co/andreabac3/Fauno2-LLaMa2-7B) | 2023 | RSTLess (Sapienza University of Rome) | 7B | Italian FT | 41.74 | 42.90 | 40.57 | 40.30 | 38.32 | 36.26 | 39.33 | 44.25 | 44.07 | 50.77 |
+| [Fauno1](https://huggingface.co/andreabac3/Fauno2-LLaMa2-7B) | 2023 | RSTLess (Sapienza University of Rome) | 7B | Italian FT | 36.91 | 37.20 | 36.61 | 28.79 | 30.45 | 33.10 | 36.52 | 43.13 | 42.86 | 43.78 |
+| [Camoscio](https://huggingface.co/teelinsan/camoscio-7b-llama) | 2023 | Gladia (Sapienza University of Rome) | 7B | Italian FT | 37.22 | 38.01 | 36.42 | 30.53 | 29.38 | 33.28 | 36.60 | 42.91 | 43.29 | 45.33 |
+| [LLaMA2](https://huggingface.co/meta-llama/Llama-2-7b) | 2022 | Meta | 7B | English | 39.50 | 39.14 | 39.86 | 34.12 | 37.91 | 33.28 | 37.71 | 44.31 | 43.97 | 44.83 |
+| [BloomZ](https://huggingface.co/bigscience/bloomz-7b1) | 2022 | BigScience | 7B | Multilingual | 33.97 | 36.01 | 31.93 | 36.40 | 31.67 | 27.30 | 28.24 | 34.83 | 35.88 | 45.52 |
+| [iT5](https://huggingface.co/gsarti/it5-large) | 2022 | Groningen University | 738M | Italian | 29.27 | 32.42 | 26.11 | 23.69 | 24.31 | 27.39 | 27.99 | 28.11 | 26.04 | 50.49 |
+| [GePpeTto](https://huggingface.co/LorenzoDeMattei/GePpeTto) | 2020 | Pisa/Groningen University, FBK, Aptus.AI | 117M | Italian | 27.86 | 30.89 | 24.82 | 22.87 | 24.39 | 24.15 | 25.08 | 26.34 | 24.99 | 50.20 |
+| [mT5](https://huggingface.co/google/mt5-large) | 2020 | Google | 3.7B | Multilingual | 29.00 | 30.99 | 27.01 | 25.56 | 25.60 | 25.94 | 27.56 | 26.96 | 27.86 | 45.50 |
+| [Minerva 3B](https://huggingface.co/sapienzanlp/Minerva-3B-base-v1.0) | 2024 | SapienzaNLP (Sapienza University of Rome) | 3B | Multilingual | 33.94 | 34.37 | 33.52 | 24.62 | 26.50 | 30.29 | 30.89 | 42.38 | 43.16 | 40.18 |
+| [Minerva 1B](https://huggingface.co/sapienzanlp/Minerva-1B-base-v1.0) | 2024 | SapienzaNLP (Sapienza University of Rome) | 1B | Multilingual | 29.78 | 31.46 | 28.09 | 24.69 | 24.94 | 24.32 | 25.25 | 34.01 | 34.07 | 42.84 |
+| [Minerva 350M](https://huggingface.co/sapienzanlp/Minerva-350M-base-v1.0) | 2024 | SapienzaNLP (Sapienza University of Rome) | 350M | Multilingual | 28.35 | 30.72 | 26 | 23.10 | 24.29 | 23.21 | 24.32 | 29.33 | 29.37 | 47.23 |
 
 ## Benchmarks
 
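The leaderboard table added above is what the Space's `main.py` parses at runtime (via `extract_table_and_format_from_markdown_text`). A minimal sketch of how such a GitHub-style pipe table can be turned into a `pandas.DataFrame` — this is a hypothetical re-implementation under the same function name, not the Space's actual code:

```python
import re

import pandas as pd


def extract_table_and_format_from_markdown_text(markdown_table: str) -> pd.DataFrame:
    """Hypothetical sketch: parse a pipe-delimited markdown table,
    dropping the |---| separator row, into a DataFrame."""
    lines = [line.strip().strip("|") for line in markdown_table.strip().splitlines()]
    header = [cell.strip() for cell in lines[0].split("|")]
    rows = [
        [cell.strip() for cell in line.split("|")]
        for line in lines[1:]
        # skip separator rows made only of dashes, colons, pipes, spaces
        if not re.fullmatch(r"[\s|:\-]*", line)
    ]
    return pd.DataFrame(rows, columns=header)


# Toy excerpt of the leaderboard (three columns only, for illustration)
table = """\
| Model Name | Year | Avg. |
|------------|------|------|
| DanteLLM | 2023 | 47.52 |
| Mistral v0.2 | 2023 | 44.29 |"""
df = extract_table_and_format_from_markdown_text(table)
print(df.shape)  # (2, 3)
```

Note that everything comes out as strings; the real app additionally coerces numeric columns (the `pandas.api.types` helpers imported in `main.py` suggest per-column dtype handling downstream).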
main.py
CHANGED
@@ -6,7 +6,7 @@ import streamlit as st
 from pandas.api.types import is_bool_dtype, is_datetime64_any_dtype, is_numeric_dtype
 
 GITHUB_URL = "https://github.com/RSTLess-research/"
-NON_BENCHMARK_COLS = ["
+NON_BENCHMARK_COLS = ["Publisher"]
 
 
 def extract_table_and_format_from_markdown_text(markdown_table: str) -> pd.DataFrame:
@@ -247,7 +247,6 @@ def setup_leaderboard(readme: str):
     leaderboard_table = extract_markdown_table_from_multiline(readme, table_headline="## Leaderboard")
     leaderboard_table = remove_markdown_links(leaderboard_table)
     df_leaderboard = extract_table_and_format_from_markdown_text(leaderboard_table)
-    df_leaderboard["Open?"] = df_leaderboard["Open?"].map({"yes": 1, "no": 0}).astype(bool)
 
     st.markdown("## Leaderboard")
     modify = st.checkbox("Add filters")
@@ -257,11 +256,12 @@ def setup_leaderboard(readme: str):
     df_leaderboard = filter_dataframe_by_column_values(df_leaderboard)
     df_leaderboard = filter_dataframe_by_model_type(df_leaderboard)
 
-    df_leaderboard = df_leaderboard.sort_values(by=['
-    df_leaderboard["Rank"] = df_leaderboard["
+    df_leaderboard = df_leaderboard.sort_values(by=['Avg.'], ascending=False)
+    df_leaderboard["Rank"] = df_leaderboard["Avg."].rank(ascending=False)
     # move rank at 0-th column
     # Ensure 'Rank' is the first column
     cols = ['Rank'] + [col for col in df_leaderboard.columns if col != 'Rank']
+
     df_leaderboard = df_leaderboard[cols]
 
     print(df_leaderboard.columns)
@@ -316,10 +316,12 @@ def setup_disclaimer():
     st.markdown("## Authors")
     st.markdown(
         """
-        - [Andrea Bacciu](https://www.linkedin.com/in/andreabacciu/) (Work done prior joining Amazon)
-        - [Cesare Campagnano](https://www.linkedin.com/in/caesar-one/)
+        - [Andrea Bacciu](https://www.linkedin.com/in/andreabacciu/)* (Work done prior joining Amazon)
+        - [Cesare Campagnano](https://www.linkedin.com/in/caesar-one/)*
         - [Giovanni Trappolini](https://www.linkedin.com/in/giovanni-trappolini/)
         - [Professor Fabrizio Silvestri](https://www.linkedin.com/in/fabrizio-silvestri-a6b0391/)
+
+        \*Equal contribution
+        """
     )
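The new ranking lines in `setup_leaderboard` can be exercised in isolation: sort best-first on `Avg.`, derive a 1-based `Rank` from the same column, then move `Rank` to the front. A minimal sketch with toy data (not the real leaderboard scores):

```python
import pandas as pd

# Toy leaderboard standing in for the parsed README table.
df_leaderboard = pd.DataFrame({
    "Model Name": ["LLaMA2", "DanteLLM", "Mistral v0.2"],
    "Avg.": [39.50, 47.52, 44.29],
})

# Sort best-first, then assign a 1-based rank from the same column.
df_leaderboard = df_leaderboard.sort_values(by=["Avg."], ascending=False)
df_leaderboard["Rank"] = df_leaderboard["Avg."].rank(ascending=False)

# Ensure 'Rank' is the first column.
cols = ["Rank"] + [col for col in df_leaderboard.columns if col != "Rank"]
df_leaderboard = df_leaderboard[cols]
print(df_leaderboard.columns.tolist())  # ['Rank', 'Model Name', 'Avg.']
```

`rank(ascending=False)` returns floats (1.0, 2.0, ...) and averages ties by default; since the rows are already sorted on the same column, the rank column increases monotonically down the displayed table.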