from dataclasses import dataclass
from enum import Enum
import json


@dataclass
class Task:
    benchmark: str  # task key in the JSON file
    metric: str  # metric key in the JSON file
    col_name: str  # name to display in the leaderboard


# Init: to update with your specific keys
def create_task_list():
    # Task fields: task_key in the JSON file, metric_key in the JSON file,
    # name to display in the leaderboard.
    with open("src/datasets.json") as f:
        data = json.load(f)

    groups = []
    names = []
    for d in data:
        groups.append(d['group'])
        names.append(d['name'])
    # De-duplicate group names; note that set() does not preserve order.
    groups = list(set(groups))

    tasks = []
    grouped_tasks = []
    for name in names:
        tasks.append(Task(name, "metric_name", name))
    for group in groups:
        grouped_tasks.append(Task(group, "metric_name", group))
    return tasks, grouped_tasks
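
# A minimal sketch of how create_task_list() might behave on sample input.
# Assumptions (not from the original): the contents of SAMPLE_DATASETS, the
# group labels "Banking" and "Math", the helper name load_tasks, and the use
# of dict.fromkeys instead of set() to keep the group order stable. The real
# src/datasets.json is not shown here and may contain additional keys.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Task:
    benchmark: str
    metric: str
    col_name: str


# Hypothetical sample entries; only the 'name' and 'group' keys are read.
SAMPLE_DATASETS = [
    {"name": "Banking_QA", "group": "Banking"},
    {"name": "Banking_Exam_MCQ", "group": "Banking"},
    {"name": "GSM8K", "group": "Math"},
]


def load_tasks(path):
    """Build per-dataset and per-group Task lists from a JSON file."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    names = [d["name"] for d in data]
    # dict.fromkeys de-duplicates while keeping first-seen order,
    # whereas iterating over a set() gives no ordering guarantee.
    groups = list(dict.fromkeys(d["group"] for d in data))
    tasks = [Task(n, "metric_name", n) for n in names]
    grouped_tasks = [Task(g, "metric_name", g) for g in groups]
    return tasks, grouped_tasks


# Write the sample to a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "datasets.json"
    sample_path.write_text(json.dumps(SAMPLE_DATASETS), encoding="utf-8")
    tasks, grouped_tasks = load_tasks(sample_path)

print([t.col_name for t in tasks])
print([t.col_name for t in grouped_tasks])
```

# With this sample, each dataset yields one Task and the two "Banking"
# entries collapse into a single grouped Task.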
# Your leaderboard name
TITLE = """<h1 align="center" id="space-title"> Azerbaijani LLM Leaderboard</h1>"""
# What does your leaderboard evaluate?
INTRODUCTION_TEXT = """
Welcome to Kapital Bank's Azerbaijani LLM Leaderboard. We use benchmarks in finance, banking, and general knowledge for accurate evaluations.
🚀 Submit Your Model 🚀
If you have a fine-tuned Azerbaijani LLM, submit it for evaluation!
"""
LLM_BENCHMARKS_TEXT = """
## Azerbaijani Open LLM Leaderboard, sponsored by Kapital Bank
The Azerbaijani Open LLM Leaderboard is a pioneering initiative dedicated to advancing and showcasing Azerbaijani large language models (LLMs). Sponsored by Kapital Bank, it provides a transparent ranking platform for open-source Azerbaijani LLMs and fosters innovation in natural language processing (NLP) for the Azerbaijani language. By creating a space for collaboration and healthy competition, we aim to support researchers, developers, and the broader AI community in improving the quality, accessibility, and practical applications of Azerbaijani-focused LLMs. Through this platform, we hope to bridge language gaps in AI technology and advance multilingual AI, while encouraging the development of AI resources that are locally relevant and globally competitive.
## Partners
Alongside Kapital Bank, several companies and groups collaborated on this initiative: LocalDocs, PRODATA LLC, and the R&D Center of Baku Higher Oil School.
"""
LLM_DATASET_TEXT = """
## Banking_Call_Classification_MC
This dataset consists of 192 rows and 4 columns. It is a multiple-choice dataset in which the model must determine which of the presented categories a client's request to the bank belongs to.
## Banking_Exam_MCQ
A benchmark dataset of 200-300 multiple-choice questions sourced from university exam materials across multiple departments, focused specifically on the banking sector in Azerbaijan.
## Banking_QA
This dataset consists of 97 rows and is a question-answer dataset in the Azerbaijani language about banking.
## Wiki_CQA
This dataset consists of 97 rows in the Azerbaijani language. It is a test set in which each item consists of a context from Wikipedia, questions related to that context, and a prepared answer.
## GSM8K
A benchmark dataset containing 44 rows of diverse grade school math word problems, used to measure a model's ability to solve multi-step mathematical reasoning problems.
## ARC
This benchmark dataset consists of multiple-choice science questions aimed at testing a model's ability to understand and apply elementary scientific knowledge, similar to questions that might appear in standard science exams for students. This version of the dataset is in Azerbaijani, providing an opportunity for models to engage in reasoning and inference in the Azerbaijani language. The dataset is divided into an easy set and a challenge set, with questions requiring reasoning beyond simple fact recall.
## Informatics_MC, Azerbaijani_Lang_MC, History_MC, Physics_MC, Geography_MC, LLM-Literature_MC, Logic_MC, Azerbaijani_Hist_MC, Chemistry_MC, Biology_MC
A comprehensive collection of educational datasets in the Azerbaijani language, covering ten distinct academic disciplines: informatics, Azerbaijani language, world history, physics, geography, literature, logic, Azerbaijani history, chemistry, and biology. Each dataset contains 100 carefully curated multiple-choice questions, designed to assess knowledge and understanding in their respective fields.
"""
EVALUATION_QUEUE_TEXT = """
## Some good practices before submitting a model
### 1) Make sure your model exists on the Hub.
### 2) Make sure your model is public.
## In case of model failure
If your model is displayed in the `FAILED` category, its execution stopped.
Make sure you have followed the above steps first.
Please contact us if you are facing any trouble!
"""