---
license: apache-2.0
datasets:
- Kundyzka/informatics_kaz
language:
- kk
metrics:
- name: F1 (Before Training)
  type: F1 Score
  value: 24.586
- name: Exact Match (Before Training)
  type: Exact Match
  value: 11.818
- name: F1 (After Training)
  type: F1 Score
  value: 63.317
- name: Exact Match (After Training)
  type: Exact Match
  value: 43.162
base_model:
- google-bert/bert-base-multilingual-cased
new_version: Kundyzka/bert-base-multilingual-informatics-kaz
pipeline_tag: question-answering
library_name: adapter-transformers
tags:
- computerscience
- informatics
---

# Description

This model is `google-bert/bert-base-multilingual-cased` fine-tuned on the `Kundyzka/informatics_kaz` dataset. Developed by **Kundyz Maksutova**, PhD Candidate, it is optimized for question answering in the Kazakh language, with a focus on computer science and informatics.

### Key Features:

- **Developer**: Kundyz Maksutova, PhD Candidate
- **Base Model**: `google-bert/bert-base-multilingual-cased`
- **Dataset**: `Kundyzka/informatics_kaz`
- **Language**: Kazakh (`kk`)
- **Task**: Question Answering (`pipeline_tag: question-answering`)
- **Library**: `adapter-transformers`

### Performance:

Fine-tuning raises the F1 score by 38.731 points and Exact Match by 31.344 points:

- **Before Training**:
  - F1 Score: 24.586
  - Exact Match (EM): 11.818
- **After Training**:
  - F1 Score: 63.317
  - Exact Match (EM): 43.162

Both sets of scores were measured on the `Kundyzka/informatics_kaz` dataset, so they describe the model's ability to answer domain-specific informatics questions rather than general-purpose Kazakh question answering.
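
The card does not say how F1 and Exact Match were computed. The sketch below assumes the common SQuAD-style definitions (token-overlap F1 and normalized string equality), scaled to 0-100 like the scores above. The function names and the normalization rules are illustrative assumptions, not the author's evaluation script; the English article-stripping step of the original SQuAD script is left out because it does not apply to Kazakh.

```python
import re
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, replace punctuation with spaces, split on whitespace.

    Assumed normalization; `\\w` is Unicode-aware, so Cyrillic letters are kept.
    """
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Harmonic mean of token precision and recall (bag-of-tokens overlap)."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def evaluate(predictions: list[str], references: list[str]) -> dict[str, float]:
    """Average both metrics over a non-empty dataset, as percentages."""
    n = len(predictions)
    em = sum(exact_match(p, r) for p, r in zip(predictions, references)) / n
    f1 = sum(f1_score(p, r) for p, r in zip(predictions, references)) / n
    return {"exact_match": 100 * em, "f1": 100 * f1}
```

Under these definitions a prediction that contains the reference answer plus extra words scores 0 on Exact Match but still earns partial F1 credit, which is one reason F1 is higher than EM in both rows above.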

### Intended Use:

This model is intended for question-answering applications in the Kazakh language. Potential use cases include:

- **Educational Platforms**: Assisting students with queries in informatics and computer science.
- **Research Projects**: Supporting research in Kazakh natural language processing.
- **AI Applications**: Enhancing intelligent systems, chatbots, or virtual assistants requiring Kazakh language support.
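
For applications like these, a minimal sketch of querying the model might look as follows. It assumes the checkpoint loads through the standard `transformers` question-answering pipeline; because the card lists `adapter-transformers` as its library, the actual loading procedure may differ. The model ID is taken from the `new_version` field of the metadata and should be replaced with the ID of the repository you are using. The helper names, the `min_score` threshold, and the sample question and context are hypothetical.

```python
def load_qa_pipeline(model_id: str):
    """Build a question-answering pipeline (downloads weights on first call)."""
    # Imported lazily so the helpers below can be used without `transformers`.
    from transformers import pipeline

    return pipeline("question-answering", model=model_id)


def answer_question(qa, question: str, context: str, min_score: float = 0.0):
    """Return the extracted answer, or None if the score is below `min_score`.

    `qa` is any callable that accepts `question=` and `context=` and returns
    a dict with "answer" and "score" keys, as the pipeline above does.
    `min_score` is a hypothetical cutoff for rejecting low-confidence answers.
    """
    result = qa(question=question, context=context)
    if result["score"] < min_score:
        return None
    return result["answer"]


def demo():
    """Illustrative run; needs network access and `transformers` installed."""
    # ID from the card's `new_version` field; substitute your repository ID.
    qa = load_qa_pipeline("Kundyzka/bert-base-multilingual-informatics-kaz")
    # Context: "The processor is the computer's main computing device."
    context = "Процессор компьютердің негізгі есептеу құрылғысы болып табылады."
    # Question: "What is a processor?"
    question = "Процессор дегеніміз не?"
    print(answer_question(qa, question, context, min_score=0.1))
```

Calling `demo()` runs the example end to end. A suitable `min_score` depends on the application and would need to be tuned on held-out questions.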

### Limitations:

- **Domain-Specific Training**: The model is optimized for informatics and computer science topics, and performance may degrade on questions from other domains.
- **Language Support**: The model was fine-tuned only on Kazakh data. Although the base model is multilingual, this card reports no results for other languages, so the model should not be relied on for multilingual tasks.
- **Bias**: Biases present in the training dataset may carry over into model outputs.

### Tags:

- `computerscience`
- `informatics`
- `question-answering`
- `Kazakh`
- `adapter-transformers`

This model is a step toward practical question-answering systems for low-resource languages such as Kazakh. For further details, customization, or additional fine-tuning, refer to the model repository.