|
--- |
|
license: apache-2.0 |
|
datasets: |
|
- Kundyzka/informatics_kaz |
|
language: |
|
- kk |
|
metrics: |
|
- name: F1 (Before Training) |
|
  type: F1 Score |
|
  value: 31.405 |
|
- name: Exact Match (Before Training) |
|
  type: Exact Match |
|
  value: 14.675 |
|
- name: F1 (After Training) |
|
  type: F1 Score |
|
  value: 56.819 |
|
- name: Exact Match (After Training) |
|
  type: Exact Match |
|
  value: 35.454 |
|
base_model: |
|
- Kyrmasch/t5-kazakh-qa |
|
new_version: Kundyzka/t5-kazakh-qa-informatics-kaz |
|
pipeline_tag: question-answering |
|
library_name: adapter-transformers |
|
--- |
|
|
|
# Description |
|
|
|
This model was developed by **Kundyz Maksutova**, PhD Candidate, as part of research on improving question-answering systems in the Kazakh language. It is `Kyrmasch/t5-kazakh-qa` fine-tuned on the `Kundyzka/informatics_kaz` dataset, and it targets question answering in Kazakh within computer science and related fields. |
|
|
|
### Key Features: |
|
- **Developer**: Kundyz Maksutova, PhD Candidate |
|
- **Base Model**: `Kyrmasch/t5-kazakh-qa` |
|
- **Dataset**: `Kundyzka/informatics_kaz` |
|
- **Language**: Kazakh (`kk`) |
|
- **Task**: Question Answering |
|
|
|
### Performance: |
|
Fine-tuning raises the F1 score by about 25.4 points and Exact Match by about 20.8 points, as shown by the following metrics: |
|
|
|
- **Before Training**: |
|
  - F1 Score: 31.405 |
|
  - Exact Match (EM): 14.675 |
|
- **After Training**: |
|
  - F1 Score: 56.819 |
|
  - Exact Match (EM): 35.454 |
|
|
|
These gains indicate that the model answers domain-specific questions more accurately after training on the `Kundyzka/informatics_kaz` dataset. |
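
This card does not state how F1 and Exact Match were computed. As a rough illustration only, the sketch below assumes SQuAD-style scoring: Exact Match compares normalized strings, and F1 is the harmonic mean of token-level precision and recall. The normalization (lowercasing and stripping ASCII punctuation) and all function names are assumptions made for this example, not the author's evaluation script.

```python
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop ASCII punctuation, and split on whitespace (assumed normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction, reference):
    """Return 1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction, reference):
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction)
    ref_tokens = normalize(reference)
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it appears on both sides.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate(predictions, references):
    """Average both metrics over a set of answer pairs and scale to 0-100."""
    n = len(references)
    em = 100.0 * sum(exact_match(p, r) for p, r in zip(predictions, references)) / n
    f1 = 100.0 * sum(f1_score(p, r) for p, r in zip(predictions, references)) / n
    return {"exact_match": em, "f1": f1}
```

Under these assumptions, a partially correct answer such as `жедел жад` against the reference `жедел жад құрылғысы` scores 0 on Exact Match but 0.8 on F1, which is why F1 is consistently higher than EM in the figures above.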
|
|
|
### Dataset: |
|
The `Kundyzka/informatics_kaz` dataset is curated to provide a diverse set of questions and answers in Kazakh, primarily targeting topics in computer science. Training on it exposes the model to domain-specific terminology that general-purpose Kazakh data may not cover. |
|
|
|
### Intended Use: |
|
This model is designed for answering questions in the Kazakh language, with applications in: |
|
- **Educational Platforms**: Supporting students in learning computer science. |
|
- **Research Projects**: Facilitating studies in Kazakh natural language processing. |
|
- **Applications**: Powering intelligent systems like chatbots or question-answering assistants. |
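
For the uses listed above, a minimal inference sketch with the Hugging Face `transformers` library might look like the following. Several details are assumptions: the model ID is the one listed under `new_version` in the metadata and stands in for whichever repository you actually load; the `question: ... context: ...` input format is a common T5 QA convention, not something this card confirms; and the checkpoint is assumed to load with the standard seq2seq classes even though the metadata names `adapter-transformers`. The helper names `build_prompt` and `answer` are hypothetical.

```python
# Assumption: ID taken from `new_version` in the metadata; substitute the repository you intend to use.
MODEL_ID = "Kundyzka/t5-kazakh-qa-informatics-kaz"


def build_prompt(question, context):
    """Format one input string.

    Assumption: a T5-style "question: ... context: ..." prompt; check the
    base model's card for the format it was actually trained with.
    """
    return f"question: {question.strip()} context: {context.strip()}"


def answer(question, context, model_id=MODEL_ID, max_new_tokens=64):
    """Generate an answer for a Kazakh question given a supporting passage."""
    # Imported lazily so build_prompt stays usable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(
        build_prompt(question, context),
        return_tensors="pt",
        truncation=True,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `answer(question, context)` downloads the checkpoint on first use. For a chatbot or assistant, load the tokenizer and model once at startup rather than on every call.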
|
|
|
### Limitations and Ethical Considerations: |
|
- **Domain-Specific Bias**: Performance may drop on topics outside computer science. |
|
- **Dataset Bias**: Potential biases from the dataset can influence model outputs. |
|
- **Language Support**: The model is optimized for Kazakh and does not support other languages. |
|
|
|
### Tags: |
|
- `computerscience` |
|
- `question-answering` |
|
- `Kazakh` |
|
|
|
This model is intended as a step toward better natural language processing tools for low-resource languages such as Kazakh. For further details or customization, refer to the model repository. |