Update README.md
README.md
CHANGED
@@ -5,16 +5,16 @@ datasets:
 language:
 - kk
 metrics:
-- name: F1
-  type: F1 Score
   value: 31.405
-- name: Exact Match (
   type: Exact Match
   value: 14.675
-- name: F1 (
   type: F1 Score
   value: 56.819
-- name: Exact Match (
   type: Exact Match
   value: 35.454
 base_model:
@@ -33,13 +33,18 @@ This model was developed by **Kundyz Maksutova**, PhD Candidate, as part of rese
 - **Dataset**: `Kundyzka/informatics_kaz`
 - **Language**: Kazakh (`kk`)
 - **Task**: Question Answering
-
-
-
-
-
-
-

 ### Dataset:
 The `Kundyzka/informatics_kaz` dataset is curated to provide a diverse set of questions and answers in Kazakh, primarily targeting topics in computer science. This dataset ensures the model handles domain-specific terminology effectively.
@@ -54,3 +59,10 @@ This model is designed for answering questions in the Kazakh language, with appl
 - **Domain-Specific Bias**: Performance may drop on topics outside computer science.
 - **Dataset Bias**: Potential biases from the dataset can influence model outputs.
 - **Language Support**: The model is optimized for Kazakh and does not support other languages.
 language:
 - kk
 metrics:
+- name: F1 (Before Training)
+  type: F1 Score
   value: 31.405
+- name: Exact Match (Before Training)
   type: Exact Match
   value: 14.675
+- name: F1 (After Training)
   type: F1 Score
   value: 56.819
+- name: Exact Match (After Training)
   type: Exact Match
   value: 35.454
 base_model:

 - **Dataset**: `Kundyzka/informatics_kaz`
 - **Language**: Kazakh (`kk`)
 - **Task**: Question Answering
+
+### Performance:
+This model demonstrates significant improvements after fine-tuning, as shown by the following metrics:
+
+- **Before Training**:
+  - F1 Score: 31.405
+  - Exact Match (EM): 14.675
+- **After Training**:
+  - F1 Score: 56.819
+  - Exact Match (EM): 35.454
+
+These metrics highlight the enhanced ability of the model to handle domain-specific questions after training on the `Kundyzka/informatics_kaz` dataset.

 ### Dataset:
 The `Kundyzka/informatics_kaz` dataset is curated to provide a diverse set of questions and answers in Kazakh, primarily targeting topics in computer science. This dataset ensures the model handles domain-specific terminology effectively.

 - **Domain-Specific Bias**: Performance may drop on topics outside computer science.
 - **Dataset Bias**: Potential biases from the dataset can influence model outputs.
 - **Language Support**: The model is optimized for Kazakh and does not support other languages.
+
+### Tags:
+- `computerscience`
+- `question-answering`
+- `Kazakh`
+
+This model represents a significant step toward advancing natural language processing tools for low-resource languages like Kazakh. For further details or customization, refer to the model repository.
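
The README reports F1 and Exact Match before and after training but does not say how the scores were computed. As a rough sketch only, assuming the common SQuAD-style definitions (token-overlap F1 and normalized string equality), the two metrics and the reported gains could be reproduced as follows. The helper names `normalize`, `exact_match`, and `f1_score` are hypothetical and do not come from the repository, and the normalization (lowercasing plus ASCII punctuation stripping) is an assumption.

```python
import string
from collections import Counter


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace (assumed normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return text.split()


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized token sequences are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Scores as reported in the model card (percent).
before = {"f1": 31.405, "em": 14.675}
after = {"f1": 56.819, "em": 35.454}

# Absolute gains in percentage points.
gains = {name: round(after[name] - before[name], 3) for name in before}
```

With the reported values, `gains` works out to 25.414 points of F1 and 20.779 points of Exact Match. Corpus-level scores would be the per-example values averaged over the evaluation set and multiplied by 100.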