Update README.md
README.md
@@ -5,7 +5,7 @@ license: apache-2.0
 
 # Ukrainian finetuned Mistral-7B-Instruct-v0.2
 
-
+Supervised finetuning of Mistral-7B-Instruct-v0.2 on ukrainian datasets.
 
 
 ## Instruction format
@@ -24,7 +24,15 @@ This instruction model is based on Mistral-7B-v0.2, a transformer model with the
 - Grouped-Query Attention
 - Sliding-Window Attention
 - Byte-fallback BPE tokenizer
-
+
+## Datasets
+- [UA-SQUAD](https://huggingface.co/datasets/FIdo-AI/ua-squad/resolve/main/ua_squad_dataset.json)
+- [Ukrainian StackExchange](https://huggingface.co/datasets/zeusfsx/ukrainian-stackexchange)
+- [UAlpaca Dataset](https://github.com/robinhad/kruk/blob/main/data/cc-by-nc/alpaca_data_translated.json)
+- [Ukrainian Subset from Belebele Dataset](https://github.com/facebookresearch/belebele)
+- [Ukrainian Subset from XQA](https://github.com/thunlp/XQA)
+- TODO - Ukrainian Subset from MKQA
+
 ## 💻 Usage
 
 ```python