---
library_name: transformers
license: cc-by-nc-4.0
language:
- en
- zh
base_model:
- meta-llama/Llama-3.2-3B-Instruct
pipeline_tag: text-generation
---

# Kyara: Knowledge Yielding Adaptive Retrieval Augmentation for LLM Fine-tuning

[![DOI](https://zenodo.org/badge/844304447.svg)](https://zenodo.org/badge/latestdoi/844304447)

<p align="left">
    🤗 <a href="https://huggingface.co/zake7749/Llama-3.2-3B-it-chinese-kyara/">Hugging Face</a>&nbsp; | 🚀<a href="https://github.com/zake7749/kyara">Github</a>&nbsp; | &nbsp;📑 <a href="#">Paper</a>&nbsp; | &nbsp;📖 <a href="https://github.com/zake7749/kyara/blob/main/document/README_EN.md">English</a>&nbsp; | &nbsp;📖 <a href="https://github.com/zake7749/kyara">Chinese</a>&nbsp; | &nbsp;💻 <a href="https://www.kaggle.com/code/zake7749/kyara-a-compact-yet-powerful-chinese-llm">Kaggle Notebook</a>
</p>

<div style="text-align: center;">
  <img src="https://i.imgur.com/QiWlcYJ.jpeg" alt="kyara"/>
</div>

Kyara (Knowledge Yielding Adaptive Retrieval Augmentation) is an experimental project that improves language models through a knowledge retrieval process. It aims to strengthen the model's ability to adapt knowledge and to improve language comprehension, particularly in underrepresented languages such as Traditional Chinese. Because Traditional Chinese data is scarce compared to the vast English corpora used for model training, Kyara addresses this gap by expanding the limited corpus for this language.

This is a preview model, with the stable version set to be released soon.
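
## Usage

Below is a minimal inference sketch with the 🤗 `transformers` library, assuming the repository id from the Hugging Face link above. The dtype, prompt, and generation settings are illustrative assumptions rather than values specified by this card.

```python
# Minimal inference sketch (repo id taken from the Hugging Face link above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "zake7749/Llama-3.2-3B-it-chinese-kyara"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumption: bf16 is fine for a 3B model on recent GPUs
    device_map="auto",
)

messages = [
    # "Briefly introduce Taiwan's night market culture."
    {"role": "user", "content": "請簡單介紹台灣的夜市文化。"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

Since the model is fine-tuned from Llama-3.2-3B-Instruct, it uses the same chat-template interface, so `apply_chat_template` handles the prompt formatting.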

## Benchmark

All evaluations are conducted in a zero-shot setting.

| Metric                   | Kyara-3B-it    | Llama-3.2-3B-it |
|--------------------------|----------|-------------|
| **[TMMLUPlus](https://huggingface.co/datasets/ikala/tmmluplus)**            | **42.54** | 40.01    |
| &emsp;- STEM               | **45.17**   | 40.37      |
| &emsp;- Humanities         | **39.66**   | 38.65      |
| &emsp;- Other              | **41.18**   | 39.06      |
| &emsp;- Social-Science     | **44.16**   | 41.98      |
| **[MMLU-Redux](https://github.com/yuchenlin/ZeroEval)**    | **57.24**| 56.91       |
| **[GSM8K](https://github.com/yuchenlin/ZeroEval)**         | **67.25**| 57.16       |
| **[MATH-L5](https://github.com/yuchenlin/ZeroEval)**         | **19.97**| 16.23       |
| **[CRUX](https://github.com/yuchenlin/ZeroEval)**          | **31.25**| 25.25     |
| **[AlpacaEval](https://github.com/tatsu-lab/alpaca_eval)**    | **23.87**| 19.35  |