---
library_name: transformers
tags:
- unsloth
- llama3
- indonesia
license: llama3
datasets:
- catinthebag/Tumpeng-1-Indonesian
language:
- id
inference: false
---

**ExLlamaV2** quant (**exl2** / **4.25 bpw**) made with ExLlamaV2 v0.1.3

Other EXL2 quants:

| **Quant (bpw)** | **Model Size** | **lm_head (bits)** |
| ----- | ---------- | ------- |
|<center>**[2.2](https://huggingface.co/Zoyd/afrizalha_Kancil-V1-llama3-fp16-2_2bpw_exl2)**</center> | <center>3250 MB</center> | <center>6</center> |
|<center>**[2.5](https://huggingface.co/Zoyd/afrizalha_Kancil-V1-llama3-fp16-2_5bpw_exl2)**</center> | <center>3478 MB</center> | <center>6</center> |
|<center>**[3.0](https://huggingface.co/Zoyd/afrizalha_Kancil-V1-llama3-fp16-3_0bpw_exl2)**</center> | <center>3895 MB</center> | <center>6</center> |
|<center>**[3.5](https://huggingface.co/Zoyd/afrizalha_Kancil-V1-llama3-fp16-3_5bpw_exl2)**</center> | <center>4311 MB</center> | <center>6</center> |
|<center>**[3.75](https://huggingface.co/Zoyd/afrizalha_Kancil-V1-llama3-fp16-3_75bpw_exl2)**</center> | <center>4518 MB</center> | <center>6</center> |
|<center>**[4.0](https://huggingface.co/Zoyd/afrizalha_Kancil-V1-llama3-fp16-4_0bpw_exl2)**</center> | <center>4727 MB</center> | <center>6</center> |
|<center>**[4.25](https://huggingface.co/Zoyd/afrizalha_Kancil-V1-llama3-fp16-4_25bpw_exl2)**</center> | <center>4935 MB</center> | <center>6</center> |
|<center>**[5.0](https://huggingface.co/Zoyd/afrizalha_Kancil-V1-llama3-fp16-5_0bpw_exl2)**</center> | <center>5559 MB</center> | <center>6</center> |
|<center>**[6.0](https://huggingface.co/Zoyd/afrizalha_Kancil-V1-llama3-fp16-6_0bpw_exl2)**</center> | <center>6493 MB</center> | <center>8</center> |
|<center>**[6.5](https://huggingface.co/Zoyd/afrizalha_Kancil-V1-llama3-fp16-6_5bpw_exl2)**</center> | <center>6912 MB</center> | <center>8</center> |
|<center>**[8.0](https://huggingface.co/Zoyd/afrizalha_Kancil-V1-llama3-fp16-8_0bpw_exl2)**</center> | <center>8116 MB</center> | <center>8</center> |

<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Document Title</title>
<style>
  h1 {
    font-size: 36px;
    color: navy;
    font-family: 'Tahoma';
    text-align: center;
  }
</style>
</head>
<body>
<h1>Introducing the Kancil family of open models</h1>
</body>
</html>

<center>
<img src="https://imgur.com/9nG5J1T.png" alt="Kancil" width="600" height="300">
<p><em>Kancil is a fine-tuned version of Llama 3 8B using a synthetic QA dataset generated with Llama 3 70B. Version zero of Kancil is the first generative Indonesian LLM to gain functional instruction performance using solely synthetic data.</em></p>
<p><strong><a href="https://colab.research.google.com/drive/1OOwb6bgFycOODHPcLaJtHk1ObcjG275C?usp=sharing" style="color: blue; font-family: Tahoma;">Go straight to the Colab demo</a></strong></p>
<p><em style="color: black; font-weight: bold;">Beta preview</em></p>
</center>


Selamat datang! (Welcome!)

I am ultra-overjoyed to introduce you to... the 🦌 Kancil! It's a version of Llama 3 8B fine-tuned on Tumpeng, an instruction dataset of 14.8 million words. Both the model and the dataset are openly available on Hugging Face.

The dataset was synthetically generated with Llama 3 70B. A big problem with existing Indonesian instruction datasets is that they are, in reality, not-very-good translations of English datasets. Llama 3 70B, by contrast, can generate fluent Indonesian! (with minor caveats)

This follows previous efforts to build collections of open, fine-tuned Indonesian models, like Merak and Cendol. However, Kancil relies solely on synthetic data, used in a very creative way, which makes it a unique contribution!

### Version 1.0

This is the second working prototype, Kancil V1.

✨ Training
- 2.2x dataset word count
- 2x LoRA parameters
- Rank-stabilized LoRA
- 2x fun
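
For readers unfamiliar with the rank-stabilized LoRA item above: standard LoRA scales the low-rank update by `alpha / r`, while the rank-stabilized variant scales it by `alpha / sqrt(r)`, so the update shrinks more slowly as the rank grows. The sketch below only compares the two factors. The `alpha` and rank values are illustrative assumptions, not Kancil's actual training configuration, and `lora_scale` is a hypothetical helper name.

```python
import math


def lora_scale(alpha: float, rank: int, rank_stabilized: bool = False) -> float:
    """Return the multiplier applied to the low-rank update B @ A."""
    if rank <= 0:
        raise ValueError("rank must be positive")
    # Standard LoRA divides by the rank; rank-stabilized LoRA divides by its square root.
    return alpha / math.sqrt(rank) if rank_stabilized else alpha / rank


# Illustrative values only (assumed, not taken from the Kancil training run).
alpha = 16
for rank in (16, 64):
    standard = lora_scale(alpha, rank)
    stabilized = lora_scale(alpha, rank, rank_stabilized=True)
    print(f"rank={rank}: standard={standard}, rank-stabilized={stabilized}")
```

With a fixed `alpha`, quadrupling the rank cuts the standard factor to a quarter but only halves the rank-stabilized one.
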

✨ New features
- Multi-turn conversation (beta; optimized for curhat, i.e. confiding or venting, and personal advice)
- Better text generation (full or outline writing; optimized for essays)
- QA from text (copy and paste a text into the prompt, then ask a question about it)
- Making slogans
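
The "QA from text" feature can be tried by putting a passage and a question into the user turn of the prompt template from the "Getting started" section. The sketch below is one possible layout, not a documented format: putting the passage first with the question after a blank line is an assumption, and `build_qa_prompt` is a hypothetical helper name.

```python
# Prompt template copied from the "Getting started" section of this card.
PROMPT_TEMPLATE = """<|user|>
{prompt}

<|assistant|>
{response}"""


def build_qa_prompt(passage: str, question: str) -> str:
    """Build a prompt with the passage first and the question after it (assumed layout)."""
    user_turn = f"{passage.strip()}\n\n{question.strip()}"
    # The response slot is left empty so the model writes the answer.
    return PROMPT_TEMPLATE.format(prompt=user_turn, response="")


print(build_qa_prompt(
    "Kancil adalah model bahasa Indonesia berbasis Llama 3 8B.",
    "Kancil dibuat berdasarkan model apa?",
))
```

The resulting string can be passed to the tokenizer in place of the formatted template used in the generation example.
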

This model was fine-tuned with QLoRA using the amazing Unsloth framework! It was built on top of [unsloth/llama-3-8b-bnb-4bit](https://huggingface.co/unsloth/llama-3-8b-bnb-4bit), and the trained adapter was subsequently merged into it.

### Uses

This model was developed for research purposes, for researchers and general AI hobbyists. However, it has one big application: you can have lots of fun with it!

### Out-of-Scope Use

This is a research preview model with minimal safety curation. Do not use this model for commercial or practical applications.

You are also not allowed to use this model without having fun.

### Getting started

As mentioned, this model was trained with Unsloth. Please use its code for a better experience.

```
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Available versions
KancilV1 = "catinthebag/Kancil-V1-llama3-fp16"

# Load the tokenizer and model, then move the model to the GPU if one is available
tokenizer = AutoTokenizer.from_pretrained(KancilV1)
model = AutoModelForCausalLM.from_pretrained(KancilV1)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)
```

```
# This model was trained on this specific prompt template. Changing it might degrade performance.
prompt_template = """<|user|>
{prompt}

<|assistant|>
{response}"""

# Start generating!
inputs = tokenizer(
    [
        prompt_template.format(
            # Indonesian for: "How do I tell my parents that I was rejected by my favorite university?"
            prompt="""Bagaimana cara memberi tahu orang tua kalau saya ditolak universitas favorit saya?""",
            response="",
        )
    ],
    return_tensors="pt",
).to(device)  # reuse the device chosen when loading the model instead of hard-coding "cuda"

# do_sample=True is needed for the temperature setting to take effect
outputs = model.generate(**inputs, max_new_tokens=600, do_sample=True, temperature=0.3, use_cache=True)
# Turn the literal "\n" sequences in the output into real newlines
print(tokenizer.batch_decode(outputs)[0].replace('\\n', '\n'))
```

**Note:** There is an issue with the dataset: newline characters were stored as the literal two-character string `\n`, so the model emits them as text instead of real line breaks. Very sorry about this! Please keep the `.replace()` call to fix the newlines.
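
As a convenience, the newline fix can be wrapped in a small post-processing helper. This is a minimal sketch and not part of the original release: `clean_output` is a hypothetical name, and it assumes the decoded text still contains the `<|assistant|>` marker from the prompt template. Any special tokens in the decoded text are left untouched.

```python
def clean_output(decoded: str, marker: str = "<|assistant|>") -> str:
    """Keep only the assistant's reply and turn literal "\\n" into real newlines."""
    # Everything after the last marker is treated as the model's answer;
    # if the marker is missing, the full text is kept.
    reply = decoded.rsplit(marker, 1)[-1]
    return reply.replace("\\n", "\n").strip()


# Example with a hand-written string standing in for decoded model output.
sample = "<|user|>\nHalo!\n\n<|assistant|>\nHalo juga!\\nApa kabar?"
print(clean_output(sample))
```

In the generation example, this would be applied to `tokenizer.batch_decode(outputs)[0]` in place of the bare `.replace()` call.
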

### Acknowledgments

- **Developed by:** Afrizal Hasbi Azizy
- **License:** Llama 3 Community License Agreement