KULLM3-GGUF / README.md
aashish1904's picture
Upload README.md with huggingface_hub
e0978be verified
|
raw
history blame
4.44 kB
---
library_name: transformers
license: apache-2.0
language:
- en
- ko
base_model:
- upstage/SOLAR-10.7B-v1.0
---
![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)
# QuantFactory/KULLM3-GGUF
This is quantized version of [nlpai-lab/KULLM3](https://huggingface.co/nlpai-lab/KULLM3) created using llama.cpp
# Original Model Card
<a href="https://github.com/nlpai-lab/KULLM">
<img src="kullm_logo.png" width="50%"/>
</a>
# KULLM3
Introducing KULLM3, a model with advanced instruction-following and fluent chat abilities.
It has shown remarkable performance in instruction-following, speficially by closely following gpt-3.5-turbo.
To our knowledge, It is one of the best publicly opened Korean-speaking language models.
For details, visit the [KULLM repository](https://github.com/nlpai-lab/KULLM)
### Model Description
This is the model card of a πŸ€— transformers model that has been pushed on the Hub.
- **Developed by:** [NLP&AI Lab](http://nlp.korea.ac.kr/)
- **Language(s) (NLP):** Korean, English
- **License:** Apache 2.0
- **Finetuned from model:** [upstage/SOLAR-10.7B-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-v1.0)
## Example code
### Install Dependencies
```bash
pip install torch transformers==4.38.2 accelerate
```
- In transformers>=4.39.0, generate() does not work well. (as of 2024.4.4.)
### Python code
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
MODEL_DIR = "nlpai-lab/KULLM3"
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR, torch_dtype=torch.float16).to("cuda")
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
s = "κ³ λ €λŒ€ν•™κ΅μ— λŒ€ν•΄μ„œ μ•Œκ³  μžˆλ‹ˆ?"
conversation = [{'role': 'user', 'content': s}]
inputs = tokenizer.apply_chat_template(
conversation,
tokenize=True,
add_generation_prompt=True,
return_tensors='pt').to("cuda")
_ = model.generate(inputs, streamer=streamer, max_new_tokens=1024)
# λ„€, κ³ λ €λŒ€ν•™κ΅μ— λŒ€ν•΄ μ•Œκ³  μžˆμŠ΅λ‹ˆλ‹€. κ³ λ €λŒ€ν•™κ΅λŠ” λŒ€ν•œλ―Όκ΅­ μ„œμšΈμ— μœ„μΉ˜ν•œ 사립 λŒ€ν•™κ΅λ‘œ, 1905년에 μ„€λ¦½λ˜μ—ˆμŠ΅λ‹ˆλ‹€. 이 λŒ€ν•™κ΅λŠ” ν•œκ΅­μ—μ„œ κ°€μž₯ 였래된 λŒ€ν•™ 쀑 ν•˜λ‚˜λ‘œ, λ‹€μ–‘ν•œ ν•™λΆ€ 및 λŒ€ν•™μ› ν”„λ‘œκ·Έλž¨μ„ μ œκ³΅ν•©λ‹ˆλ‹€. κ³ λ €λŒ€ν•™κ΅λŠ” 특히 법학, κ²½μ œν•™, μ •μΉ˜ν•™, μ‚¬νšŒν•™, λ¬Έν•™, κ³Όν•™ λΆ„μ•Όμ—μ„œ 높은 λͺ…성을 가지고 μžˆμŠ΅λ‹ˆλ‹€. λ˜ν•œ, 슀포츠 λΆ„μ•Όμ—μ„œλ„ ν™œλ°œν•œ ν™œλ™μ„ 보이며, λŒ€ν•œλ―Όκ΅­ λŒ€ν•™ μŠ€ν¬μΈ μ—μ„œ μ€‘μš”ν•œ 역할을 ν•˜κ³  μžˆμŠ΅λ‹ˆλ‹€. κ³ λ €λŒ€ν•™κ΅λŠ” ꡭ제적인 ꡐλ₯˜μ™€ ν˜‘λ ₯에도 적극적이며, μ „ 세계 λ‹€μ–‘ν•œ λŒ€ν•™κ³Όμ˜ ν˜‘λ ₯을 톡해 κΈ€λ‘œλ²Œ 경쟁λ ₯을 κ°•ν™”ν•˜κ³  μžˆμŠ΅λ‹ˆλ‹€.
```
## Training Details
### Training Data
- [vicgalle/alpaca-gpt4](https://huggingface.co/datasets/vicgalle/alpaca-gpt4)
- Mixed Korean instruction data (gpt-generated, hand-crafted, etc)
- About 66000+ examples used totally
### Training Procedure
- Trained with fixed system prompt below.
```text
당신은 κ³ λ €λŒ€ν•™κ΅ NLP&AI μ—°κ΅¬μ‹€μ—μ„œ λ§Œλ“  AI μ±—λ΄‡μž…λ‹ˆλ‹€.
λ‹Ήμ‹ μ˜ 이름은 'KULLM'으둜, ν•œκ΅­μ–΄λ‘œλŠ” 'ꡬ름'을 λœ»ν•©λ‹ˆλ‹€.
당신은 λΉ„λ„λ•μ μ΄κ±°λ‚˜, μ„±μ μ΄κ±°λ‚˜, λΆˆλ²•μ μ΄κ±°λ‚˜ λ˜λŠ” μ‚¬νšŒ ν†΅λ…μ μœΌλ‘œ ν—ˆμš©λ˜μ§€ μ•ŠλŠ” λ°œμ–Έμ€ ν•˜μ§€ μ•ŠμŠ΅λ‹ˆλ‹€.
μ‚¬μš©μžμ™€ 즐겁게 λŒ€ν™”ν•˜λ©°, μ‚¬μš©μžμ˜ 응닡에 κ°€λŠ₯ν•œ μ •ν™•ν•˜κ³  μΉœμ ˆν•˜κ²Œ μ‘λ‹΅ν•¨μœΌλ‘œμ¨ μ΅œλŒ€ν•œ 도와주렀고 λ…Έλ ₯ν•©λ‹ˆλ‹€.
질문이 μ΄μƒν•˜λ‹€λ©΄, μ–΄λ–€ 뢀뢄이 μ΄μƒν•œμ§€ μ„€λͺ…ν•©λ‹ˆλ‹€. 거짓 정보λ₯Ό λ°œμ–Έν•˜μ§€ μ•Šλ„λ‘ μ£Όμ˜ν•©λ‹ˆλ‹€.
```
## Evaluation
- Evaluation details such as testing data, metrics are written in [github](https://github.com/nlpai-lab/KULLM).
- Without system prompt used in training phase, KULLM would show lower performance than expect.
### Results
<img src="kullm3_instruction_evaluation.png" width=100%>
## Citation
```text
@misc{kullm,
author = {NLP & AI Lab and Human-Inspired AI research},
title = {KULLM: Korea University Large Language Model Project},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/nlpai-lab/kullm}},
}
```