---
library_name: transformers
license: apache-2.0
language:
- en
- ko
base_model:
- upstage/SOLAR-10.7B-v1.0
---

![](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)

# QuantFactory/KULLM3-GGUF
This is a quantized version of [nlpai-lab/KULLM3](https://huggingface.co/nlpai-lab/KULLM3), created using llama.cpp.
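
A common way to run a GGUF quant like this one is with llama.cpp's CLI. A minimal sketch — the exact `.gguf` filename below is an assumption, so pick a real one from this repo's file list:

```shell
# Download one quant from this repo (filename is hypothetical; check the Files tab).
huggingface-cli download QuantFactory/KULLM3-GGUF KULLM3.Q4_K_M.gguf --local-dir .

# Chat with llama.cpp's CLI; -cnv enables interactive conversation mode.
./llama-cli -m KULLM3.Q4_K_M.gguf -cnv
```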

# Original Model Card


<a href="https://github.com/nlpai-lab/KULLM">
  <img src="kullm_logo.png" width="50%"/>
</a>

# KULLM3
Introducing KULLM3, a model with advanced instruction-following and fluent chat abilities.
It has shown remarkable performance in instruction following, specifically by closely tracking gpt-3.5-turbo.
To our knowledge, it is one of the best openly available Korean-speaking language models.

For details, visit the [KULLM repository](https://github.com/nlpai-lab/KULLM).

### Model Description

This is the model card of a πŸ€— transformers model that has been pushed to the Hub.

- **Developed by:** [NLP&AI Lab](http://nlp.korea.ac.kr/)
- **Language(s) (NLP):** Korean, English
- **License:** Apache 2.0
- **Finetuned from model:** [upstage/SOLAR-10.7B-v1.0](https://huggingface.co/upstage/SOLAR-10.7B-v1.0)

## Example code

### Install Dependencies
```bash
pip install torch transformers==4.38.2 accelerate
```

- With transformers>=4.39.0, `generate()` does not work well (as of 2024-04-04), hence the pinned version above.
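
The version caveat above can be enforced at runtime. A small sketch — the helper names are our own, and the `(4, 39, 0)` threshold comes from the note above; in practice you would pass in `transformers.__version__`:

```python
def version_tuple(v):
    """Parse a version string like '4.38.2' into (4, 38, 2), tolerating suffixes."""
    parts = []
    for piece in v.split("."):
        digits = "".join(ch for ch in piece if ch.isdigit())
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)

def generate_is_known_good(transformers_version):
    # As of 2024-04-04, generate() misbehaves on transformers >= 4.39.0.
    return version_tuple(transformers_version) < (4, 39, 0)

print(generate_is_known_good("4.38.2"))  # True
print(generate_is_known_good("4.39.0"))  # False
```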

### Python code
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

MODEL_DIR = "nlpai-lab/KULLM3"
model = AutoModelForCausalLM.from_pretrained(MODEL_DIR, torch_dtype=torch.float16).to("cuda")
tokenizer = AutoTokenizer.from_pretrained(MODEL_DIR)
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

s = "κ³ λ €λŒ€ν•™κ΅μ— λŒ€ν•΄μ„œ μ•Œκ³  μžˆλ‹ˆ?"  # "Do you know about Korea University?"
conversation = [{'role': 'user', 'content': s}]
inputs = tokenizer.apply_chat_template(
    conversation,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors='pt').to("cuda")
_ = model.generate(inputs, streamer=streamer, max_new_tokens=1024)

# Example output (translated from Korean): "Yes, I know about Korea University. Korea University
# is a private university located in Seoul, South Korea, founded in 1905. It is one of the oldest
# universities in Korea and offers a wide range of undergraduate and graduate programs. It is
# especially renowned in law, economics, political science, sociology, literature, and the
# sciences. It is also active in sports and plays an important role in Korean collegiate
# athletics. The university actively pursues international exchange and cooperation, strengthening
# its global competitiveness through partnerships with universities around the world."
```


## Training Details

### Training Data

- [vicgalle/alpaca-gpt4](https://huggingface.co/datasets/vicgalle/alpaca-gpt4)
- Mixed Korean instruction data (GPT-generated, hand-crafted, etc.)
- About 66,000+ examples used in total

### Training Procedure

- Trained with the fixed system prompt below.

```text
당신은 κ³ λ €λŒ€ν•™κ΅ NLP&AI μ—°κ΅¬μ‹€μ—μ„œ λ§Œλ“  AI μ±—λ΄‡μž…λ‹ˆλ‹€.
λ‹Ήμ‹ μ˜ 이름은 'KULLM'으둜, ν•œκ΅­μ–΄λ‘œλŠ” 'ꡬ름'을 λœ»ν•©λ‹ˆλ‹€.
당신은 λΉ„λ„λ•μ μ΄κ±°λ‚˜, μ„±μ μ΄κ±°λ‚˜, λΆˆλ²•μ μ΄κ±°λ‚˜ λ˜λŠ” μ‚¬νšŒ ν†΅λ…μ μœΌλ‘œ ν—ˆμš©λ˜μ§€ μ•ŠλŠ” λ°œμ–Έμ€ ν•˜μ§€ μ•ŠμŠ΅λ‹ˆλ‹€.
μ‚¬μš©μžμ™€ 즐겁게 λŒ€ν™”ν•˜λ©°, μ‚¬μš©μžμ˜ 응닡에 κ°€λŠ₯ν•œ μ •ν™•ν•˜κ³  μΉœμ ˆν•˜κ²Œ μ‘λ‹΅ν•¨μœΌλ‘œμ¨ μ΅œλŒ€ν•œ 도와주렀고 λ…Έλ ₯ν•©λ‹ˆλ‹€.
질문이 μ΄μƒν•˜λ‹€λ©΄, μ–΄λ–€ 뢀뢄이 μ΄μƒν•œμ§€ μ„€λͺ…ν•©λ‹ˆλ‹€. 거짓 정보λ₯Ό λ°œμ–Έν•˜μ§€ μ•Šλ„λ‘ μ£Όμ˜ν•©λ‹ˆλ‹€.
```

(Translation: "You are an AI chatbot created by the NLP&AI Lab at Korea University. Your name is 'KULLM', which means 'cloud' in Korean. You do not make statements that are immoral, sexual, illegal, or otherwise socially unacceptable. You converse pleasantly with users and try to help them as much as possible by responding as accurately and kindly as you can. If a question is strange, you explain which part is strange. Be careful not to state false information.")
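
Because KULLM underperforms without its training-time system prompt, it helps to prepend that prompt to every conversation before applying the chat template. A minimal sketch — the `build_conversation` helper name is our own:

```python
# The fixed system prompt KULLM3 was trained with (see Training Procedure).
SYSTEM_PROMPT = (
    "당신은 κ³ λ €λŒ€ν•™κ΅ NLP&AI μ—°κ΅¬μ‹€μ—μ„œ λ§Œλ“  AI μ±—λ΄‡μž…λ‹ˆλ‹€.\n"
    "λ‹Ήμ‹ μ˜ 이름은 'KULLM'으둜, ν•œκ΅­μ–΄λ‘œλŠ” 'ꡬ름'을 λœ»ν•©λ‹ˆλ‹€.\n"
    "당신은 λΉ„λ„λ•μ μ΄κ±°λ‚˜, μ„±μ μ΄κ±°λ‚˜, λΆˆλ²•μ μ΄κ±°λ‚˜ λ˜λŠ” μ‚¬νšŒ ν†΅λ…μ μœΌλ‘œ ν—ˆμš©λ˜μ§€ μ•ŠλŠ” λ°œμ–Έμ€ ν•˜μ§€ μ•ŠμŠ΅λ‹ˆλ‹€.\n"
    "μ‚¬μš©μžμ™€ 즐겁게 λŒ€ν™”ν•˜λ©°, μ‚¬μš©μžμ˜ 응닡에 κ°€λŠ₯ν•œ μ •ν™•ν•˜κ³  μΉœμ ˆν•˜κ²Œ μ‘λ‹΅ν•¨μœΌλ‘œμ¨ μ΅œλŒ€ν•œ 도와주렀고 λ…Έλ ₯ν•©λ‹ˆλ‹€.\n"
    "질문이 μ΄μƒν•˜λ‹€λ©΄, μ–΄λ–€ 뢀뢄이 μ΄μƒν•œμ§€ μ„€λͺ…ν•©λ‹ˆλ‹€. 거짓 정보λ₯Ό λ°œμ–Έν•˜μ§€ μ•Šλ„λ‘ μ£Όμ˜ν•©λ‹ˆλ‹€."
)

def build_conversation(user_message: str) -> list:
    """Return a chat-template-ready message list with the fixed system prompt first."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_message},
    ]

conversation = build_conversation("κ³ λ €λŒ€ν•™κ΅μ— λŒ€ν•΄μ„œ μ•Œκ³  μžˆλ‹ˆ?")
# Pass `conversation` to tokenizer.apply_chat_template(...) as in the example code above.
```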

## Evaluation

- Evaluation details, such as test data and metrics, are documented on [GitHub](https://github.com/nlpai-lab/KULLM).
- Without the system prompt used in the training phase, KULLM shows lower performance than expected.

### Results

<img src="kullm3_instruction_evaluation.png" width=100%>


## Citation

```text
@misc{kullm,
  author = {NLP & AI Lab and Human-Inspired AI research},
  title = {KULLM: Korea University Large Language Model Project},
  year = {2023},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/nlpai-lab/kullm}},
}
```