Model Card for aeolian83/Llama-3-Open-Ko-8B-aeolian83-chatvec

Following the chat-vector paper (https://arxiv.org/abs/2310.04799v2), this model applies the difference between the parameters of the Llama 3 pre-trained model and its instruction-tuned model to beomi's Llama-3-Open-Ko-8B.

The weight update follows maywell's methodology (https://huggingface.co/blog/maywell/llm-feature-transfer).

Because the merge was carried out on a system with 64GB of RAM, the weights were processed in bf16.

Metric

Results for aeolian83/Llama-3-Open-Ko-8B-aeolian83-chatvec:

| Task (metric) | 0-shot | 5-shot | 10-shot |
|---|---|---|---|
| kobest_boolq (macro_f1) | 0.64898 | 0.603325 | 0.575417 |
| kobest_copa (macro_f1) | 0.682517 | 0.706718 | 0.693293 |
| kobest_hellaswag (macro_f1) | 0.42651 | 0.391038 | 0.386523 |
| kobest_sentineg (macro_f1) | 0.501351 | 0.861108 | 0.876122 |
| kohatespeech (macro_f1) | 0.252714 | 0.330103 | 0.305009 |
| kohatespeech_apeach (macro_f1) | 0.337667 | 0.536842 | 0.526639 |
| kohatespeech_gen_bias (macro_f1) | 0.124535 | 0.512855 | 0.457998 |
| korunsmile (f1) | 0.358703 | 0.330155 | 0.32824 |
| nsmc (acc) | 0.59726 | 0.75206 | 0.74702 |
| pawsx_ko (acc) | 0.5195 | 0.513 | 0.4805 |

Used Models

  • Base model (base checkpoint used to compute the weight diff): meta-llama/Meta-Llama-3-8B
  • Chat model (instruction model that provides the weight diff): meta-llama/Meta-Llama-3-8B-Instruct
  • Target model (model the weight diff is applied to, to add instruction-following): beomi/Llama-3-Open-Ko-8B