ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’
Task Book (์‚ฌํšŒ๊ณผํ•™, ๊ธฐ์ˆ ๊ณผํ•™, ์ฒ ํ•™, ๋ฒ•ํ•™, ์˜ˆ์ˆ  ๋“ฑ)
๋ฐ์ดํ„ฐ ํฌ๊ธฐ 5000๊ฐœ
๋ชจ๋ธ qlora
max_seq_length 1024
num_train_epochs 3
per_device_train_batch_size 8
gradient_accumulation_steps 32
evaluation_strategy "steps"
eval_steps 2000
logging_steps 25
optim "paged_adamw_8bit"
learning_rate 2e-4
lr_scheduler_type "cosine"
warmup_steps 10
warmup_ratio 0.05
report_to "tensorboard"
weight_decay 0.01
max_steps -1
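
The hyperparameters above correspond to a fairly standard QLoRA fine-tune. The sketch below is a hypothetical reconstruction with `transformers`, `peft`, and `trl`; the base checkpoint, LoRA rank/alpha/dropout, dataset file, text column, and `fp16` compute dtype are assumptions not stated in this card, and the argument names follow trl ≤ 0.8 (newer releases move `max_seq_length` and `dataset_text_field` into `SFTConfig`).

```python
# Hypothetical QLoRA training sketch reconstructed from the table above.
# Base model, LoRA settings, dataset path, and dtype are assumptions.
import torch
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig
from trl import SFTTrainer

base_model = "yanolja/EEVE-Korean-Instruct-10.8B-v1.0"  # assumed base checkpoint

# 4-bit quantization, as implied by "QLoRA" + paged_adamw_8bit above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model, quantization_config=bnb_config, device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# LoRA rank/alpha/dropout are not listed in the card; these are placeholders
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM")

# ~5,000 prompt-formatted examples; file name and "text" column are hypothetical
dataset = load_dataset("json", data_files="book_summaries.jsonl", split="train")
splits = dataset.train_test_split(test_size=0.1, seed=42)

# Values below are taken directly from the parameter table
args = TrainingArguments(
    output_dir="eeve-10.8-book-v0.1",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    gradient_accumulation_steps=32,
    evaluation_strategy="steps",
    eval_steps=2000,
    logging_steps=25,
    optim="paged_adamw_8bit",
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=10,
    warmup_ratio=0.05,
    report_to="tensorboard",
    weight_decay=0.01,
    max_steps=-1,
    fp16=True,  # assumed compute precision
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    args=args,
    train_dataset=splits["train"],
    eval_dataset=splits["test"],
    peft_config=peft_config,
    dataset_text_field="text",
    max_seq_length=1024,
)
trainer.train()
```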

## Summary

### Book

| Model | Rouge-1 | Rouge-2 | Rouge-L |
| --- | --- | --- | --- |
| *ryanu/EEVE-10.8-BOOK-v0.1 | 0.2454 | 0.1158 | 0.2404 |
| meta-llama/llama-3-70b-instruct | 0.2269 | 0.0925 | 0.2186 |
| meta-llama/llama-3-8b-instruct | 0.2137 | 0.0883 | 0.2020 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.2095 | 0.0866 | 0.1985 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1735 | 0.0516 | 0.1668 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1724 | 0.0534 | 0.1630 |

### Paper

| Model | Rouge-1 | Rouge-2 | Rouge-L |
| --- | --- | --- | --- |
| *meta-llama/llama-3-8b-instruct | 0.2044 | 0.0868 | 0.1895 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2004 | 0.0860 | 0.1938 |
| meta-llama/llama-3-70b-instruct | 0.1935 | 0.0783 | 0.1836 |
| yanolja/EEVE-Korean-Instruct-2.8B-v1.0 | 0.1934 | 0.0829 | 0.1832 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.1774 | 0.0601 | 0.1684 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.1702 | 0.0561 | 0.1605 |

### RAG Q&A

| Model | Rouge-1 | Rouge-2 | Rouge-L |
| --- | --- | --- | --- |
| *meta-llama/llama-3-70b-instruct | 0.4418 | 0.2986 | 0.4297 |
| *meta-llama/llama-3-8b-instruct | 0.4391 | 0.3100 | 0.4273 |
| mistralai/mixtral-8x7b-instruct-v0-1 | 0.4022 | 0.2653 | 0.3916 |
| ibm-mistralai/mixtral-8x7b-instruct-v01-q | 0.3105 | 0.1763 | 0.2960 |
| yanolja/EEVE-Korean-Instruct-10.8B-v1.0 | 0.3191 | 0.2069 | 0.3136 |
| ryanu/EEVE-10.8-BOOK-v0.1 | 0.2185 | 0.1347 | 0.2139 |
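
The Rouge-1/2/L columns above are n-gram-overlap F-measures between generated and reference text. The exact evaluation harness and Korean tokenization behind these numbers are not documented in this card; the sketch below, using the `rouge` PyPI package with whitespace-level tokens, is only an illustrative assumption and may not reproduce the reported scores exactly.

```python
# Minimal ROUGE sketch; the actual harness and Korean tokenization are assumptions.
from rouge import Rouge

reference = "์ €์ž๋Š” ๊ธฐ์ˆ  ๋ฐœ์ „์ด ์‚ฌํšŒ ๊ตฌ์กฐ๋ฅผ ๋ฐ”๊พผ๋‹ค๊ณ  ์ฃผ์žฅํ•œ๋‹ค."        # example gold summary
prediction = "๊ธฐ์ˆ ์˜ ๋ฐœ์ „์ด ์‚ฌํšŒ ๊ตฌ์กฐ๋ฅผ ๋ณ€ํ™”์‹œํ‚จ๋‹ค๊ณ  ์ €์ž๋Š” ๋งํ•œ๋‹ค."  # example model output

# avg=True returns mean recall/precision/F1 per ROUGE variant
scores = Rouge().get_scores(prediction, reference, avg=True)
print(scores["rouge-1"]["f"], scores["rouge-2"]["f"], scores["rouge-l"]["f"])
```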

## Prompt template

๋‹ค์Œ ๋ฌธ์žฅ์„ 3~5๋ฌธ์žฅ์œผ๋กœ ๋ฐ˜๋ณต๋˜๋Š” ๊ตฌ๋ฌธ์—†์ด ํ…์ŠคํŠธ์— ์ œ์‹œ๋œ ์ฃผ์š” ๋…ผ๊ฑฐ๋ฅผ ๊ฐ„๋žตํ•˜๊ฒŒ ์š”์•ฝํ•ด์ค˜.

๋ฌธ์žฅ: {context}

์š”์•ฝ: {summary}
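
A minimal inference sketch with this template, assuming standard `transformers` text-generation usage (the dtype, `device_map`, and generation settings below are not specified in this card):

```python
# Hypothetical inference sketch; dtype, device_map, and generation settings are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ryanu/EEVE-10.8-BOOK-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

context = "์š”์•ฝํ•  ์ฑ… ๋ณธ๋ฌธ ๋‹จ๋ฝ..."  # passage to summarize

# Fill the template; the {summary} slot is left open for the model to complete
prompt = (
    "๋‹ค์Œ ๋ฌธ์žฅ์„ 3~5๋ฌธ์žฅ์œผ๋กœ ๋ฐ˜๋ณต๋˜๋Š” ๊ตฌ๋ฌธ์—†์ด "
    "ํ…์ŠคํŠธ์— ์ œ์‹œ๋œ ์ฃผ์š” ๋…ผ๊ฑฐ๋ฅผ ๊ฐ„๋žตํ•˜๊ฒŒ ์š”์•ฝํ•ด์ค˜.\n\n"
    f"๋ฌธ์žฅ: {context}\n\n"
    "์š”์•ฝ: "
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens (the summary)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```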

Model size: 10.8B parameters · Tensor type: FP16 (Safetensors)