metadata
library_name: transformers
license: cc-by-nc-4.0
base_model:
- anthracite-org/magnum-v4-12b
- mistralai/Mistral-Nemo-Instruct-2407
- werty1248/Mistral-Nemo-NT-Ko-12B-dpo
tags:
- mergekit
- merge
language:
- ko
- en
spow12/MK_Nemo_12B
Model Description
This model is a Supervised fine-tuned version of mistralai/Mistral-Nemo-Instruct-2407 with DeepSpeed and trl for korean.
Merge methods.
models:
- model: anthracite-org/magnum-v4-12b
- model: mistralai/Mistral-Nemo-Instruct-2407
- model: spow12/Mistral-Nemo-Instruct-2407_sft_ver_4.4(private)
- model: werty1248/Mistral-Nemo-NT-Ko-12B-dpo
merge_method: model_stock
base_model: spow12/Mistral-Nemo-Instruct-2407_sft_ver_4.4(private)
dtype: bfloat16
Trained Data
- Trained with public, private data (about 130K)
Usage
from transformers import TextStreamer, pipeline, AutoTokenizer, AutoModelForCausalLM
model_id = 'spow12/MK_Nemo_12B'
tokenizer = AutoTokenizer.from_pretrained(model_id)
# %%
model = AutoModelForCausalLM.from_pretrained(
model_id,
torch_dtype=torch.bfloat16,
attn_implementation="flash_attention_2", #Optional
device_map='auto',
)
model.eval()
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer, device_map='auto')
generation_configs = dict(
max_new_tokens=2048,
num_return_sequences=1,
temperature=0.75,
# repetition_penalty=1.1,
do_sample=True,
top_k=20,
top_p=0.9,
min_p=0.1,
eos_token_id=tokenizer.eos_token_id,
pad_token_id=tokenizer.eos_token_id,
streamer = TextStreamer(tokenizer) # Optional, if you want to use streamer, you have to set num_beams=1
)
sys_message = """λΉμ μ μΉμ ν μ±λ΄μΌλ‘μ μλλ°©μ μμ²μ μ΅λν μμΈνκ³ μΉμ νκ² λ΅ν΄μΌν©λλ€.
μ¬μ©μκ° μ 곡νλ μ 보λ₯Ό μΈμ¬νκ² λΆμνμ¬ μ¬μ©μμ μλλ₯Ό μ μνκ² νμ
νκ³ κ·Έμ λ°λΌ λ΅λ³μ μμ±ν΄μΌν©λλ€.
νμ λ§€μ° μμ°μ€λ¬μ΄ νκ΅μ΄λ‘ μλ΅νμΈμ."""
message = [
{
'role': "system",
'content': sys_message
},
{
'role': 'user',
'content': "νμ¬μ κ²½μ μν©μ λν΄ μ΄λ»κ² μκ°ν΄?."
}
]
conversation = pipe(message, **generation_configs)
conversation[-1]
#output
νμ¬μ κ²½μ μν©μ κ°κ΅λ§λ€ λ€λ₯΄λ©°, μ λ°μ μΌλ‘λ μ½λ‘λ19 ν¬λ°λ―Ήμ μν₯μΌλ‘ ν° ν격μ λ°μ μνμ
λλ€. λ§μ κ΅κ°μμ κ²½μ μ±μ₯λ₯ μ΄ κ°μνκ³ μ€μ
λ₯ μ΄ μμΉνμ΅λλ€. κ·Έλ¬λ κ°κ΅ μ λΆλ μ¬μ κ³Ό ν΅ν μ μ±
μ ν΅ν΄ κ²½μ λ₯Ό μ§μ§νκ³ λ³΅κ΅¬νκΈ° μν΄ λ
Έλ ₯νκ³ μμ΅λλ€. μ½λ‘λ19 λ°±μ μ κ°λ°κ³Ό λ°°ν¬κ° κ²½μ ν볡μ λμμ΄ λ κ²μΌλ‘ κΈ°λλκ³ μμ΅λλ€. κ·Έλ¬λ μ½λ‘λ19 μ΄μ μ κ²½μ μ±μ₯λ₯ μ ν볡νκΈ° μν΄μλ μκ°μ΄ 걸릴 μ μμ΅λλ€. μ₯κΈ°μ μΌλ‘λ μ μ±μ₯κ³Ό κ³ μΈνλ μ΄μ
μ΄ κ³μλ μ μλ μνλ μμ΅λλ€. λ°λΌμ κ°κ΅μ μ½λ‘λ19 μ΄νμ μΈκ³μμ μλ‘μ΄ κ²½μ λͺ¨λΈμ λͺ¨μνκ³ , λμ§νΈνμ λ
Ήμ κ²½μ μ νμ κ°μννλ λ± λ―Έλμ λλΉνλ λ
Έλ ₯μ΄ νμν©λλ€.