# Uploaded model
- Developed by: kmagai
- License: apache-2.0
- Finetuned from model: llm-jp/llm-jp-3-13b
This model was trained 2x faster with Unsloth and Hugging Face's TRL library.
## JSONL Output Process
### Model Inference Setup

The model is loaded with 4-bit quantization (QLoRA-style NF4 via bitsandbytes):
```python
import json
import os

import torch
from tqdm import tqdm
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Model repository and access token (the environment-variable lookup is an assumption;
# model_name and HF_TOKEN were left undefined in the original snippet)
model_name = "kmagai/llm-jp-3-13b-finetune-2"
HF_TOKEN = os.environ["HF_TOKEN"]

# QLoRA config for 4-bit quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=False,
)

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    token=HF_TOKEN,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True, token=HF_TOKEN)
```
### Input Data Processing

The script reads input data from a JSONL file (`elyza-tasks-100-TV_0.jsonl`). Each line contains a JSON object with task information:
```python
datasets = []
with open("./elyza-tasks-100-TV_0.jsonl", "r") as f:
    item = ""
    for line in f:
        line = line.strip()
        item += line
        # Accumulate lines until a complete JSON object is closed
        if item.endswith("}"):
            datasets.append(json.loads(item))
            item = ""
```
### Generation Process

For each input in the dataset, the script:
- Formats the prompt with the instruction template
- Tokenizes the input
- Generates a response using the model
- Decodes the output (stripping the prompt tokens)
- Creates a result object with `task_id`, `input`, and `output`
```python
results = []
for data in tqdm(datasets):
    input = data["input"]
    prompt = f"""### Instruction
{input}
### Response:
"""
    tokenized_input = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt").to(model.device)
    with torch.no_grad():
        outputs = model.generate(
            tokenized_input,
            max_new_tokens=100,
            do_sample=False,
            repetition_penalty=1.2,
        )[0]
    # Slice off the prompt tokens so only the newly generated text is kept
    output = tokenizer.decode(outputs[tokenized_input.size(1):], skip_special_tokens=True)
    results.append({"task_id": data["task_id"], "input": input, "output": output})
```
### Generation Parameters

- `max_new_tokens=100`: Maximum number of new tokens to generate
- `do_sample=False`: Deterministic (greedy) generation; the same input always yields the same output
- `repetition_penalty=1.2`: Penalizes repetition in the generated text
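Because `do_sample=False` performs greedy decoding, repeated runs produce identical output. A sampled alternative is sketched below; the `temperature` and `top_p` values are assumptions, not settings taken from this card:

```python
# Hypothetical sampled decoding; temperature/top_p values are assumptions
with torch.no_grad():
    outputs = model.generate(
        tokenized_input,
        max_new_tokens=100,
        do_sample=True,        # sample from the token distribution instead of greedy decoding
        temperature=0.7,
        top_p=0.9,
        repetition_penalty=1.2,
    )[0]
```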
## Output Format

The generated responses are saved in a JSONL file with the following format:

```json
{"task_id": "task_1", "input": "input text", "output": "generated response"}
```
Required fields:
- `task_id`: Unique identifier for the task
- `output`: Response generated by the model

Optional fields:
- `input`: Input text (can be omitted in the submission)
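The card does not show the writing step itself; a minimal sketch, assuming the `results` list from the generation loop and an output file named `results.jsonl` (the file name is an assumption):

```python
# Write one JSON object per line; ensure_ascii=False preserves Japanese text
with open("results.jsonl", "w", encoding="utf-8") as f:
    for result in results:
        f.write(json.dumps(result, ensure_ascii=False) + "\n")
```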
## Training Data Format

The training data should be provided in JSONL (JSON Lines) format, where each line is a single JSON object containing the following fields:
```json
{
  "instruction": "Task instruction text",
  "input": "Input text (optional)",
  "output": "Expected output text"
}
```
### Fields Description

- `instruction`: Task instruction that tells the model what to do
- `input`: (Optional) Input text that provides specific context for the instruction
- `output`: Expected output that represents the ideal response
### Example

```
{"instruction": "以下の文章を要約してください。", "input": "人工知能(AI)は、人間の知能を模倣し、学習、推論、判断などを行うコンピュータシステムです。近年、機械学習や深層学習の発展により、画像認識、自然言語処理、ゲームなど様々な分野で人間に匹敵する、あるいは人間を超える性能を示しています。", "output": "AIは人間の知能を模倣するコンピュータシステムで、機械学習の発展により多くの分野で高い性能を示している。"}
{"instruction": "次の英文を日本語に翻訳してください。", "input": "Artificial Intelligence is transforming the way we live and work.", "output": "人工知能は私たちの生活と仕事の仕方を変革しています。"}
```