|
--- |
|
base_model: llm-jp/llm-jp-3-13b |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- trl |
|
license: apache-2.0 |
|
language: |
|
- ja
|
--- |
|
|
|
# Uploaded model |
|
|
|
- **Developed by:** SAS3 |
|
- **License:** apache-2.0 |
|
- **Finetuned from model:** llm-jp/llm-jp-3-13b
|
|
|
This Llama-architecture model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
|
|
|
[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth) |
|
|
|
# About This Model |
|
This model is a fine-tuned version of llm-jp/llm-jp-3-13b, trained with the Unsloth library and Hugging Face's TRL (Transformer Reinforcement Learning) library. It is designed to generate responses to instructions given in Japanese.
|
|
|
# Features |
|
- **Efficient Loading:** Uses 4-bit quantization for efficient memory usage (see the minimal loading sketch after this list).
- **Customizable:** Works with any JSONL dataset containing an "input" field.
- **Easy Integration:** The provided sample code allows for quick setup and inference.
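
A minimal loading sketch (it mirrors the from_pretrained call in the Sample Usage section below; replace YOUR_HF_TOKEN with your own Hugging Face token):

```python
from unsloth import FastLanguageModel

# Load the model in 4-bit to reduce GPU memory usage; dtype=None lets Unsloth choose a suitable precision.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="SAS3/llm-jp-3-13b-it",
    max_seq_length=2048,
    dtype=None,
    load_in_4bit=True,
    token="YOUR_HF_TOKEN",  # your Hugging Face access token
)
FastLanguageModel.for_inference(model)  # switch to inference mode
```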
|
# Intended Use |
|
- **Instruction Following:** Generate responses to specific instructions or prompts.
- **Text Generation:** Suitable for applications requiring Japanese-language text generation.
|
# Limitations |
|
- **Language:** The model is fine-tuned for Japanese and may not perform well with inputs in other languages.
- **Biases:** As with any AI language model, outputs may contain biases present in the training data.
|
# How to Cite |
|
If you use this model in your research or applications, please cite it as: |
|
|
|
```text
|
SAS3/llm-jp-3-13b-it on Hugging Face |
|
``` |
|
# Contact |
|
For any questions or support, please contact SAS3. |
|
|
|
|
|
# Sample Usage |
|
Below is an example of how to use the uploaded model to generate outputs for a JSONL dataset. The code uses the Unsloth library to load the model and run inference; the generated JSONL file contains the model's output for each entry in your dataset.
|
|
|
```python |
|
# Install necessary libraries
# (install Unsloth, then replace it with the latest version from GitHub)
!pip install unsloth
!pip uninstall unsloth -y
!pip install --upgrade --no-cache-dir "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
|
|
|
from unsloth import FastLanguageModel |
|
import torch |
|
import json |
|
from tqdm import tqdm |
|
import os |
|
|
|
# Load the model and tokenizer |
|
model_name = "SAS3/llm-jp-3-13b-it" |
|
|
|
max_seq_length = 2048 |
|
dtype = None |
|
load_in_4bit = True |
|
|
|
# Replace "YOUR_HF_TOKEN" with your actual Hugging Face token |
|
HF_TOKEN = "YOUR_HF_TOKEN" |
|
|
|
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=model_name,
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
    token=HF_TOKEN,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's faster inference mode
|
|
|
# Load your dataset (replace 'your_dataset.jsonl' with your dataset file).
# Lines are accumulated until a complete JSON object has been read, so entries
# that span multiple lines are handled as well.
data = []
with open("your_dataset.jsonl", "r", encoding='utf-8') as f:
    item = ""
    for line in f:
        line = line.strip()
        item += line
        if item.endswith("}"):
            data.append(json.loads(item))
            item = ""
|
|
|
# Perform inference
results = []
for dt in tqdm(data):
    input_text = dt.get("input", "")

    # Build the instruction-style prompt
    prompt = f"""### Instruction
{input_text}
### Response
"""

    inputs = tokenizer([prompt], return_tensors="pt").to(model.device)

    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        use_cache=True,
        do_sample=False,
        repetition_penalty=1.2
    )

    # Keep only the text generated after the "### Response" marker
    prediction = tokenizer.decode(outputs[0], skip_special_tokens=True).split('\n### Response')[-1]

    results.append({
        "task_id": dt.get("task_id", ""),
        "input": input_text,
        "output": prediction
    })
|
|
|
# Save the results |
|
safe_model_name = os.path.basename(model_name) |
|
|
|
with open(f"./{safe_model_name}_output.jsonl", 'w', encoding='utf-8') as f: |
|
for result in results: |
|
json.dump(result, f, ensure_ascii=False) |
|
f.write('\n') |
|
|
|
|
|
``` |
|
|
|
# Notes |
|
### Hugging Face Token

Replace "YOUR_HF_TOKEN" in the code with your actual Hugging Face access token. You can obtain a token from your Hugging Face account settings.
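
If you prefer not to hard-code the token, a minimal sketch of reading it from an environment variable instead (the variable name HF_TOKEN here is just a convention, not something the model requires):

```python
import os

# Read the Hugging Face token from the environment instead of hard-coding it in the script.
HF_TOKEN = os.environ.get("HF_TOKEN")
if HF_TOKEN is None:
    raise RuntimeError("Set the HF_TOKEN environment variable to your Hugging Face access token.")
```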
|
|
|
### Dataset
|
|
|
- Replace 'your_dataset.jsonl' in the code with the path to your JSONL dataset file. |
|
- Ensure your dataset is in JSON Lines format, where each line is a valid JSON object. |
|
- Each JSON object should contain at least an "input" field; if available, "task_id" or other metadata can also be included. A small sketch for creating such a file follows below.
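
As a concrete illustration (the file name matches the one in the sample code; the entries themselves are placeholders), a minimal your_dataset.jsonl could be created like this:

```python
import json

# Two illustrative entries; only "input" is required, "task_id" is optional metadata.
examples = [
    {"task_id": 0, "input": "日本の首都はどこですか？"},
    {"task_id": 1, "input": "次の文章を要約してください。..."},
]

with open("your_dataset.jsonl", "w", encoding="utf-8") as f:
    for ex in examples:
        json.dump(ex, f, ensure_ascii=False)
        f.write("\n")
```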
|
### Library Installation

The code includes commands to install and upgrade the necessary libraries. If you are running the code in a Jupyter notebook or Google Colab, you can execute these commands directly.
|
|
|
### Inference Process
|
|
|
- The model and tokenizer are loaded using the Unsloth library. |
|
- For each input in your dataset, the code generates a prompt in the following format: |
|
```text
|
### Instruction |
|
{input_text} |
|
### Response |
|
``` |
|
- The model generates a response, which is then decoded and appended to the results. |
|
- The final results are saved to a JSONL file named after the model.
|
### Output Format

The generated JSONL file ({safe_model_name}_output.jsonl in the code, i.e. llm-jp-3-13b-it_output.jsonl for this model) contains the fields task_id, input, and output for each entry.
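
A minimal sketch of reading the generated file back in (the file name follows the naming convention described above):

```python
import json

# Load the generated output file and print a short preview of each entry.
with open("llm-jp-3-13b-it_output.jsonl", "r", encoding="utf-8") as f:
    for line in f:
        entry = json.loads(line)
        print(entry["task_id"], entry["output"][:80])
```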