---
language:
- ja
tags:
- causal-lm
- not-for-all-audiences
- nsfw
pipeline_tag: text-generation
---

# Hameln Japanese Mistral 7B

<img src="OIG2.FjvlnWCtZSEmMxLq.jpg" alt="drawing" style="width:512px;"/>

## Model Description

This is a 7B-parameter decoder-only Japanese language model fine-tuned on web novel datasets. It is built on top of the base model Japanese Stable LM Base Gamma 7B. [Japanese Stable LM Instruct Gamma 7B](https://huggingface.co/stabilityai/japanese-stablelm-instruct-gamma-7b)

## Usage

Ensure you are using Transformers 4.34.0 or newer.
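A quick way to confirm the version requirement before loading the model is sketched below. The helper names `release_tuple` and `meets_minimum` are hypothetical and exist only for this illustration. The comparison looks only at the numeric release components, so a pre-release such as `4.34.0.dev0` is treated the same as `4.34.0`, which is a simplification.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum version stated in this model card.
MIN_TRANSFORMERS = "4.34.0"


def release_tuple(version_string: str) -> tuple:
    """Return the leading numeric release parts, padded to three components.

    Hypothetical helper: '4.35.0.dev0' -> (4, 35, 0), '4.34' -> (4, 34, 0).
    Pre-release and local suffixes are ignored (a simplification).
    """
    match = re.match(r"\d+(?:\.\d+)*", version_string)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_string!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (3 - len(parts))  # pad so '4.34' compares equal to '4.34.0'
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_TRANSFORMERS) -> bool:
    """Return True when the installed release is at least the minimum release."""
    return release_tuple(installed) >= release_tuple(minimum)


if __name__ == "__main__":
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        print("transformers is not installed")
    else:
        status = "OK" if meets_minimum(installed) else "too old"
        print(f"transformers {installed}: {status}")
```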

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

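# Load the tokenizer and the model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("Elizezen/Hameln-japanese-mistral-7B")
model = AutoModelForCausalLM.from_pretrained(
    "Elizezen/Hameln-japanese-mistral-7B",
    torch_dtype="auto",  # take the dtype from the checkpoint instead of forcing float32
)
model.eval()

# Move the model to the GPU when one is available.
if torch.cuda.is_available():
    model = model.to("cuda")

# Tokenize the opening of a story; the model continues the text from here.
input_ids = tokenizer.encode(
    "むかしむかし、あるところに、おじいさんとおばあさんが住んでいました。 おじいさんは山へ柴刈りに、",
    add_special_tokens=True,
    return_tensors="pt",
)

# Sample up to 512 new tokens with nucleus (top-p) sampling.
tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=512,
    temperature=1,
    top_p=0.95,
    do_sample=True,
)

# Decode only the newly generated tokens, dropping the prompt.
out = tokenizer.decode(tokens[0][input_ids.shape[1]:], skip_special_tokens=True).strip()
print(out)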

"""
output example (shown with the prompt prepended; the code above prints only the continuation):
むかしむかし、あるところに、おじいさんとおばあさんが住んでいました。 おじいさんは山へ柴刈りに、おばあさんは田んぼの稲の手伝いをするなど、二人で力を合わせて楽しく暮らしていました。
ある日のこと、その地方一帯に大きな台風がやって来ました。強風に飛ばされた木や、家屋などが次々と倒れる中、幸いにもおじいさんとおばあさんの住んでいた村は無事でした。
しかし、近隣の小さな村では被害が出ていました。家屋は全壊、農作物は荒らされ、何より多くの命が失われていました。
「可哀想に……」
おばあさんは心を痛め、神様に祈りを捧げ続けました。
「天上の神様！どうか、私達人間を守って下さい！」
おばあさんの祈りが通じたのか、台風は急速に勢力を落とし、被害は最小限の内に治まりました。
"""

```
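Because generation stops once `max_new_tokens` is reached, the returned text can end in the middle of a sentence. The following is a minimal sketch of one way to trim it back to the last complete sentence. The helper `trim_to_last_sentence` and the set of ending characters are assumptions made for this illustration; they are not part of the model or of Transformers.

```python
# Characters treated as sentence endings in Japanese prose (an assumption;
# extend this string to suit your text).
SENTENCE_ENDINGS = "。！？」"


def trim_to_last_sentence(text: str) -> str:
    """Cut the text after its last sentence-ending character.

    Hypothetical helper: returns the text unchanged when no ending
    character is found, so nothing is silently discarded.
    """
    last = max(text.rfind(ch) for ch in SENTENCE_ENDINGS)
    if last == -1:
        return text
    return text[: last + 1]
```

With the usage example above, this could be applied as `print(trim_to_last_sentence(out))`.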

### Datasets

- Less than 1 GB of web novels (non-PG)
- 70 GB of web novels (PG)

### Intended Use

The primary purpose of this language model is to assist in writing novels. It can handle a variety of prompts, but it may not perform well on instruction-style requests. The model's output is not censored, and it may occasionally generate sensitive content.