---
language:
- ja
tags:
- causal-lm
- not-for-all-audiences
- nsfw
pipeline_tag: text-generation
---

# Hameln Japanese Mistral 7B

<img src="OIG2.FjvlnWCtZSEmMxLq.jpg" alt="drawing" style="width:512px;"/>

## Model Description

This is a 7B-parameter decoder-only Japanese language model fine-tuned on web-novel datasets. It is built on top of the base model Japanese Stable LM Base Gamma 7B (see also [Japanese Stable LM Instruct Gamma 7B](https://huggingface.co/stabilityai/japanese-stablelm-instruct-gamma-7b)).

## Usage

Ensure you are using Transformers 4.34.0 or newer.
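
To confirm the version requirement before loading the model, something like the following can be used. This is a minimal sketch using only the standard library; the helper names `parse_version` and `is_supported` are hypothetical and not part of Transformers, and pre-release suffixes such as `rc1` or `.dev0` are simply ignored.

```python
from importlib import metadata

MIN_VERSION = (4, 34, 0)  # minimum Transformers version stated in this README


def parse_version(text: str) -> tuple:
    """Turn '4.34.0' or '4.35.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    # Pad so that '4.34' compares equal to (4, 34, 0).
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def is_supported(installed: str, minimum: tuple = MIN_VERSION) -> bool:
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= minimum


if __name__ == "__main__":
    try:
        installed = metadata.version("transformers")
    except metadata.PackageNotFoundError:
        print("transformers is not installed")
    else:
        print(installed, "OK" if is_supported(installed) else "too old")
```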

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

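# Load the tokenizer and model weights from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("Elizezen/Hameln-japanese-mistral-7B")
model = AutoModelForCausalLM.from_pretrained(
    "Elizezen/Hameln-japanese-mistral-7B",
    torch_dtype="auto",  # let Transformers pick the dtype from the checkpoint
)
model.eval()

# Move the model to the GPU when one is available.
if torch.cuda.is_available():
    model = model.to("cuda")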

# Prompt: the opening of a Japanese folk tale ("Once upon a time, an old man
# and an old woman lived in a certain place. The old man went to the
# mountains to gather firewood, ...").
input_ids = tokenizer.encode(
    "むかしむかし、あるところに、おじいさんとおばあさんが住んでいました。 おじいさんは山へ柴刈りに、",
    add_special_tokens=True,
    return_tensors="pt"
)

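# Sample a continuation of up to 512 new tokens.
tokens = model.generate(
    input_ids.to(device=model.device),
    max_new_tokens=512,
    temperature=1.0,
    top_p=0.95,
    do_sample=True,
)

# Decode only the newly generated tokens, dropping the prompt.
out = tokenizer.decode(tokens[0][input_ids.shape[1]:], skip_special_tokens=True).strip()
print(out)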

"""
output example (shown here together with the prompt):
むかしむかし、あるところに、おじいさんとおばあさんが住んでいました。 おじいさんは山へ柴刈りに、おばあさんは田んぼの稲の手伝いをするなど、二人で力を合わせて楽しく暮らしていました。
ある日のこと、その地方一帯に大きな台風がやって来ました。強風に飛ばされた木や、家屋などが次々と倒れる中、幸いにもおじいさんとおばあさんの住んでいた村は無事でした。
しかし、近隣の小さな村では被害が出ていました。家屋は全壊、農作物は荒らされ、何より多くの命が失われていました。
「可哀想に……」
おばあさんは心を痛め、神様に祈りを捧げ続けました。
「天上の神様！どうか、私達人間を守って下さい！」
おばあさんの祈りが通じたのか、台風は急速に勢力を落とし、被害は最小限の内に治まりました。
"""

```
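
Because generation stops once `max_new_tokens` is reached, a continuation may end in the middle of a sentence. The following is a minimal post-processing sketch, not part of the model or of Transformers: the helper `trim_to_last_sentence` and the punctuation sets it uses are assumptions chosen for illustration.

```python
SENTENCE_ENDINGS = "。！？!?"  # assumed set of sentence-ending marks
CLOSERS = "」』）)"            # closing quotes/brackets kept with the sentence


def trim_to_last_sentence(text: str) -> str:
    """Cut text after its last sentence-ending mark.

    The text is returned unchanged when no such mark is found.
    """
    last = max(text.rfind(ch) for ch in SENTENCE_ENDINGS)
    if last == -1:
        return text
    end = last + 1
    # Keep closing quotes or brackets that directly follow the mark.
    while end < len(text) and text[end] in CLOSERS:
        end += 1
    return text[:end]
```

With the usage example above, this could be applied as `out = trim_to_last_sentence(out)` before printing.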

### Datasets

- Less than 1 GB of web novels (non-PG)
- 70 GB of web novels (PG)

### Intended Use

The primary purpose of this language model is to assist in generating novels. While it can handle various prompts, it may not perform well on instruction-style prompts. Note that the model's output is not censored, and it may occasionally generate sensitive content.