Fill-Mask · Transformers · Safetensors · Japanese · modernbert

Commit d6a6844 (verified) · 1 Parent(s): 1cf964c · committed by speed

Update README.md

Files changed (1):
  1. README.md +33 -0
README.md CHANGED
@@ -11,6 +11,39 @@ This model is based on the [modernBERT-base](https://arxiv.org/abs/2412.13663) a
 It was trained using the Japanese subset (3.4TB) of the llm-jp-corpus v4 and supports a max sequence length of 8192.
 
 
+## Usage
+
+Please install the transformers library.
+```bash
+pip install "transformers>=4.48.0"
+```
+
+If your GPU supports FlashAttention 2, installing flash-attn is recommended.
+```bash
+pip install flash-attn --no-build-isolation
+```
+
+Using AutoModelForMaskedLM:
+```python
+from transformers import AutoTokenizer, AutoModelForMaskedLM
+
+model_id = "speed/llm-jp-modernbert-base-v4-ja-stage2-200k"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+model = AutoModelForMaskedLM.from_pretrained(model_id)
+
+text = "日本の首都は<MASK|LLM-jp>です。"
+inputs = tokenizer(text, return_tensors="pt")
+outputs = model(**inputs)
+
+# To get predictions for the mask:
+masked_index = inputs["input_ids"][0].tolist().index(tokenizer.mask_token_id)
+predicted_token_id = outputs.logits[0, masked_index].argmax(dim=-1)
+predicted_token = tokenizer.decode(predicted_token_id)
+print("Predicted token:", predicted_token)
+# Predicted token: 東京
+```
+
+
 ## Training
 
 This model was trained with a max_seq_len of 1024 in stage 1, and then with a max_seq_len of 8192 in stage 2.
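
The core of the added Usage snippet — locating the mask position in the input ids, then taking the argmax over that row of logits — can be illustrated without downloading the model. A minimal pure-Python sketch with a toy vocabulary, ids, and logits (all values here are made up for illustration; the real tokenizer and model outputs will differ):

```python
# Toy stand-ins for the tokenizer and model (illustrative only).
vocab = ["[PAD]", "<MASK|LLM-jp>", "東京", "大阪", "です"]
mask_token_id = 1

# Pretend token ids for the masked sentence; position 1 holds the mask.
input_ids = [3, 1, 4]

# Fake logits, one row per token position; row 1 scores "東京" (id 2) highest.
logits = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.1, 0.0, 5.0, 1.2, 0.3],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]

# Same logic as the README snippet: find the mask, argmax over the vocab axis.
masked_index = input_ids.index(mask_token_id)
row = logits[masked_index]
predicted_token_id = max(range(len(row)), key=row.__getitem__)
print("Predicted token:", vocab[predicted_token_id])  # Predicted token: 東京
```

With the real model, `outputs.logits` plays the role of `logits` here, and `tokenizer.decode` maps the winning id back to text.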