roberta_qa_japanese
(Japanese caption: 日本語の (抽出型) 質問応答のモデル, i.e. a model for Japanese extractive question answering)
This model is a fine-tuned version of rinna/japanese-roberta-base (a pre-trained RoBERTa model provided by rinna Co., Ltd.), trained for extractive question answering.
The model is fine-tuned on the JaQuAD dataset provided by Skelter Labs, in which the data is collected from Japanese Wikipedia articles and annotated by humans.
Intended uses
When running with a dedicated pipeline:
from transformers import pipeline

model_name = "tsmatz/roberta_qa_japanese"
qa_pipeline = pipeline(
    "question-answering",
    model=model_name,
    tokenizer=model_name)

result = qa_pipeline(
    question="決勝トーナメントで日本に勝ったのはどこでしたか。",
    context="日本は予選リーグで強豪のドイツとスペインに勝って決勝トーナメントに進んだが、クロアチアと対戦して敗れた。",
    align_to_words=False,
)
print(result)
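For reference, the question-answering pipeline returns a plain dict; a minimal way to pick out its fields (these are the standard pipeline output keys, not specific to this model):

answer_text = result["answer"]   # extracted answer span as a string
answer_score = result["score"]   # confidence score of the predicted span
print(answer_text, answer_score)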
When manually running through the forward pass:
import torch
import numpy as np
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_name = "tsmatz/roberta_qa_japanese"
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def inference_answer(question, context):
    # Tokenize the question/context pair into a single feature
    test_feature = tokenizer(
        question,
        context,
        max_length=318,
    )
    # Forward pass to get start/end logits for the answer span
    with torch.no_grad():
        outputs = model(torch.tensor([test_feature["input_ids"]]))
    start_logits = outputs.start_logits.cpu().numpy()
    end_logits = outputs.end_logits.cpu().numpy()
    # Take the most likely start/end positions and decode the span tokens
    answer_ids = test_feature["input_ids"][np.argmax(start_logits):np.argmax(end_logits)+1]
    return "".join(tokenizer.batch_decode(answer_ids))

question = "決勝トーナメントで日本に勝ったのはどこでしたか。"
context = "日本は予選リーグで強豪のドイツとスペインに勝って決勝トーナメントに進んだが、クロアチアと対戦して敗れた。"
answer_pred = inference_answer(question, context)
print(answer_pred)
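Note that taking independent argmax over the start and end logits can occasionally yield a span whose end precedes its start. As an optional post-processing sketch (not part of the original card), the two np.argmax calls above could be replaced by a search over valid (start, end) pairs; best_span and its max_answer_length parameter are illustrative names:

def best_span(start_logits, end_logits, max_answer_length=30):
    # Score every valid (start, end) pair and return the token indices of the
    # highest-scoring span whose end does not precede its start.
    start_logits = np.asarray(start_logits).reshape(-1)
    end_logits = np.asarray(end_logits).reshape(-1)
    best, best_score = (0, 0), -np.inf
    for s in range(len(start_logits)):
        for e in range(s, min(s + max_answer_length, len(end_logits))):
            score = start_logits[s] + end_logits[e]
            if score > best_score:
                best_score, best = score, (s, e)
    return best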
Training procedure
You can download the source code for fine-tuning from here.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 7e-05
- train_batch_size: 2
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 3
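For illustration only, these values map onto transformers.TrainingArguments roughly as follows. This is a sketch assuming a standard Trainer setup, not the author's actual fine-tuning script (linked above); output_dir is a hypothetical placeholder:

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./roberta_qa_japanese",  # hypothetical output directory
    learning_rate=7e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,      # 2 x 16 = total train batch size of 32
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=3,
)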
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
2.1293 | 0.13 | 150 | 1.0311 |
1.1965 | 0.26 | 300 | 0.6723 |
1.022 | 0.39 | 450 | 0.4838 |
0.9594 | 0.53 | 600 | 0.5174 |
0.9187 | 0.66 | 750 | 0.4671 |
0.8229 | 0.79 | 900 | 0.4650 |
0.71 | 0.92 | 1050 | 0.2648 |
0.5436 | 1.05 | 1200 | 0.2665 |
0.5045 | 1.19 | 1350 | 0.2686 |
0.5025 | 1.32 | 1500 | 0.2082 |
0.5213 | 1.45 | 1650 | 0.1715 |
0.4648 | 1.58 | 1800 | 0.1563 |
0.4698 | 1.71 | 1950 | 0.1488 |
0.4823 | 1.84 | 2100 | 0.1050 |
0.4482 | 1.97 | 2250 | 0.0821 |
0.2755 | 2.11 | 2400 | 0.0898 |
0.2834 | 2.24 | 2550 | 0.0964 |
0.2525 | 2.37 | 2700 | 0.0533 |
0.2606 | 2.5 | 2850 | 0.0561 |
0.2467 | 2.63 | 3000 | 0.0601 |
0.2799 | 2.77 | 3150 | 0.0562 |
0.2497 | 2.9 | 3300 | 0.0516 |
Framework versions
- Transformers 4.23.1
- Pytorch 1.12.1+cu102
- Datasets 2.6.1
- Tokenizers 0.13.1