|
---
license: apache-2.0
language:
- zh
library_name: transformers
pipeline_tag: text-generation
inference:
  parameters:
    temperature: 0.7
    top_p: 0.6
    repetition_penalty: 1.1
    max_new_tokens: 128
    num_return_sequences: 3
    do_sample: true
tags:
- art
widget:
- text: 笔底江山助磅礴
- text: (唐诗:秋思)诗词
- text: (宋词:浣溪沙)秋
- text: (对联)冬
---
|
|
|
# Chinese Poem and Couplet Small GPT2 Model
|
|
|
## Model description |
|
|
|
The model generates classical Chinese poems and couplets. It is based on [IDEA-CCNL/Wenzhong-GPT2-110M](https://huggingface.co/IDEA-CCNL/Wenzhong-GPT2-110M).
|
|
|
|
|
## How to use |
|
|
|
You can use the model directly with a pipeline for text generation: |
|
|
|
By default the pipeline decodes with `skip_special_tokens=True`, so special tokens are stripped from the output:
|
|
|
```python |
|
>>> from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline
>>> # the model uses a BERT-style vocabulary, so load BertTokenizer rather than GPT2Tokenizer
>>> tokenizer = BertTokenizer.from_pretrained("snzhang/GPT2-Poem-Small")
>>> model = GPT2LMHeadModel.from_pretrained("snzhang/GPT2-Poem-Small")
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> text_generator("笔底江山助磅礴", max_length=50, do_sample=True)
[{'generated_text': '笔底江山助磅礴,万卷诗书见成章。'}]
|
``` |
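
If you need finer control over decoding, a minimal sketch calling `generate` and `decode` directly; passing `skip_special_tokens=False` instead would keep special tokens in the output (the exact markers, e.g. `[CLS]`/`[SEP]`, are an assumption based on the BERT-style tokenizer):

```python
>>> from transformers import BertTokenizer, GPT2LMHeadModel
>>> tokenizer = BertTokenizer.from_pretrained("snzhang/GPT2-Poem-Small")
>>> model = GPT2LMHeadModel.from_pretrained("snzhang/GPT2-Poem-Small")
>>> inputs = tokenizer("笔底江山助磅礴", return_tensors="pt")
>>> output_ids = model.generate(**inputs, max_length=50, do_sample=True)
>>> # skip_special_tokens=True strips special tokens from the decoded string
>>> tokenizer.decode(output_ids[0], skip_special_tokens=True)
```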
|
|
|
You can add the prefix "(唐诗:your title)" for Tang poems, "(宋词:your title)" for Song lyrics, or "(对联)" for couplets to make generation more precise, as shown below.
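
For example, reusing the `text_generator` pipeline from above with the widget prompts (outputs will vary because sampling is enabled):

```python
>>> # Song lyric to the tune 浣溪沙, starting from 秋 ("autumn")
>>> text_generator("(宋词:浣溪沙)秋", max_length=50, do_sample=True)
>>> # Couplet starting from 冬 ("winter")
>>> text_generator("(对联)冬", max_length=50, do_sample=True)
```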
|
|
|
## Training data |
|
|
|
The training data contains 71,334 classical Chinese poems and couplets collected from [Chinese Poetry](https://github.com/chinese-poetry/chinese-poetry) and the [Couplet Dataset](https://github.com/wb14123/couplet-dataset).
|
|
|
## More details
|
|
|
You can find more details in the [GPT2-Poem-Small](https://github.com/h7nian/GPT2-Poem-Small) repository.
|
|