Couplet (对联)

Model description

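AI-based Chinese couplet generation: given the first line of a couplet (上联), the model generates the second line (下联).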

How to use

Call the model through a pipeline:

>>> from transformers import BertTokenizer, GPT2LMHeadModel, TextGenerationPipeline

>>> # Call the fine-tuned model; the prompt is the first line followed by " -"
>>> senc = "燕子归来,问昔日雕梁何处。 -"
>>> model_id = "couplet-gpt2-finetuning"

>>> tokenizer = BertTokenizer.from_pretrained(model_id)
>>> model = GPT2LMHeadModel.from_pretrained(model_id)
>>> text_generator = TextGenerationPipeline(model, tokenizer)
>>> # Use the end-of-sequence token as the padding token during generation
>>> text_generator.model.config.pad_token_id = text_generator.model.config.eos_token_id
>>> # do_sample=True samples randomly, so the output varies between runs
>>> text_generator(senc, max_length=25, do_sample=True)
[{'generated_text': '燕子归来,问昔日雕梁何处。 - 风 儿 吹 醒 , 叹 今 朝 烟 雨 无'}]
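
The pipeline returns the prompt and the continuation in one string, with the generated characters separated by spaces. A minimal sketch of pulling out just the second line is shown below; the helper name `extract_second_line` is hypothetical and not part of the model or of `transformers`, and it assumes the prompt ends with the " - " separator used in the example above.

```python
def extract_second_line(generated_text, separator=" - "):
    """Return the text after the separator with all whitespace removed.

    Hypothetical helper for illustration; returns "" if the separator
    is not present in the generated text.
    """
    _, found, continuation = generated_text.partition(separator)
    if not found:
        return ""
    # The tokenizer decodes one character per token, joined by spaces.
    return "".join(continuation.split())


# Output copied from the pipeline example in this card.
outputs = [{'generated_text': '燕子归来,问昔日雕梁何处。 - 风 儿 吹 醒 , 叹 今 朝 烟 雨 无'}]
print(extract_second_line(outputs[0]["generated_text"]))  # 风儿吹醒,叹今朝烟雨无
```

Because `max_length` caps the total token count, the continuation may be cut off before the second line is complete, as in this example.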

Here is how to load the tokenizer and the model directly in PyTorch:

from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("supermy/couplet")
model = AutoModelForCausalLM.from_pretrained("supermy/couplet")

Training data

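The training data is based on couplet-dataset, which contains about 700,000 couplets. The data was filtered against a sensitive-word lexicon to remove vulgar or sensitive content; about 740,000 couplets remain after filtering.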
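
The lexicon-based filtering described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the dataset author's actual script: the function name `filter_couplets`, the `(first_line, second_line)` pair layout, and the placeholder lexicon entry are all hypothetical.

```python
def filter_couplets(couplets, sensitive_words):
    """Keep only couplets in which neither line contains a lexicon entry.

    couplets: iterable of (first_line, second_line) string pairs (assumed layout).
    sensitive_words: iterable of strings to match as substrings.
    """
    # Drop empty entries, since "" is a substring of every string.
    lexicon = [word for word in sensitive_words if word]
    kept = []
    for first_line, second_line in couplets:
        # Check each line separately to avoid matches across the line boundary.
        if any(word in first_line or word in second_line for word in lexicon):
            continue
        kept.append((first_line, second_line))
    return kept


# Toy data; "坏词" stands in for a real lexicon entry.
sample = [("春风得意", "秋月无边"), ("坏词上联", "普通下联")]
print(filter_couplets(sample, ["坏词"]))  # [('春风得意', '秋月无边')]
```

Plain substring matching is the simplest choice; a large lexicon may call for a faster multi-pattern matcher.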

Statistics


Training procedure

Model: GPT2. Training environment: NVIDIA GPU with 16 GB of memory.

BPE tokenization: "vocab_size" = 50000

[INFO|trainer.py:1608] 2022-12-07 02:32:58,307 >> ***** Running training *****
[INFO|trainer.py:1609] 2022-12-07 02:32:58,307 >>   Num examples = 260926
[INFO|trainer.py:1610] 2022-12-07 02:32:58,307 >>   Num Epochs = 160
[INFO|trainer.py:1611] 2022-12-07 02:32:58,307 >>   Instantaneous batch size per device = 96
[INFO|trainer.py:1612] 2022-12-07 02:32:58,307 >>   Total train batch size (w. parallel, distributed & accumulation) = 96
[INFO|trainer.py:1613] 2022-12-07 02:32:58,307 >>   Gradient Accumulation steps = 1
[INFO|trainer.py:1614] 2022-12-07 02:32:58,307 >>   Total optimization steps = 434880
[INFO|trainer.py:1616] 2022-12-07 02:32:58,308 >>   Number of trainable parameters = 124439808
[INFO|trainer.py:1637] 2022-12-07 02:32:58,309 >>   Continuing training from checkpoint, will skip to saved global_step
[INFO|trainer.py:1638] 2022-12-07 02:32:58,310 >>   Continuing training from epoch 93
[INFO|trainer.py:1639] 2022-12-07 02:32:58,310 >>   Continuing training from global step 253500


[INFO|trainer.py:1608] 2022-11-30 12:51:36,357 >> ***** Running training *****
[INFO|trainer.py:1609] 2022-11-30 12:51:36,357 >>   Num examples = 260926
[INFO|trainer.py:1610] 2022-11-30 12:51:36,357 >>   Num Epochs = 81
[INFO|trainer.py:1611] 2022-11-30 12:51:36,357 >>   Instantaneous batch size per device = 96
[INFO|trainer.py:1612] 2022-11-30 12:51:36,357 >>   Total train batch size (w. parallel, distributed & accumulation) = 96
[INFO|trainer.py:1613] 2022-11-30 12:51:36,357 >>   Gradient Accumulation steps = 1
[INFO|trainer.py:1614] 2022-11-30 12:51:36,357 >>   Total optimization steps = 220158
[INFO|trainer.py:1616] 2022-11-30 12:51:36,358 >>   Number of trainable parameters = 124439808
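
The step counts in the two log headers above follow from the example count, batch size, and epoch count. The sketch below reproduces that arithmetic; it assumes a single device and that the last partial batch of each epoch is kept, and the function name `total_optimization_steps` is hypothetical.

```python
import math


def total_optimization_steps(num_examples, batch_size, num_epochs, grad_accum_steps=1):
    """Estimate the number of optimizer updates for a training run."""
    # Batches per epoch, counting the final partial batch.
    batches_per_epoch = math.ceil(num_examples / batch_size)
    # One optimizer update every grad_accum_steps batches.
    updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    return updates_per_epoch * num_epochs


steps_per_epoch = total_optimization_steps(260926, 96, 1)
print(steps_per_epoch)                             # 2718
print(total_optimization_steps(260926, 96, 81))    # 220158
print(total_optimization_steps(260926, 96, 160))   # 434880
# The resumed run restarts at global step 253500, which falls in epoch 93.
print(253500 // steps_per_epoch)                   # 93
```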

{'loss': 6.1104, 'learning_rate': 4.9888034956712906e-05, 'epoch': 0.18}
{'loss': 5.5855, 'learning_rate': 4.977448014607691e-05, 'epoch': 0.37}
{'loss': 5.3264, 'learning_rate': 4.966092533544091e-05, 'epoch': 0.55}
......
......
......
{'loss': 2.8539, 'learning_rate': 5.677740531799889e-08, 'epoch': 80.94}
{'train_runtime': 146835.0563, 'train_samples_per_second': 143.937, 'train_steps_per_second': 1.499, 'train_loss': 3.1762605669072217, 'epoch': 81.0}
***** train metrics *****
  epoch                    =               81.0
  train_loss               =             3.1763
  train_runtime            = 1 day, 16:47:15.05
  train_samples            =             260926
  train_samples_per_second =            143.937
  train_steps_per_second   =              1.499
12/02/2022 05:38:54 - INFO - __main__ - *** Evaluate ***
[INFO|trainer.py:2929] 2022-12-02 05:38:54,688 >> ***** Running Evaluation *****
[INFO|trainer.py:2931] 2022-12-02 05:38:54,688 >>   Num examples = 1350
[INFO|trainer.py:2934] 2022-12-02 05:38:54,688 >>   Batch size = 96
100%|██████████| 15/15 [00:03<00:00,  4.20it/s]
[INFO|modelcard.py:449] 2022-12-02 05:38:59,875 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}, 'metrics': [{'name': 'Accuracy', 'type': 'accuracy', 'value': 0.4447501469723692}]}
***** eval metrics *****
  epoch                   =       81.0
  eval_accuracy           =     0.4448
  eval_loss               =     3.2813
  eval_runtime            = 0:00:03.86
  eval_samples            =       1350
  eval_samples_per_second =    349.505
  eval_steps_per_second   =      3.883
  perplexity              =    26.6108


{'loss': 3.0967, 'learning_rate': 1.8027961736571009e-07, 'epoch': 159.49}
{'loss': 3.0922, 'learning_rate': 1.227924944812362e-07, 'epoch': 159.68}
{'loss': 3.0934, 'learning_rate': 6.530537159676233e-08, 'epoch': 159.86}
{'train_runtime': 120967.2394, 'train_samples_per_second': 345.12, 'train_steps_per_second': 3.595, 'train_loss': 1.3456422273861828, 'epoch': 160.0}
***** train metrics *****
  epoch                    =             160.0
  train_loss               =            1.3456
  train_runtime            = 1 day, 9:36:07.23
  train_samples            =            260926
  train_samples_per_second =            345.12
  train_steps_per_second   =             3.595
12/08/2022 12:09:08 - INFO - __main__ - *** Evaluate ***
[INFO|trainer.py:2929] 2022-12-08 12:09:08,522 >> ***** Running Evaluation *****
[INFO|trainer.py:2931] 2022-12-08 12:09:08,522 >>   Num examples = 1350
[INFO|trainer.py:2934] 2022-12-08 12:09:08,522 >>   Batch size = 96
100%|██████████| 15/15 [00:03<00:00,  4.16it/s]
[INFO|modelcard.py:449] 2022-12-08 12:09:13,448 >> Dropping the following result as it does not have all the necessary fields:
{'task': {'name': 'Causal Language Modeling', 'type': 'text-generation'}, 'metrics': [{'name': 'Accuracy', 'type': 'accuracy', 'value': 0.433615520282187}]}
***** eval metrics *****
  epoch                   =      160.0
  eval_accuracy           =     0.4336
  eval_loss               =     3.3005
  eval_runtime            = 0:00:03.93
  eval_samples            =       1350
  eval_samples_per_second =    343.164
  eval_steps_per_second   =      3.813
  perplexity              =    27.1257
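
The perplexity values in the two evaluation blocks above are consistent with the exponential of the reported eval_loss, the usual way to derive perplexity from a mean cross-entropy loss. A short check is sketched below; the losses are the rounded values from the logs, so the results agree with the logged perplexities only to about three decimal places.

```python
import math


def perplexity(eval_loss):
    """Perplexity as the exponential of the mean cross-entropy loss."""
    return math.exp(eval_loss)


# Rounded eval_loss values from the logs, with the logged perplexity alongside.
for label, loss, logged in [("epoch 81", 3.2813, 26.6108), ("epoch 160", 3.3005, 27.1257)]:
    print(f"{label}: computed {perplexity(loss):.3f}, logged {logged}")
```

By this measure the epoch-160 checkpoint scores slightly worse on the evaluation set (higher loss, lower accuracy) than the epoch-81 checkpoint.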