--- license: apache-2.0 --- T5-base fine-tuned on CNN/DM Summarization dataset. Training args: ``` { "learning_rate": 0.0001, "logging_steps": 5000, "lr_scheduler_type": "cosine", "num_train_epochs": 2, "per_device_train_batch_size": 16, # total batch size of 48 "save_total_limit": 1, "weight_decay": 0.1 } ``` Generation kwargs: ``` { "do_sample": true, "max_new_tokens": 100, "min_length": 50, "temperature": 0.7, "top_k": 0 }, ```` Pre-processing: Append prompt with prefix "Summarize: " Post-processing: None Test split metrics: ``` {"lexical/meteor": 0.30857827917561603, "lexical/rouge_rouge1": 0.41099971702474514, "lexical/rouge_rouge2": 0.17676173608661166, "lexical/rouge_rougeL": 0.2759112075051335, "lexical/rouge_rougeLsum": 0.34316108028094616, "lexical/bleu": 0.10747816852428271, "semantic/bert_score": 0.8760301497472277} ```