metadata
language:
  - ms
tags:
  - paraphrase
metrics:
  - sacrebleu

finetune-paraphrase-t5-tiny-standard-bahasa-cased

T5 tiny finetuned on Malay (MS) paraphrase tasks.

Datasets

  1. translated PAWS, https://huggingface.co/datasets/mesolitica/translated-PAWS
  2. translated MRPC, https://huggingface.co/datasets/mesolitica/translated-MRPC
  3. translated ParaSCI, https://huggingface.co/datasets/mesolitica/translated-paraSCI

Finetune details

  1. Finetuned on a single RTX 3090 Ti.

Finetuning scripts are available at https://github.com/huseinzol05/malaya/tree/master/session/paraphrase/hf-t5

Supported prefix

  1. parafrasa: {string}, to paraphrase Malay (MS) text.
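
The prefix above can be applied roughly as follows. This is a minimal sketch, not the author's reference usage: the repository id `mesolitica/finetune-paraphrase-t5-tiny-standard-bahasa-cased`, the helper names `build_prompt` and `paraphrase`, and the `max_length` value are all assumptions made for illustration. It relies on the `transformers` library (with `torch` and `sentencepiece` installed).

```python
PREFIX = "parafrasa: "  # task prefix listed under "Supported prefix"


def build_prompt(text: str) -> str:
    """Prepend the paraphrase prefix to a Malay input string."""
    return PREFIX + text.strip()


def paraphrase(
    text: str,
    # Assumed repository id; replace with the actual model location.
    model_id: str = "mesolitica/finetune-paraphrase-t5-tiny-standard-bahasa-cased",
    max_length: int = 256,  # assumed generation limit, not taken from the model card
) -> str:
    """Generate one paraphrase with default (greedy) decoding."""
    # Imported lazily so build_prompt stays usable without transformers installed.
    from transformers import AutoTokenizer, T5ForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = T5ForConditionalGeneration.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(text), return_tensors="pt")
    output_ids = model.generate(**inputs, max_length=max_length)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `paraphrase("Saya suka makan nasi goreng.")` would download the weights on first use and return a single paraphrased sentence. Sampling or beam search settings can be passed to `generate` if more varied output is wanted.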

Evaluation

Evaluated on the MRPC validation set and the ParaSCI arXiv test set.

{'name': 'BLEU',
 'score': 36.92696648298233,
 '_mean': -1.0,
 '_ci': -1.0,
 '_verbose': '62.5/42.3/33.0/26.9 (BP = 0.943 ratio = 0.945 hyp_len = 95496 ref_len = 101064)',
 'bp': 0.9433611337299734,
 'counts': [59650, 38055, 27875, 21217],
 'totals': [95496, 89952, 84408, 78864],
 'sys_len': 95496,
 'ref_len': 101064,
 'precisions': [62.46334925023038,
  42.30589647812167,
  33.02412093640413,
  26.90327652667884],
 'prec_str': '62.5/42.3/33.0/26.9',
 'ratio': 0.944906198052719}
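
As a sanity check, the reported score can be recomputed from the n-gram statistics in the output above. The sketch below assumes the standard corpus-level BLEU definition (geometric mean of the four n-gram precisions, multiplied by the brevity penalty) without smoothing, which is valid here because every count is non-zero. The function name `bleu_from_stats` is hypothetical.

```python
import math

# Statistics copied from the evaluation output.
counts = [59650, 38055, 27875, 21217]   # matched 1- to 4-grams
totals = [95496, 89952, 84408, 78864]   # hypothesis 1- to 4-grams
sys_len, ref_len = 95496, 101064        # hypothesis and reference lengths


def bleu_from_stats(counts, totals, sys_len, ref_len):
    """Return (BLEU on a 0-100 scale, brevity penalty, n-gram precisions)."""
    # Assumes every count is > 0; log(0) would raise otherwise.
    precisions = [c / t for c, t in zip(counts, totals)]
    # The brevity penalty only applies when the hypothesis is shorter.
    bp = 1.0 if sys_len >= ref_len else math.exp(1.0 - ref_len / sys_len)
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))
    return 100.0 * bp * geo_mean, bp, precisions


score, bp, precisions = bleu_from_stats(counts, totals, sys_len, ref_len)
print(round(score, 2), round(bp, 3))
```

The recomputed value agrees with the reported `score` (about 36.93) and `bp` (about 0.943). The brevity penalty is below 1 because the generated text is about 5.5% shorter than the references.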