metadata

tags:
  - text2text-generation
  - definition-modeling
metrics:
  - rouge
  - bleu
  - bert-f1
model-index:
  - name: flan-t5-definition-en-large
    results: []
language:
  - en
widget:
  - text: He ate a sweet apple. What is the definition of apple?
    example_title: Definition generation
  - text: >-
      The paper contains a number of original ideas about color perception. What
      is the definition of original?
    example_title: Definition generation
license: cc-by-sa-4.0
datasets:
  - marksverdhei/wordnet-definitions-en-2021

FLAN-T5-Definition Large

This model is a version of FLAN-T5 Large fine-tuned on a dataset of English definitions and usage examples.

It generates definitions of English words in context. Its input is a usage example followed by the instruction question "What is the definition of TARGET_WORD?"
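The input format described above can be sketched in Python with the Transformers library. This is a minimal sketch rather than the authors' reference code: the Hub identifier `ltg/flan-t5-definition-en-large` is an assumption (this card only gives the name `flan-t5-definition-en-large`), and the helper names `build_prompt` and `generate_definition` as well as `max_new_tokens=32` are illustrative choices.

```python
# Assumed Hub identifier; only "flan-t5-definition-en-large" appears in this card.
MODEL_ID = "ltg/flan-t5-definition-en-large"


def build_prompt(usage_example: str, target_word: str) -> str:
    """Append the instruction question to the usage example (hypothetical helper)."""
    return f"{usage_example.strip()} What is the definition of {target_word}?"


def generate_definition(usage_example: str, target_word: str,
                        model_id: str = MODEL_ID, max_new_tokens: int = 32) -> str:
    """Load the model and generate one definition (hypothetical helper)."""
    # Imported here so that build_prompt can be used without Transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(build_prompt(usage_example, target_word), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Downloads the model weights on first run.
    print(generate_definition("He ate a sweet apple.", "apple"))
```

Loading the tokenizer and model on every call keeps the sketch short; for repeated use they would normally be loaded once and reused.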

This project is a collaboration between the Dialogue Modelling Group at the University of Amsterdam and the Language Technology Group at the University of Oslo.

Sizes:

Model description

See details in the paper Interpretable Word Sense Representations via Definition Generation: The Case of Semantic Change Analysis (ACL'2023) by Mario Giulianelli, Iris Luden, Raquel Fernandez and Andrey Kutuzov.

Intended uses & limitations

The model is intended for research purposes, as a source of contextualized dictionary-like lexical definitions.

The fine-tuning datasets were limited to English. Although the original FLAN-T5 is a multilingual model, we did not thoroughly evaluate its ability to generate definitions in languages other than English.

Generated definitions may reflect biases and stereotypes inherited from the underlying language model.

Training and evaluation data

Three datasets were used to fine-tune the model:

WordNet (Ishiwatari et al., NAACL 2019), also available on the Hugging Face Hub as marksverdhei/wordnet-definitions-en-2021
Oxford dictionary or CHA (Gadetsky et al., ACL 2018)
English subset of CoDWoE (Mickus et al., SemEval 2022)

FLAN-T5-Definition Large achieves the following results on the WordNet test set:

BLEU: 14.37
ROUGE-L: 33.74
BERT-F1: 88.21

FLAN-T5-Definition Large achieves the following results on the Oxford dictionary test set:

BLEU: 10.90
ROUGE-L: 30.05
BERT-F1: 87.44
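To make the ROUGE-L figures above easier to interpret, here is a minimal sketch of a sentence-level ROUGE-L F-measure based on the longest common subsequence of tokens. It assumes lowercased whitespace tokenization and no stemming; the exact scoring configuration behind the reported numbers is not stated in this card, so this sketch is not expected to reproduce them. The function names are illustrative.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L F-measure in [0, 1] for one generated and one gold definition."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Scores in this card are on a 0-100 scale, hence the factor of 100.
    print(round(100 * rouge_l_f1("a fruit", "a sweet fruit"), 2))
```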

Training procedure

FLAN-T5 Large was fine-tuned in a sequence-to-sequence mode on examples of contextualized dictionary definitions.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 8
eval_batch_size: 16
seed: 42
distributed_type: multi-GPU
num_devices: 8
total_train_batch_size: 64
total_eval_batch_size: 128
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 15.0
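The relationship between the per-device and total batch sizes listed above, and the shape of a linear learning-rate schedule, can be checked with a short sketch. Two assumptions are made here that the card does not state: no gradient accumulation (the totals equal the per-device size times the number of devices) and zero warmup steps. The total step count of 41100 is taken from the last row of the training results table, and `linear_lr` is an illustrative helper.

```python
TRAIN_BATCH_SIZE = 8      # per device
EVAL_BATCH_SIZE = 16      # per device
NUM_DEVICES = 8
LEARNING_RATE = 5e-5
TOTAL_STEPS = 41100       # final step reported after 15 epochs

# Effective batch sizes across all devices (assumes no gradient accumulation).
total_train_batch_size = TRAIN_BATCH_SIZE * NUM_DEVICES
total_eval_batch_size = EVAL_BATCH_SIZE * NUM_DEVICES


def linear_lr(step: int, base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS, warmup_steps: int = 0) -> float:
    """Linear warmup followed by linear decay to zero (warmup_steps=0 is an assumption)."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(total_train_batch_size, total_eval_batch_size)
    for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
        print(step, linear_lr(step))
```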

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
2.1769	1.0	2740	1.9050	28.7222	9.1873	26.6888	26.6937	11.3429
1.9408	2.0	5480	1.8151	29.8799	10.2327	27.7947	27.8044	11.4165
1.8124	3.0	8220	1.7608	30.9845	10.9982	28.8059	28.8131	11.5310
1.7118	4.0	10960	1.7229	31.6943	11.7412	29.4967	29.5319	11.7037
1.6286	5.0	13700	1.6937	32.5839	12.2431	30.1799	30.206	11.7784
1.5597	6.0	16440	1.6748	32.9915	12.8514	30.7016	30.7145	11.5974
1.4982	7.0	19180	1.6578	33.2157	13.1389	30.9428	30.9519	11.3580
1.4468	8.0	21920	1.6473	33.6146	13.5922	31.3001	31.3235	11.5724
1.4022	9.0	24660	1.6384	34.1711	14.1117	31.7951	31.8066	11.7389
1.364	10.0	27400	1.6337	34.5489	14.5012	32.1329	32.1446	11.6659
1.3321	11.0	30140	1.6291	34.7133	14.7297	32.3042	32.314	11.8003
1.3054	12.0	32880	1.6267	34.9411	15.0282	32.5335	32.5451	11.7619
1.2845	13.0	35620	1.6262	35.1648	15.2154	32.7387	32.742	11.8317
1.2699	14.0	38360	1.6257	35.2849	15.3109	32.8508	32.853	11.8168
1.2595	15.0	41100	1.6273	35.2224	15.2781	32.7718	32.7826	11.7971
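As a reading aid for the table above, the following sketch picks out the epoch with the lowest validation loss and the epoch with the highest Rougel score. The values are copied from the table; the card does not say which checkpoint was released, so this only summarizes the logged numbers.

```python
# (epoch, validation_loss, rouge_l) copied from the training results table.
ROWS = [
    (1, 1.9050, 26.6888),
    (2, 1.8151, 27.7947),
    (3, 1.7608, 28.8059),
    (4, 1.7229, 29.4967),
    (5, 1.6937, 30.1799),
    (6, 1.6748, 30.7016),
    (7, 1.6578, 30.9428),
    (8, 1.6473, 31.3001),
    (9, 1.6384, 31.7951),
    (10, 1.6337, 32.1329),
    (11, 1.6291, 32.3042),
    (12, 1.6267, 32.5335),
    (13, 1.6262, 32.7387),
    (14, 1.6257, 32.8508),
    (15, 1.6273, 32.7718),
]

best_by_loss = min(ROWS, key=lambda row: row[1])
best_by_rouge_l = max(ROWS, key=lambda row: row[2])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest Rougel:", best_by_rouge_l)
```

Both criteria point to the same epoch, after which validation loss rises slightly while training loss keeps falling.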

Framework versions

Transformers 4.23.1
Pytorch 1.12.1+rocm5.1.1
Datasets 2.4.0
Tokenizers 0.12.1