metadata
license: apache-2.0
language:
- en
tags:
- word_sense_disambiguation
library_name: transformers
datasets:
- SemCor
- WordNet
- WSD_Evaluation_Framework
metrics:
- f1
Semantic Specialization for Knowledge-based Word Sense Disambiguation
- This repository contains the trained model (projection heads) and sense/context embeddings used for training and evaluating the model.
- If you want to learn how to use these files, please refer to the semantic_specialization_for_wsd repository.
Trained Model (Projection Heads)
- File: checkpoints/baseline/last.ckpt
- This is one of the trained models used to report the main results (Table 2 in [Mizuki and Okazaki, EACL2023]).
  NOTE: Five runs were performed in total.
- The main hyperparameters used for training are as follows:
| Argument name | Value | Description |
|---|---|---|
| max_epochs | 15 | Maximum number of training epochs |
| cfg_similarity_class.temperature ($\beta^{-1}$) | 0.015625 (=1/64) | Temperature parameter for the contrastive loss |
| batch_size ($N_B$) | 256 | Number of samples in each batch for the attract-repel and self-training objectives |
| coef_max_pool_margin_loss ($\alpha$) | 0.2 | Coefficient for the self-training loss |
| cfg_gloss_projection_head.n_layer | 2 | Number of FFNN layers for the projection heads |
| cfg_gloss_projection_head.max_l2_norm_ratio ($\epsilon$) | 0.015 | Hyperparameter for the distance constraint integrated in the projection heads |
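
To give a feel for what a temperature of 1/64 does, the following is a minimal sketch of a generic InfoNCE-style contrastive loss over cosine similarities. It is not the loss implemented in the semantic_specialization_for_wsd repository: the function names `cosine` and `contrastive_loss` and the toy vectors are assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negatives, temperature=1.0 / 64):
    """Generic InfoNCE-style loss (illustrative, not the paper's exact objective).

    Similarities are divided by the temperature, so a small temperature
    (a large beta) sharpens the softmax over the candidates.
    """
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Numerically stable log-sum-exp.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    # Negative log-probability of the positive candidate.
    return log_sum_exp - logits[0]


# Toy 2-d vectors (hypothetical, not real embeddings).
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
negatives = [[0.0, 1.0], [-1.0, 0.0]]

loss_sharp = contrastive_loss(anchor, positive, negatives)  # temperature = 1/64
loss_soft = contrastive_loss(anchor, positive, negatives, temperature=1.0)
print(loss_sharp, loss_soft)
```

With the same similarities, the lower temperature gives the positive pair nearly all of the probability mass, so the loss is much smaller than at temperature 1.0.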
Sense/context embeddings
- Directory: data/bert_embeddings/
- Sense embeddings: bert-large-cased_WordNet_Gloss_Corpus.hdf5
- Context embeddings for the self-training objective: bert-large-cased_SemCor.hdf5
- Context embeddings for evaluating the WSD task: bert-large-cased_WSDEval-ALL.hdf5
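
The internal layout of these HDF5 files is not described here, so a reasonable first step is to list what each file contains. The sketch below assumes the third-party `h5py` package is installed; the helper names `describe` and `describe_hdf5` are hypothetical and are not part of the semantic_specialization_for_wsd repository.

```python
def describe(node, prefix=""):
    """Recursively collect (path, shape, dtype) for every dataset under `node`.

    `node` can be any group-like object with an `.items()` method, such as an
    open h5py.File or h5py.Group. Children without `.items()` are treated as
    datasets.
    """
    rows = []
    for name, child in node.items():
        path = f"{prefix}/{name}"
        if hasattr(child, "items"):
            # Group-like: descend into it.
            rows.extend(describe(child, path))
        else:
            # Dataset-like: record its shape and dtype if present.
            shape = getattr(child, "shape", None)
            dtype = str(getattr(child, "dtype", None))
            rows.append((path, shape, dtype))
    return rows


def describe_hdf5(path):
    """Open an HDF5 file read-only and describe its contents (requires h5py)."""
    import h5py  # third-party dependency, imported lazily

    with h5py.File(path, "r") as f:
        return describe(f)
```

For example, `describe_hdf5("data/bert_embeddings/bert-large-cased_SemCor.hdf5")` would return one tuple per dataset in that file, assuming the file has been downloaded to that relative path.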
Reference
@inproceedings{Mizuki:EACL2023,
title = "Semantic Specialization for Knowledge-based Word Sense Disambiguation",
author = "Mizuki, Sakae and Okazaki, Naoaki",
booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
series = "EACL",
month = may,
year = "2023",
address = "Dubrovnik, Croatia",
publisher = "Association for Computational Linguistics",
pages = "3449--3462",
}
- An arXiv version is also available.