---
license: apache-2.0
language:
- en
tags:
- word_sense_disambiguation
library_name: transformers
datasets:
- SemCor
- WordNet
- WSD_Evaluation_Framework
metrics:
- f1
---

# Semantic Specialization for Knowledge-based Word Sense Disambiguation

* This repository contains the trained model (projection heads) and sense/context embeddings used for training and evaluating the model.
* If you want to learn how to use these files, please refer to the [semantic_specialization_for_wsd](https://github.com/s-mizuki-nlp/semantic_specialization_for_wsd) repository.

## Trained Model (Projection Heads)

* File: `checkpoints/baseline/last.ckpt`
* This is one of the trained models used for reporting the main results (Table 2 in [Mizuki and Okazaki, EACL2023]).
  NOTE: Five runs were performed in total.
* The main hyperparameters used for training are as follows:

| Argument name | Value | Description |
|---------------|-------|-------------|
| max_epochs | 15 | Maximum number of training epochs |
| cfg_similarity_class.temperature ($\beta^{-1}$) | 0.015625 (=1/64) | Temperature parameter for the contrastive loss |
| batch_size ($N_B$) | 256 | Number of samples in each batch for the attract-repel and self-training objectives |
| coef_max_pool_margin_loss ($\alpha$) | 0.2 | Coefficient for the self-training loss |
| cfg_gloss_projection_head.n_layer | 2 | Number of FFNN layers for the projection heads |
| cfg_gloss_projection_head.max_l2_norm_ratio ($\epsilon$) | 0.015 | Hyperparameter for the distance constraint integrated in the projection heads |
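
To give a feel for what two of these values do, here is a minimal sketch in plain Python. It is not the authors' implementation: `contrastive_loss` is a generic InfoNCE-style loss, and `constrain_offset` assumes the distance constraint caps the L2 norm of the change applied to an embedding at $\epsilon$ times the norm of that embedding. Both function names and the exact form of the constraint are assumptions made for illustration; see the linked repository for the actual code.

```python
import math

TEMPERATURE = 0.015625     # cfg_similarity_class.temperature (= 1/64)
MAX_L2_NORM_RATIO = 0.015  # cfg_gloss_projection_head.max_l2_norm_ratio


def contrastive_loss(similarities, positive_index, temperature=TEMPERATURE):
    """Generic InfoNCE-style loss (illustrative, not the authors' exact loss).

    similarities: similarity of one query to each candidate.
    positive_index: position of the correct candidate.
    """
    # Dividing by the temperature multiplies each similarity by beta = 64.
    logits = [s / temperature for s in similarities]
    max_logit = max(logits)  # subtracted for numerical stability
    log_sum_exp = max_logit + math.log(
        sum(math.exp(v - max_logit) for v in logits)
    )
    return log_sum_exp - logits[positive_index]


def constrain_offset(embedding, offset, ratio=MAX_L2_NORM_RATIO):
    """Add `offset` to `embedding`, rescaled so that
    ||offset|| <= ratio * ||embedding|| (assumed form of the constraint)."""
    emb_norm = math.sqrt(sum(x * x for x in embedding))
    off_norm = math.sqrt(sum(d * d for d in offset))
    limit = ratio * emb_norm
    scale = 1.0 if off_norm <= limit else limit / off_norm
    return [x + scale * d for x, d in zip(embedding, offset)]
```

With a temperature of 1/64, small differences in similarity turn into large differences in logits, so the loss is close to zero as soon as the positive candidate is clearly the most similar one. Under the assumed constraint, a ratio of 0.015 keeps each adapted embedding within 1.5% of its original norm from where it started.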

## Sense/context embeddings

* Directory: `data/bert_embeddings/`
* Sense embeddings: `bert-large-cased_WordNet_Gloss_Corpus.hdf5`
* Context embeddings for the self-training objective: `bert-large-cased_SemCor.hdf5`
* Context embeddings for evaluating the WSD task: `bert-large-cased_WSDEval-ALL.hdf5`
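
The internal layout of these HDF5 files is not described here, so a reasonable first step is simply to list what they contain. The sketch below assumes the `h5py` package is installed; `list_datasets` is a hypothetical helper name, and nothing in it depends on how the embeddings are actually organized.

```python
import h5py


def list_datasets(path):
    """Return (name, shape, dtype) for every dataset stored in an HDF5 file."""
    found = []

    def visit(name, obj):
        # visititems calls this for every group and dataset in the file;
        # only datasets carry array data, so groups are skipped.
        if isinstance(obj, h5py.Dataset):
            found.append((name, obj.shape, str(obj.dtype)))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found
```

For example, `list_datasets("data/bert_embeddings/bert-large-cased_SemCor.hdf5")` would return one tuple per dataset in that file, assuming it has been downloaded to that path.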

# Reference

```
@inproceedings{Mizuki:EACL2023,
    title = "Semantic Specialization for Knowledge-based Word Sense Disambiguation",
    author = "Mizuki, Sakae and Okazaki, Naoaki",
    booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
    series = {EACL},
    month = may,
    year = "2023",
    address = "Dubrovnik, Croatia",
    publisher = "Association for Computational Linguistics",
    pages = "3449--3462",
}
```

* [arXiv version](https://arxiv.org/abs/2304.11340) is also available.