okazaki-lab/ss_wsd

	---
	license: apache-2.0
	language:
	- en
	tags:
	- word_sense_disambiguation
	library_name: transformers
	datasets:
	- SemCor
	- WordNet
	- WSD_Evaluation_Framework
	metrics:
	- f1
	---

	# Semantic Specialization for Knowledge-based Word Sense Disambiguation
	* This repository contains the trained model (projection heads) and sense/context embeddings used for training and evaluating the model.
	* For instructions on using these files, see the [semantic_specialization_for_wsd](https://github.com/s-mizuki-nlp/semantic_specialization_for_wsd) repository.
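As a rough illustration of how the files in this repository might be fetched, here is a minimal sketch using `hf_hub_download` from the `huggingface_hub` package. The repository ID `okazaki-lab/ss_wsd` is an assumption taken from the model page, the file names are the ones listed in this README, and the helper name `fetch` is hypothetical. The GitHub repository linked above remains the authoritative usage guide.

```python
# Repository ID is assumed from the model page; file names come from this README.
REPO_ID = "okazaki-lab/ss_wsd"

FILES = [
    "checkpoints/baseline/last.ckpt",
    "data/bert_embeddings/bert-large-cased_WordNet_Gloss_Corpus.hdf5",
    "data/bert_embeddings/bert-large-cased_SemCor.hdf5",
    "data/bert_embeddings/bert-large-cased_WSDEval-ALL.hdf5",
]


def fetch(filename: str, repo_id: str = REPO_ID) -> str:
    """Download one file from the Hub and return its local cache path.

    `fetch` is a hypothetical helper; huggingface_hub is imported lazily
    so that this module can be imported without the package installed.
    """
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=repo_id, filename=filename)


if __name__ == "__main__":
    # Requires network access and may download large files.
    for name in FILES:
        print(name, "->", fetch(name))
```

The embedding files may be large, so it can be worth fetching only the ones needed for a given experiment.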

	## Trained Model (Projection Heads)
	* File: `checkpoints/baseline/last.ckpt`
	* This is one of the trained models used to report the main results (Table 2 in [Mizuki and Okazaki, EACL2023]); five training runs were performed in total.
	* The main hyperparameters used for training are as follows:

	| Argument name | Value | Description |
	|---|---|---|
	| max_epochs | 15 | Maximum number of training epochs |
	| cfg_similarity_class.temperature ($\beta^{-1}$) | 0.015625 (=1/64) | Temperature parameter for the contrastive loss |
	| batch_size ($N_B$) | 256 | Number of samples in each batch for the attract-repel and self-training objectives |
	| coef_max_pool_margin_loss ($\alpha$) | 0.2 | Coefficient for the self-training loss |
	| cfg_gloss_projection_head.n_layer | 2 | Number of FFNN layers for the projection heads |
	| cfg_gloss_projection_head.max_l2_norm_ratio ($\epsilon$) | 0.015 | Hyperparameter for the distance constraint integrated in the projection heads |
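To give a feel for what the temperature value does, the sketch below applies a temperature of 1/64 in a generic softmax-based contrastive loss over similarity scores. This is an illustrative InfoNCE-style formulation, not the loss implemented in the paper or the training code, and the function name `contrastive_loss` and its arguments are hypothetical. Dividing similarities by $\beta^{-1} = 1/64$ is the same as multiplying them by $\beta = 64$, which sharpens the softmax considerably.

```python
import math

TEMPERATURE = 0.015625  # cfg_similarity_class.temperature, i.e. 1/64


def contrastive_loss(sim_pos, sim_negs, temperature=TEMPERATURE):
    """Generic temperature-scaled contrastive loss (illustrative only).

    Returns -log softmax of the positive similarity against the negatives.
    """
    logits = [s / temperature for s in [sim_pos, *sim_negs]]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[0]


if __name__ == "__main__":
    # The same similarity gap is penalized very differently at the two temperatures.
    print(contrastive_loss(0.9, [0.1]))                   # temperature = 1/64
    print(contrastive_loss(0.9, [0.1], temperature=1.0))  # temperature = 1
```

With a small temperature, even modest similarity gaps between the positive and the negatives drive the loss close to zero, so the objective concentrates on hard negatives whose similarity is close to that of the positive.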

	## Sense/Context Embeddings
	* Directory: `data/bert_embeddings/`
	* Sense embeddings: `bert-large-cased_WordNet_Gloss_Corpus.hdf5`
	* Context embeddings for the self-training objective: `bert-large-cased_SemCor.hdf5`
	* Context embeddings for evaluating the WSD task: `bert-large-cased_WSDEval-ALL.hdf5`
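This README does not describe the internal layout of the HDF5 files, so a reasonable first step is to list what they contain before writing any loading code. The following is a minimal sketch that assumes the `h5py` package is installed and that a file has already been downloaded locally. The helper names `describe` and `list_hdf5_entries` are hypothetical, and no group or dataset names are assumed.

```python
def describe(name, obj):
    """Format one HDF5 entry; objects without a shape are treated as groups."""
    shape = getattr(obj, "shape", None)
    dtype = getattr(obj, "dtype", None)
    if shape is None:
        return f"{name}/ (group)"
    return f"{name}: shape={shape}, dtype={dtype}"


def list_hdf5_entries(path):
    """Walk an HDF5 file and return one description line per group/dataset."""
    import h5py  # imported lazily; only needed when a file is actually opened

    lines = []
    with h5py.File(path, "r") as f:
        # visititems keeps walking as long as the callback returns None.
        f.visititems(lambda name, obj: lines.append(describe(name, obj)))
    return lines


if __name__ == "__main__":
    # The local path is an assumption; adjust it to wherever the file was saved.
    path = "data/bert_embeddings/bert-large-cased_WSDEval-ALL.hdf5"
    for line in list_hdf5_entries(path)[:20]:
        print(line)
```

Printing only the first few entries keeps the output manageable, since these files may contain many datasets.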

	# Reference

	```
	@inproceedings{Mizuki:EACL2023,
	title = "Semantic Specialization for Knowledge-based Word Sense Disambiguation",
	author = "Mizuki, Sakae and Okazaki, Naoaki",
	booktitle = "Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume",
	series = {EACL},
	month = may,
	year = "2023",
	address = "Dubrovnik, Croatia",
	publisher = "Association for Computational Linguistics",
	pages = "3449--3462",
	}
	```

	* An [arXiv version](https://arxiv.org/abs/2304.11340) is also available.