# Language Modeling with Gated Convolutional Networks (Dauphin et al., 2017)

## Example usage

First download and preprocess the data following the main [language modeling README](README.md).

Then to train a convolutional LM using the `fconv_lm_dauphin_wikitext103`
architecture:
```bash
fairseq-train --task language_modeling \
    data-bin/wikitext-103 \
    --save-dir checkpoints/fconv_wikitext-103 \
    --arch fconv_lm_dauphin_wikitext103 \
    --adaptive-softmax-cutoff 10000,20000,200000 \
    --dropout 0.2 \
    --criterion adaptive_loss \
    --optimizer nag --clip-norm 0.1 --weight-decay 5e-06 \
    --lr 1.0 --lr-scheduler reduce_lr_on_plateau --lr-shrink 0.5 \
    --max-tokens 1024 --tokens-per-sample 1024 \
    --ddp-backend legacy_ddp \
    --max-epoch 35
```
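
The `--adaptive-softmax-cutoff` value splits the frequency-sorted vocabulary into a head band and progressively rarer tail bands. The following is a minimal sketch of the resulting band sizes. `VOCAB_SIZE` is an illustrative assumption that is not stated in this README, so substitute the size of your own preprocessed dictionary.

```bash
# Assumed vocabulary size (illustrative only; use your own dictionary size).
VOCAB_SIZE=267744
# Cutoffs copied from the training command.
CUTOFFS="10000,20000,200000"

prev=0
# Walk the cutoffs, then close the last band at the vocabulary size.
for c in $(echo "$CUTOFFS" | tr ',' ' ') "$VOCAB_SIZE"; do
    size=$((c - prev))
    echo "band [${prev}, ${c}): ${size} types"
    prev=$c
done
```

Under this assumption the most frequent 10,000 types fall in the head band, and everything past index 200,000 falls in the last tail band.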

Then evaluate the best checkpoint, which is written to the `--save-dir` used for training:
```bash
fairseq-eval-lm data-bin/wikitext-103 --path checkpoints/fconv_wikitext-103/checkpoint_best.pt
```
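
Evaluation fails if `--path` does not point inside the directory passed to `--save-dir` during training. One way to keep the two in sync is to define the directory once and check that the checkpoint exists before evaluating. This is a sketch only: the variable names `SAVE_DIR`, `DATA_DIR`, and `CKPT` are hypothetical helpers, not part of fairseq.

```bash
# Hypothetical helper variables (not fairseq options).
SAVE_DIR="checkpoints/fconv_wikitext-103"   # must equal the --save-dir used for training
DATA_DIR="data-bin/wikitext-103"
CKPT="${SAVE_DIR}/checkpoint_best.pt"

if [ -f "${CKPT}" ]; then
    # Checkpoint found: run the evaluation on it.
    fairseq-eval-lm "${DATA_DIR}" --path "${CKPT}"
else
    # Report the missing file instead of letting the evaluation fail later.
    echo "checkpoint not found: ${CKPT}" >&2
fi
```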

## Citation

```bibtex
@inproceedings{dauphin2017language,
  title={Language Modeling with Gated Convolutional Networks},
  author={Dauphin, Yann N and Fan, Angela and Auli, Michael and Grangier, David},
  booktitle={Proceedings of the 34th International Conference on Machine Learning-Volume 70},
  pages={933--941},
  year={2017},
  organization={JMLR}
}
```