model-attribution-challenge/fairseq-dense-125M

---
language: en
---
This is a Hugging Face Transformers-compatible conversion of the original dense 125M-parameter model from the paper "[Efficient Large Scale Language Modeling with Mixtures of Experts](https://arxiv.org/abs/2112.10684)" by Artetxe et al. For details on the model, refer to the original model card at https://github.com/facebookresearch/fairseq/blob/main/examples/moe_lm/model_card.md.
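
Because the checkpoint is described as transformers-compatible, it can presumably be loaded through the generic `Auto*` classes. The following is a minimal sketch, not an official usage recipe: the repository id is an assumption taken from this page's header, the helper names `generation_kwargs` and `generate` are hypothetical, and the decoding settings are illustrative only.

```python
# Assumed repository id (taken from the page header, not from the README text).
MODEL_ID = "model-attribution-challenge/fairseq-dense-125M"


def generation_kwargs(max_new_tokens: int = 20) -> dict:
    """Return greedy-decoding settings (illustrative values, not recommendations)."""
    return {"max_new_tokens": max_new_tokens, "do_sample": False}


def generate(prompt: str, max_new_tokens: int = 20) -> str:
    """Load the tokenizer and model, then greedily continue `prompt`."""
    # Imported here so that defining this helper does not require
    # transformers and PyTorch to be installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors and generate a continuation.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **generation_kwargs(max_new_tokens))

    # Decode the first (and only) sequence back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("The weather today is")` downloads the weights on first use, so it needs network access plus the `transformers` and `torch` packages. The model is reloaded on every call to keep the sketch short; for repeated use, load it once and reuse it.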