
	---
	language:
	- en
	- zh
	- de
	- fr
	library_name: sentence-transformers
	license: apache-2.0
	---

	# ZeroNLG

	ZeroNLG is a unified framework that handles multiple natural language generation (NLG) tasks in a zero-shot manner, without any labeled downstream pairs for training. It covers image-to-text, video-to-text, and text-to-text generation across English, Chinese, German, and French.

	Pre-training data: a machine-translated version of [CC3M](https://huggingface.co/datasets/conceptual_captions), including:
	- 1.1M English sentences
	- 1.1M English-Chinese pairs
	- 1.1M English-German pairs
	- 1.1M English-French pairs


	Paper: [ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation](https://arxiv.org/abs/2303.06458)

	Authors: Bang Yang, Fenglin Liu, Yuexian Zou, Xian Wu, Yaowei Wang, David A. Clifton



	## Quick Start
	Please follow our [GitHub repo](https://github.com/yangbang18/ZeroNLG) to prepare the environment first.

	```python
	from zeronlg import ZeroNLG

	# Automatically download the model from the Hugging Face Hub
	# Note: this model is pre-trained specifically for machine translation
	model = ZeroNLG('zeronlg-4langs-mt')

	# Translate English into Chinese
	# Note: the multilingual encoder is language-agnostic, so `lang` below specifies the language to generate
	output = model.forward_translate(texts='a girl and a boy are playing', lang='zh', num_beams=3)
	# output = "一个女孩和一个男孩一起玩"
	```
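
To translate one sentence into several target languages, the call above can be wrapped in a small loop. The sketch below is only an illustration: `translate_to_all` is a hypothetical helper, not part of the `zeronlg` package, and it assumes that `lang` accepts the other language codes from this card (`en`, `de`, `fr`) in the same way as `zh`. It relies only on the `forward_translate(texts=..., lang=..., num_beams=...)` call shown above.

```python
from typing import Any, Dict, Iterable


def translate_to_all(
    model: Any,
    text: str,
    langs: Iterable[str],
    num_beams: int = 3,
) -> Dict[str, Any]:
    """Call `model.forward_translate` once per target language.

    `model` is expected to be a loaded ZeroNLG instance (or any object that
    exposes the same `forward_translate` method). The outputs are stored
    exactly as the model returns them, keyed by target language code.
    """
    return {
        lang: model.forward_translate(texts=text, lang=lang, num_beams=num_beams)
        for lang in langs
    }
```

With a model loaded as in the snippet above, `translate_to_all(model, 'a girl and a boy are playing', ['zh', 'de', 'fr'])` would return a dictionary with one entry per target code.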

	## Zero-Shot Performance
	### Machine translation
	Model: [zeronlg-4langs-mt](https://huggingface.co/yangbang18/zeronlg-4langs-mt) only.

	| En->Zh | En<-Zh | En->De | En<-De | En->Fr | En<-Fr | Zh->De | Zh<-De | Zh->Fr | Zh<-Fr | De->Fr | De<-Fr |
	| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
	| 6.0 | 9.2 | 21.6 | 23.2 | 27.2 | 26.8 | 7.8 | 4.6 | 6.1 | 9.7 | 20.9 | 19.6 |
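
For quick comparisons, the table can be loaded into a dictionary keyed by (source, target) language. This is a minimal sketch that assumes `A<-B` denotes translation from B into A (so `En<-Zh` is Chinese to English); the helper `parse_direction` and the grouping by whether Chinese is involved are illustrative choices, not part of the original evaluation.

```python
from statistics import mean

# Column headers and scores copied from the table above.
HEADERS = ["En->Zh", "En<-Zh", "En->De", "En<-De", "En->Fr", "En<-Fr",
           "Zh->De", "Zh<-De", "Zh->Fr", "Zh<-Fr", "De->Fr", "De<-Fr"]
SCORES = [6.0, 9.2, 21.6, 23.2, 27.2, 26.8, 7.8, 4.6, 6.1, 9.7, 20.9, 19.6]


def parse_direction(header):
    """Return (source, target) for a header such as 'En->Zh' or 'En<-Zh'.

    Assumption: 'A<-B' means the translation goes from B into A.
    """
    if "->" in header:
        src, tgt = header.split("->")
    else:
        tgt, src = header.split("<-")
    return src, tgt


scores = {parse_direction(h): s for h, s in zip(HEADERS, SCORES)}

# Average over all 12 directions, then split by whether Chinese is involved.
overall = mean(scores.values())
with_zh = mean(s for pair, s in scores.items() if "Zh" in pair)
without_zh = mean(s for pair, s in scores.items() if "Zh" not in pair)

print(f"overall={overall:.2f} with_zh={with_zh:.2f} without_zh={without_zh:.2f}")
```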


	## Citation
	```bibtex
	@article{Yang2023ZeroNLG,
	title={ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation},
	author={Yang, Bang and Liu, Fenglin and Zou, Yuexian and Wu, Xian and Wang, Yaowei and Clifton, David A.},
	journal={arXiv preprint arXiv:2303.06458},
	year={2023}
	}
	```