	---
	license: mit
	language:
	- en
	- de
	- fr
	- nl
	- es
	- ru
	- pt
	- ro
	- it
	metrics:
	- bleu
	pipeline_tag: translation
	---
	# nllb-200-distilled-600M_mustc_en-to-8

	This is a multilingually fine-tuned version of [NLLB](https://arxiv.org/abs/2207.04672), based on [nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) and trained on the text data of [MuST-C v1.0](https://aclanthology.org/N19-1202/) for translation from English into eight target languages (De, Es, Fr, It, Nl, Pt, Ro, Ru).

	It was released as part of the paper [Pushing the Limits of Zero-shot End-to-end Speech Translation](https://arxiv.org/abs/2402.10422). Details of the fine-tuning process are given in Appendix D of the paper.

	## Usage

	```python
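	import torch
	from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

	model_id = "johntsi/nllb-200-distilled-600M_mustc_en-to-8"
	# Fall back to CPU when no GPU is available.
	device = "cuda" if torch.cuda.is_available() else "cpu"

	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

	model.eval()
	model.to(device)

	text = "Translate this text to German."
	inputs = tokenizer(text, return_tensors="pt").to(device)
	outputs = model.generate(
	    **inputs,
	    num_beams=5,
	    # Force the first generated token to the target language code (German).
	    # convert_tokens_to_ids also works on recent transformers releases,
	    # where tokenizer.lang_code_to_id is no longer available.
	    forced_bos_token_id=tokenizer.convert_tokens_to_ids("deu_Latn"),
	)
	translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
	print(translated_text)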
	```
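The usage snippet targets German only. To reach the other seven MuST-C languages, the target code passed as `forced_bos_token_id` has to change. Below is a minimal sketch, assuming the standard FLORES-200 codes that NLLB uses. The `TARGET_LANG_CODES` mapping and the `translate_to_all` helper are hypothetical names introduced here for illustration and are not part of the released model or of `transformers`.

```python
# FLORES-200 codes used by NLLB for the eight MuST-C target languages.
# The mapping and the helper below are illustrative, not part of the model repo.
TARGET_LANG_CODES = {
    "de": "deu_Latn",
    "es": "spa_Latn",
    "fr": "fra_Latn",
    "it": "ita_Latn",
    "nl": "nld_Latn",
    "pt": "por_Latn",
    "ro": "ron_Latn",
    "ru": "rus_Cyrl",
}


def translate_to_all(text, tokenizer, model, num_beams=5):
    """Translate one English sentence into every language in TARGET_LANG_CODES."""
    # Tokenize once; only the forced target-language token changes per language.
    inputs = tokenizer(text, return_tensors="pt").to(model.device)
    translations = {}
    for lang, code in TARGET_LANG_CODES.items():
        outputs = model.generate(
            **inputs,
            num_beams=num_beams,
            forced_bos_token_id=tokenizer.convert_tokens_to_ids(code),
        )
        translations[lang] = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return translations
```

With `tokenizer` and `model` loaded as in the usage snippet, `translate_to_all("Thank you very much.", tokenizer, model)` would return a dictionary keyed by two-letter language code.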

	## Results

	#### BLEU scores on MuST-C v1.0 tst-COMMON

	| Model | De | Es | Fr | It | Nl | Pt | Ro | Ru | Average |
	|:-------------------------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:-------:|
	| [nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) (original) | 32.7 | 36.9 | 45.2 | 32.2 | 36.0 | 37.4 | 30.3 | 21.0 | 34.0 |
	| [nllb-200-distilled-600M_mustc_en-to-8](https://huggingface.co/johntsi/nllb-200-distilled-600M_mustc_en-to-8) | 34.4 | 38.8 | 44.6 | 34.7 | 39.0 | 41.6 | 32.1 | 22.4 | 35.9 |
	| [nllb-200-distilled-1.3B](https://huggingface.co/facebook/nllb-200-distilled-1.3B) (original) | 34.6 | 38.6 | 46.8 | 33.7 | 38.2 | 39.6 | 31.8 | 23.2 | 35.8 |
	| [nllb-200-distilled-1.3B_mustc_en-to-8](https://huggingface.co/johntsi/nllb-200-distilled-1.3B_mustc_en-to-8) | 35.3 | 39.9 | 45.8 | 36.0 | 40.6 | 43.1 | 32.6 | 23.9 | 37.2 |
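To see the effect of fine-tuning per language, the two 600M rows of the table can be compared directly. The sketch below only restates the table values and subtracts them; the variable names are illustrative. It shows that the gain is not uniform: French drops slightly while the other seven languages improve.

```python
# BLEU scores copied from the two 600M rows of the table above.
LANGS = ["De", "Es", "Fr", "It", "Nl", "Pt", "Ro", "Ru"]
original_600m = [32.7, 36.9, 45.2, 32.2, 36.0, 37.4, 30.3, 21.0]
finetuned_600m = [34.4, 38.8, 44.6, 34.7, 39.0, 41.6, 32.1, 22.4]

# Per-language change in BLEU, rounded to the table's one-decimal precision.
deltas = {
    lang: round(ft - base, 1)
    for lang, base, ft in zip(LANGS, original_600m, finetuned_600m)
}
mean_delta = sum(deltas.values()) / len(deltas)

for lang, delta in deltas.items():
    print(f"{lang}: {delta:+.1f}")
print(f"Mean: {mean_delta:+.2f}")
```

The mean of the per-language differences can differ slightly from the difference between the two reported averages, because the averages in the table are themselves rounded.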

	## Citation

	If you find these models useful for your research, please cite our paper :)

	```
	@inproceedings{tsiamas-etal-2024-pushing,
	    title = {{Pushing the Limits of Zero-shot End-to-End Speech Translation}},
	    author = "Tsiamas, Ioannis and
	      G{\'a}llego, Gerard and
	      Fonollosa, Jos{\'e} and
	      Costa-juss{\`a}, Marta",
	    editor = "Ku, Lun-Wei and
	      Martins, Andre and
	      Srikumar, Vivek",
	    booktitle = "Findings of the Association for Computational Linguistics ACL 2024",
	    month = aug,
	    year = "2024",
	    address = "Bangkok, Thailand and virtual meeting",
	    publisher = "Association for Computational Linguistics",
	    url = "https://aclanthology.org/2024.findings-acl.847",
	    pages = "14245--14267",
	}
	```