cis-lmu/glot500-base

kargaranamir committed on Aug 28, 2023

Commit 634773e • 1 Parent(s): 2f01166

Update README.md

Files changed (1)

README.md +17 -27

@@ -24,41 +24,31 @@ You can use this model directly with a pipeline for masked language modeling:
 Here is how to use this model to get the features of a given text in PyTorch:
 ```python
-from transformers import AutoTokenizer, AutoModelForMaskedLM
-tokenizer = AutoTokenizer.from_pretrained('cis-lmu/glot500-base')
-model = AutoModelForMaskedLM.from_pretrained("cis-lmu/glot500-base")
-# prepare input
-text = "Replace me by any text you'd like."
-encoded_input = tokenizer(text, return_tensors='pt')
-# forward pass
-output = model(**encoded_input)
 ```
 ### BibTeX entry and citation info
 ```bibtex
 @inproceedings{imanigooghari-etal-2023-glot500,
-    title = "Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages",
-    author = {ImaniGooghari, Ayyoob  and
-      Lin, Peiqin  and
-      Kargaran, Amir Hossein  and
-      Severini, Silvia  and
-      Jalili Sabet, Masoud  and
-      Kassner, Nora  and
-      Ma, Chunlan  and
-      Schmid, Helmut  and
-      Martins, Andr{\'e}  and
-      Yvon, Fran{\c{c}}ois  and
-      Sch{\"u}tze, Hinrich},
-    booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
-    month = jul,
-    year = "2023",
-    address = "Toronto, Canada",
-    publisher = "Association for Computational Linguistics",
-    url = "https://aclanthology.org/2023.acl-long.61",
-    pages = "1082--1117",
 }
 ```

 Here is how to use this model to get the features of a given text in PyTorch:
 ```python
+>>> from transformers import AutoTokenizer, AutoModelForMaskedLM
+>>> tokenizer = AutoTokenizer.from_pretrained('cis-lmu/glot500-base')
+>>> model = AutoModelForMaskedLM.from_pretrained("cis-lmu/glot500-base")
+>>> # prepare input
+>>> text = "Replace me by any text you'd like."
+>>> encoded_input = tokenizer(text, return_tensors='pt')
+>>> # forward pass
+>>> output = model(**encoded_input)
 ```
 ### BibTeX entry and citation info
 ```bibtex
 @inproceedings{imanigooghari-etal-2023-glot500,
+	title        = {Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages},
+	author       = {ImaniGooghari, Ayyoob  and Lin, Peiqin  and Kargaran, Amir Hossein  and Severini, Silvia  and Jalili Sabet, Masoud  and Kassner, Nora  and Ma, Chunlan  and Schmid, Helmut  and Martins, Andr{\'e}  and Yvon, Fran{\c{c}}ois  and Sch{\"u}tze, Hinrich},
+	year         = 2023,
+	month        = jul,
+	booktitle    = {Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
+	publisher    = {Association for Computational Linguistics},
+	address      = {Toronto, Canada},
+	pages        = {1082--1117},
+	url          = {https://aclanthology.org/2023.acl-long.61}
 }
 ```
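
The README snippet in this diff loads `AutoModelForMaskedLM`, whose default output holds vocabulary logits rather than hidden-state features. Below is a minimal sketch of one way to get a fixed-size feature vector, assuming mean pooling over the last hidden layer is acceptable for the use case. The helper names `mean_pool` and `embed` are hypothetical and do not appear in the README; `output_hidden_states=True` is a standard `transformers` forward argument.

```python
import torch


def mean_pool(last_hidden: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors over the sequence, ignoring padded positions.

    last_hidden: (batch, seq_len, hidden_size)
    attention_mask: (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask.unsqueeze(-1).to(last_hidden.dtype)
    summed = (last_hidden * mask).sum(dim=1)
    # Guard against division by zero for an all-padding row.
    counts = mask.sum(dim=1).clamp(min=1.0)
    return summed / counts


def embed(text: str, model_name: str = "cis-lmu/glot500-base") -> torch.Tensor:
    """Hypothetical helper: return one mean-pooled vector per input text."""
    # Imported here so that mean_pool can be used without transformers installed.
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)
    model.eval()

    encoded_input = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Ask the model to also return the hidden states of every layer.
        output = model(**encoded_input, output_hidden_states=True)

    # hidden_states is a tuple; the last entry is the final layer's output.
    last_hidden = output.hidden_states[-1]
    return mean_pool(last_hidden, encoded_input["attention_mask"])
```

Calling `embed("Replace me by any text you'd like.")` downloads the checkpoint on first use and returns a tensor with one row per input text. Mean pooling is only one choice; taking the first token's vector is a common alternative.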