KomeijiForce committed on
Commit 1eba85a · 1 Parent(s): c901bea

Update README.md

  1. README.md +52 -1
README.md CHANGED
@@ -6,4 +6,55 @@ language:
metrics:
- bertscore
pipeline_tag: text2text-generation
---

# EmojiLM

This is a [BART](https://huggingface.co/facebook/bart-large) model pre-trained on the [Text2Emoji](https://huggingface.co/datasets/KomeijiForce/Text2Emoji) dataset to translate sentences into a series of emojis.

For instance, "I love pizza" will be translated into "🍕😍".

An example implementation for translation:

```python
from transformers import BartTokenizer, BartForConditionalGeneration

def translate(sentence, **argv):
    # Tokenize the sentence and generate the emoji token IDs.
    inputs = tokenizer(sentence, return_tensors="pt")
    generated_ids = generator.generate(inputs["input_ids"], **argv)
    # Strip spaces from the decoded text so the emojis form one contiguous string.
    decoded = tokenizer.decode(generated_ids[0], skip_special_tokens=True).replace(" ", "")
    return decoded

path = "KomeijiForce/bart-large-emojilm"
tokenizer = BartTokenizer.from_pretrained(path)
generator = BartForConditionalGeneration.from_pretrained(path)

sentence = "I love the weather in Alaska!"
# do_sample=True makes the output vary between runs.
decoded = translate(sentence, num_beams=4, do_sample=True, max_length=100)
print(decoded)
```

You will probably get some output like "❄️🏔️😍".
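
The `translate` helper above handles one sentence per call. As a rough sketch that is not part of the original README, a batched variant could pad the inputs to a common length and decode all outputs together. The names `load_emojilm` and `translate_batch` are hypothetical, introduced only for illustration, and the generation arguments in the usage comment are simply carried over from the example above rather than being recommended settings.

```python
def load_emojilm(path="KomeijiForce/bart-large-emojilm"):
    """Load the tokenizer and model (hypothetical helper, not from the README)."""
    # Imported lazily so translate_batch can be exercised without the library loaded.
    from transformers import BartTokenizer, BartForConditionalGeneration

    tokenizer = BartTokenizer.from_pretrained(path)
    generator = BartForConditionalGeneration.from_pretrained(path)
    return tokenizer, generator


def translate_batch(sentences, tokenizer, generator, **generate_kwargs):
    """Translate a list of sentences into emoji strings with one generate() call."""
    # padding=True pads every sentence to the longest one in the batch;
    # the attention mask marks which positions are real tokens.
    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    generated_ids = generator.generate(
        inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        **generate_kwargs,
    )
    decoded = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
    # Same post-processing as the single-sentence example: remove spaces.
    return [text.replace(" ", "") for text in decoded]


# Example usage (downloads the checkpoint on first call):
# tokenizer, generator = load_emojilm()
# print(translate_batch(["I love pizza", "I love the weather in Alaska!"],
#                       tokenizer, generator, num_beams=4, max_length=100))
```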

If you find this model and dataset useful, please consider citing our paper:

```
@article{DBLP:journals/corr/abs-2311-01751,
  author       = {Letian Peng and
                  Zilong Wang and
                  Hang Liu and
                  Zihan Wang and
                  Jingbo Shang},
  title        = {EmojiLM: Modeling the New Emoji Language},
  journal      = {CoRR},
  volume       = {abs/2311.01751},
  year         = {2023},
  url          = {https://doi.org/10.48550/arXiv.2311.01751},
  doi          = {10.48550/ARXIV.2311.01751},
  eprinttype   = {arXiv},
  eprint       = {2311.01751},
  timestamp    = {Tue, 07 Nov 2023 18:17:14 +0100},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2311-01751.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
```