---
license: mit
datasets:
- cardiffnlp/super_tweeteval
language:
- en
pipeline_tag: sentence-similarity
---
# cardiffnlp/twitter-roberta-large-latest-tweet-similarity

This is a RoBERTa-large model trained on 154M tweets up to the end of December 2022 and fine-tuned for tweet similarity (regression on two texts) on the _TweetSIM_ dataset of [SuperTweetEval](https://huggingface.co/datasets/cardiffnlp/super_tweeteval).
The original Twitter-based RoBERTa model can be found [here](https://huggingface.co/cardiffnlp/twitter-roberta-large-2022-154m).

## Example
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "cardiffnlp/twitter-roberta-large-latest-tweet-similarity"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

text_1 = 'Looooooool what is this story #TalksWithAsh'
text_2 = 'For someone who keeps saying long story short, the story is quite long iyah #TalksWithAsh'

# Join the two tweets with the RoBERTa separator token.
text_input = f"{text_1} </s> {text_2}"

model(**tokenizer(text_input, return_tensors="pt")).logits
# >> tensor([[2.9565]])
```
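
As an alternative to joining the texts manually, the tokenizer can also encode the two tweets as a text pair and insert the separator tokens itself. The sketch below is illustrative rather than part of the model card: the `score_pair` helper is a hypothetical name, and the `torch.no_grad()` wrapping is simply standard inference practice. Note that pair encoding uses RoBERTa's double-separator convention, so the resulting score may differ slightly from the manual `</s>` join above.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "cardiffnlp/twitter-roberta-large-latest-tweet-similarity"
model = AutoModelForSequenceClassification.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

def score_pair(text_1: str, text_2: str) -> float:
    """Return the model's similarity score for a pair of tweets (hypothetical helper)."""
    # Passing two texts makes the tokenizer add the separator tokens itself.
    inputs = tokenizer(text_1, text_2, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape (1, 1): a single regression output
    return logits.item()

print(score_pair(
    'Looooooool what is this story #TalksWithAsh',
    'For someone who keeps saying long story short, the story is quite long iyah #TalksWithAsh',
))
```

Both input formats produce a single regression logit, so `.item()` extracts the similarity score as a plain float.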

## Citation Information
Please cite the [reference paper](https://arxiv.org/abs/2310.14757) if you use this model.

```bibtex
@inproceedings{antypas2023supertweeteval,
    title={SuperTweetEval: A Challenging, Unified and Heterogeneous Benchmark for Social Media NLP Research},
    author={Dimosthenis Antypas and Asahi Ushio and Francesco Barbieri and Leonardo Neves and Kiamehr Rezaee and Luis Espinosa-Anke and Jiaxin Pei and Jose Camacho-Collados},
    booktitle={Findings of the Association for Computational Linguistics: EMNLP 2023},
    year={2023}
}
```