cardiffnlp/twitter-roberta-large-similarity-latest


antypasd committed on Mar 7, 2024

Commit d9af032 · verified · 1 Parent(s): e2b757e

Upload RobertaForSequenceClassification

Files changed (3)

README.md +4 -12
config.json +3 -3
model.safetensors +2 -2

README.md CHANGED

@@ -4,19 +4,12 @@ language:
 license: mit
 datasets:
 - cardiffnlp/super_tweeteval
-pipeline_tag: text-classification
-inference:
-  parameters:
-    function_to_apply: none
-widget:
-- text: >-
-    Looooooool what is this story #TalksWithAsh </s> For someone who keeps
-    saying long story short, the story is quite long iyah #TalksWithAsh
 ---
 # cardiffnlp/twitter-roberta-large-similarity-latest
 This is a RoBERTa-large model trained on 154M tweets until the end of December 2022 and finetuned for tweet similarity (regression on two texts) on the _TweetSIM_ dataset of [SuperTweetEval](https://huggingface.co/datasets/cardiffnlp/super_tweeteval).
-The original Twitter-larged RoBERTa model can be found [here](https://huggingface.co/cardiffnlp/twitter-roberta-large-2022-154m).
 ## Example
 ```python
@@ -32,9 +25,8 @@ text_2 = 'For someone who keeps saying long story short, the story is quite long
 text_input = f"{text_1} </s> {text_2}"
-pipe = pipeline('text-classification', model=model, tokenizer=tokenizer, function_to_apply="none")
-pipe(text_input)
->> [{'label': 'LABEL_0', 'score': 2.956475019454956}]
 ```

 license: mit
 datasets:
 - cardiffnlp/super_tweeteval
+pipeline_tag: sentence-similarity
 ---
 # cardiffnlp/twitter-roberta-large-similarity-latest
 This is a RoBERTa-large model trained on 154M tweets until the end of December 2022 and finetuned for tweet similarity (regression on two texts) on the _TweetSIM_ dataset of [SuperTweetEval](https://huggingface.co/datasets/cardiffnlp/super_tweeteval).
+The original Twitter-based RoBERTa model can be found [here](https://huggingface.co/cardiffnlp/twitter-roberta-large-2022-154m).
 ## Example
 ```python
 text_input = f"{text_1} </s> {text_2}"
+model(**tokenizer(text_input, return_tensors="pt")).logits
+>>tensor([[2.9565]])
 ```
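The updated README example drops the `pipeline` call and reads `.logits` from the model directly, which returns the raw regression output for the tweet pair. A minimal sketch of the full flow follows. The loading lines fall outside the hunks shown, so loading through `AutoTokenizer` and `AutoModelForSequenceClassification` is an assumption here, and `format_pair`, `score_pair`, and `main` are hypothetical helper names rather than anything defined in the repository.

```python
MODEL_ID = "cardiffnlp/twitter-roberta-large-similarity-latest"


def format_pair(text_1: str, text_2: str, sep: str = "</s>") -> str:
    """Join two tweets with the separator used in the README example."""
    return f"{text_1} {sep} {text_2}"


def score_pair(model, tokenizer, text_1: str, text_2: str) -> float:
    """Return the raw regression output for one tweet pair."""
    import torch  # imported lazily so format_pair works without torch installed

    inputs = tokenizer(format_pair(text_1, text_2), return_tensors="pt")
    with torch.no_grad():  # inference only, no gradients needed
        logits = model(**inputs).logits  # shape (1, 1) for a single-output head
    return logits.squeeze().item()


def main() -> None:
    # Assumption: the checkpoint loads through the Auto classes. This downloads
    # the weights (about 1.4 GB according to the LFS pointer in this commit).
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
    model.eval()

    text_1 = "Looooooool what is this story #TalksWithAsh"
    text_2 = ("For someone who keeps saying long story short, "
              "the story is quite long iyah #TalksWithAsh")
    print(score_pair(model, tokenizer, text_1, text_2))
```

Calling `main()` runs the example end to end. The download is kept out of module import so that the small formatting helper can be reused or tested without fetching the checkpoint.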

config.json CHANGED

@@ -1,7 +1,7 @@
 {
-  "_name_or_path": "../../best_models/troberta-large-tweet-similarity/best_model/",
   "architectures": [
-    "RobertaModel"
   ],
   "attention_probs_dropout_prob": 0.1,
   "bos_token_id": 0,
@@ -27,7 +27,7 @@
   "position_embedding_type": "absolute",
   "problem_type": "regression",
   "torch_dtype": "float32",
-  "transformers_version": "4.35.0",
   "type_vocab_size": 1,
   "use_cache": true,
   "vocab_size": 50265

 {
+  "_name_or_path": "best_models/troberta-large-tweet-similarity/best_model",
   "architectures": [
+    "RobertaForSequenceClassification"
   ],
   "attention_probs_dropout_prob": 0.1,
   "bos_token_id": 0,
   "position_embedding_type": "absolute",
   "problem_type": "regression",
   "torch_dtype": "float32",
+  "transformers_version": "4.38.2",
   "type_vocab_size": 1,
   "use_cache": true,
   "vocab_size": 50265
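After this change the `architectures` entry names a sequence-classification class, alongside the existing `"problem_type": "regression"`. The sketch below reads those two fields using only the standard library. The `describe_head` helper is hypothetical, and the inline JSON is an excerpt assembled from fields visible in the diff, not the full file.

```python
import json

# Excerpt assembled from fields visible in the diff above; the real
# config.json contains more keys than are shown here.
CONFIG_EXCERPT = """
{
  "architectures": ["RobertaForSequenceClassification"],
  "problem_type": "regression",
  "transformers_version": "4.38.2",
  "vocab_size": 50265
}
"""


def describe_head(config: dict) -> tuple[str, str]:
    """Return (first architecture name, problem type) from a config dict."""
    architectures = config.get("architectures") or ["<unknown>"]
    return architectures[0], config.get("problem_type", "<unset>")


config = json.loads(CONFIG_EXCERPT)
print(describe_head(config))
```

For a local checkout, the same helper could be applied to `json.load(open("config.json"))`, assuming the file sits in the working directory.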

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4009fdcf5cf1c78b2dac7c71418bd73214832454c85addd7a5e00f8d7549cccb
-size 1421483904

 version https://git-lfs.github.com/spec/v1
+oid sha256:639af2e2b0f8cf113f102236523936b14bce097c00dec7ed92aba0a39a4b2174
+size 1421491316
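
The `model.safetensors` entry in the repository is a Git LFS pointer: three `key value` lines giving the spec version, a SHA-256 object ID, and the size in bytes. As a rough sketch, a downloaded file could be checked against the new pointer as below. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, and the check assumes the `oid` uses `sha256`, as it does here.

```python
import hashlib
import os

# New pointer contents, copied from the added lines of this diff.
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:639af2e2b0f8cf113f102236523936b14bce097c00dec7ed92aba0a39a4b2174
size 1421491316
"""


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.splitlines() if line.strip())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Check a local file's size and SHA-256 digest against a parsed pointer."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap size check before hashing a large file
    hasher = hashlib.sha256()  # assumes the pointer's algorithm is sha256
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

With the weights downloaded, `matches_pointer("model.safetensors", pointer)` would report whether the local copy corresponds to this commit; the path is a placeholder for wherever the file was saved.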