agentlans committed on
Commit
725644e
1 Parent(s): 4b6f4cb

Upload README.md

  1. README.md +83 -3
README.md CHANGED
@@ -1,3 +1,83 @@
- ---
- license: apache-2.0
- ---
+ # DeBERTa-v3 Twitter Sentiment Models
+
+ This page contains one of two DeBERTa-v3 models (xsmall and base) fine-tuned for Twitter sentiment regression.
+
+ ## Model Details
+
+ - **Model Architecture**: DeBERTa-v3
+ - **Variants**:
+   - xsmall (22M parameters)
+   - base (86M parameters)
+ - **Task**: Sentiment regression
+ - **Language**: English
+ - **License**: [Model license]
+
+ ## Intended Use
+
+ These models are designed for fine-grained sentiment analysis of English tweets. They output a **continuous sentiment score** rather than discrete categories:
+
+ - A negative score indicates negative sentiment.
+ - A score of 0 indicates neutral sentiment.
+ - A positive score indicates positive sentiment.
+ - The absolute value of the score indicates how strong the sentiment is.
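The sign-and-magnitude reading of the score can be sketched as a small helper. This is only an illustration: `describe_sentiment` and its `neutral_band` tolerance are hypothetical names that are not part of the model or its repository, and the band width of 0.05 is an arbitrary assumption. The README itself only states that a score of 0 is neutral.

```python
def describe_sentiment(score, neutral_band=0.05):
    """Turn a continuous sentiment score into (polarity, strength).

    neutral_band is a hypothetical tolerance around 0; scores whose
    absolute value falls inside it are reported as neutral.
    """
    strength = abs(score)  # magnitude = how strong the sentiment is
    if strength <= neutral_band:
        polarity = "neutral"
    elif score > 0:
        polarity = "positive"
    else:
        polarity = "negative"
    return polarity, strength


# Illustrative inputs only, not real model outputs.
for value in (-0.8, 0.0, 0.3):
    print(value, describe_sentiment(value))
```

Setting `neutral_band=0` reproduces the strict reading in which only an exact 0 counts as neutral.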
+
+ ## Training Data
+
+ The models were fine-tuned on a dataset of English tweets collected between September 2009 and January 2010. The sentiment scores were derived from a meta-analysis of 10 different sentiment classifiers using principal component analysis. The dataset is available at [agentlans/twitter-sentiment-meta-analysis](https://huggingface.co/datasets/agentlans/twitter-sentiment-meta-analysis).
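One common way to combine several classifiers with principal component analysis is to take the first principal component of their standardized outputs. The sketch below illustrates that general idea on synthetic data. It is not the authors' pipeline: `first_principal_component`, the noise level, and the sample size are all assumptions made up for the example, and it uses only the standard library so it runs without extra dependencies.

```python
import random


def first_principal_component(rows, iterations=200):
    """Return (weights, scores) for the first principal component.

    rows holds one list per tweet with one value per classifier.
    """
    n, d = len(rows), len(rows[0])
    # Standardize each classifier's column to zero mean and unit variance.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [
        (sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) ** 0.5
        for j in range(d)
    ]
    z = [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]
    # Correlation matrix between classifiers.
    corr = [
        [sum(row[a] * row[b] for row in z) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    # Power iteration: repeated multiplication converges to the eigenvector
    # with the largest eigenvalue, which is the first principal component.
    v = [1.0 / d ** 0.5] * d
    for _ in range(iterations):
        w = [sum(corr[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Project every standardized row onto the component.
    scores = [sum(row[j] * v[j] for j in range(d)) for row in z]
    return v, scores


# Synthetic stand-in data: 200 "tweets" scored by 10 noisy "classifiers"
# that all observe the same hidden sentiment value.
random.seed(0)
hidden = [random.uniform(-1, 1) for _ in range(200)]
outputs = [[h + random.gauss(0, 0.3) for _ in range(10)] for h in hidden]
weights, combined = first_principal_component(outputs)
```

In general the sign of a principal component is arbitrary, so a real pipeline may need to flip the result so that positive values correspond to positive sentiment.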
+
+ ## How to use
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForSequenceClassification
+ import torch
+
+ model_name = "agentlans/deberta-v3-xsmall-tweet-sentiment"
+
+ # Load the tokenizer and model, then move the model to the GPU if one is available (otherwise the CPU)
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+ model = AutoModelForSequenceClassification.from_pretrained(model_name)
+ device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
+ model = model.to(device)
+
+ def sentiment(text):
+     """Processes the text using the model and returns its logits.
+     In this case, they are interpreted as the sentiment scores, one per input text."""
+     inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True).to(device)
+     with torch.no_grad():
+         # squeeze(-1) drops only the trailing label dimension, so a batch of one still yields a list
+         logits = model(**inputs).logits.squeeze(-1).cpu()
+     return logits.tolist()
+
+ # Example usage
+ text = [x.strip() for x in """
+ I absolutely despise this product and regret ever purchasing it.
+ The service at that restaurant was terrible and ruined our entire evening.
+ I'm feeling a bit under the weather today, but it's not too bad.
+ The weather is quite average today, neither good nor bad.
+ The movie was okay, I didn't love it but I didn't hate it either.
+ I'm looking forward to the weekend, it should be nice to relax.
+ This new coffee shop has a really pleasant atmosphere and friendly staff.
+ I'm thrilled with my new job and the opportunities it presents!
+ The concert last night was absolutely incredible, easily the best I've ever seen.
+ I'm overjoyed and grateful for all the love and support from my friends and family.
+ """.strip().split("\n")]
+
+ for x, s in zip(text, sentiment(text)):
+     print(f"Text: {x}\nSentiment: {s}\n")
+ ```
+
+ ## Performance
+
+ Root mean squared error (RMSE) on the evaluation set:
+
+ - xsmall: 0.2560
+ - base: 0.1938
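RMSE is the square root of the mean squared difference between predicted and reference scores, so lower is better and the value is in the same units as the sentiment score. A minimal sketch of the computation follows. `rmse` is an illustrative helper, not code from the model repository, and the numbers in the usage line are made up rather than taken from the evaluation set.

```python
import math


def rmse(predictions, targets):
    """Root mean squared error between two equal-length sequences."""
    if len(predictions) != len(targets) or not predictions:
        raise ValueError("predictions and targets must be non-empty and the same length")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values for illustration only.
example = rmse([0.5, -0.2, 0.1], [0.4, -0.1, 0.3])
print(f"Example RMSE: {example:.4f}")
```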
+
+ ## Limitations
+
+ - English language only
+ - Trained specifically on tweets, so the models may not generalize well to other text types
+ - No access to broader context beyond the individual tweet
+ - May struggle to detect sarcasm or nuanced sentiment
+
+ ## Ethical Considerations
+
+ - Potential biases in the training data related to the time period and Twitter user demographics
+ - Risk of misuse for large-scale sentiment monitoring without consent