hakonmh/sentiment-xdistil-uncased

	---
	license: mit
	language:
	- en
	pipeline_tag: text-classification
	tags:
	- finance
	- financial-sentiment-analysis
	- sentiment-analysis
	library_name: transformers
	widget:
	- text: unemployment hits record low as job opportunities soar
	- text: unemployment hits record high as job opportunities suffer
	---

	`Sentiment-xDistil` is a model based on
	[`xtremedistil-l12-h384-uncased`](https://huggingface.co/microsoft/xtremedistil-l12-h384-uncased)
	and fine-tuned to classify the sentiment of news headlines, using a dataset annotated by
	[ChatGPT 3.5](https://platform.openai.com/docs/models/gpt-3-5). It was built together with
	[`Topic-xDistil`](https://huggingface.co/hakonmh/topic-xdistil-uncased)
	as a tool for picking out financial news headlines and classifying their sentiment.
	The code used to train both models and build the dataset is available [here](https://github.com/hakonmh/distilnews).

	Notes: The model outputs one of three labels: `Negative`, `Neutral`, or `Positive`. It is intended for English text.

	## Performance Results

	The table below reports the performance of both models, each evaluated on its own test set:

	| Model | Test Set Size | Accuracy | F1 Score |
	| --- | --- | --- | --- |
	| `topic-xdistil-uncased` | 32 799 | 94.44 % | 92.59 % |
	| `sentiment-xdistil-uncased` | 17 527 | 94.59 % | 93.44 % |
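
The card does not say how the F1 score is averaged across classes. As a rough illustration of how the two metrics in the table could be computed, here is a minimal sketch that assumes a macro-averaged F1. The function names and the toy labels are hypothetical and are not taken from the training code.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy labels for illustration only; these are not the model's test data
y_true = ["Positive", "Negative", "Neutral", "Positive"]
y_pred = ["Positive", "Negative", "Positive", "Positive"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.2f} macro_f1={f1:.2f}")
```

If the reported F1 is weighted or micro-averaged instead, only the final averaging step would change.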

	## Data

	The training data consists of more than 300k news headlines and tweets, annotated by
	[ChatGPT 3.5](https://platform.openai.com/docs/models/gpt-3-5), which has been shown to
	[outperform crowd-workers for text annotation tasks](https://arxiv.org/pdf/2303.15056.pdf).

	The ChatGPT prompt defines the sentiment labels as follows:
	```python
	"""
	[...]
	Does the headline convey a Positive, Neutral, or Negative sentiment with \
	regard to the current state or potential future impact on the economy or \
	the asset described?
	- Positive sentiment headlines suggest growth, improvement, or \
	stability in economic conditions.
	- Neutral sentiment headlines do not clearly indicate a positive or \
	negative impact on the economy.
	- Negative sentiment headlines imply economic decline, uncertainty, \
	or unfavorable conditions.
	[...]
	"""
	```

	## Example Usage

	Here's a simple example:

	```python
	from transformers import AutoTokenizer, AutoModelForSequenceClassification

	model = AutoModelForSequenceClassification.from_pretrained("hakonmh/sentiment-xdistil-uncased")
	tokenizer = AutoTokenizer.from_pretrained("hakonmh/sentiment-xdistil-uncased")

	SENTENCE = "Global Growth Surges as New Technologies Drive Innovation and Productivity!"
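	# Tokenize the headline and run a forward pass to get one logit per label
	inputs = tokenizer(SENTENCE, return_tensors="pt")
	output = model(**inputs).logits
	# Map the highest-scoring class index to its label name
	predicted_label = model.config.id2label[output.argmax(-1).item()]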

	print(predicted_label)
	```

	```text
	Positive
	```
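
The example above returns only the top label. If a probability for each label is also wanted, the logits can be passed through a softmax. The sketch below does this in plain Python so that it runs without downloading the model. The helper name `logits_to_probs`, the example logits, and the label order in `example_id2label` are assumptions for illustration; the real mapping lives in `model.config.id2label`.

```python
import math


def logits_to_probs(logits, id2label):
    """Convert raw logits to a {label: probability} dict with a stable softmax."""
    max_logit = max(logits)  # subtract the max to avoid overflow in exp()
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    return {id2label[i]: e / total for i, e in enumerate(exps)}


# Illustrative values only; not real model output, and the label order is assumed
example_logits = [-2.0, 0.5, 3.0]
example_id2label = {0: "Negative", 1: "Neutral", 2: "Positive"}

probs = logits_to_probs(example_logits, example_id2label)
print(max(probs, key=probs.get))  # prints "Positive"
```

With the model loaded as in the example above, the equivalent call would be `logits_to_probs(output[0].tolist(), model.config.id2label)`.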

	Or, as a pipeline together with `Topic-xDistil`:

	```python
	from transformers import pipeline

	topic_classifier = pipeline(
	    "sentiment-analysis",
	    model="hakonmh/topic-xdistil-uncased",
	    tokenizer="hakonmh/topic-xdistil-uncased",
	)
	sentiment_classifier = pipeline(
	    "sentiment-analysis",
	    model="hakonmh/sentiment-xdistil-uncased",
	    tokenizer="hakonmh/sentiment-xdistil-uncased",
	)

	SENTENCE = "Global Growth Surges as New Technologies Drive Innovation and Productivity!"
	print(topic_classifier(SENTENCE))
	print(sentiment_classifier(SENTENCE))
	```

	```text
	[{'label': 'Economics', 'score': 0.9970171451568604}]
	[{'label': 'Positive', 'score': 0.9997037053108215}]
	```
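
Since the two models are meant to be used together, one possible way to chain them is to score sentiment only for headlines that the topic model assigns to a topic of interest. The sketch below is one such arrangement, not the author's own pipeline. The function name `financial_sentiment`, the `keep_topics` default (based on the single `Economics` label shown in the output above), and the stand-in classifiers with their `Other` label are all hypothetical.

```python
def financial_sentiment(headlines, topic_clf, sentiment_clf, keep_topics=("Economics",)):
    """Return sentiment results for headlines whose topic is in keep_topics.

    Both classifiers are expected to behave like the pipelines above:
    called with one string, they return [{"label": ..., "score": ...}].
    """
    results = []
    for text in headlines:
        topic = topic_clf(text)[0]
        if topic["label"] not in keep_topics:
            continue  # skip headlines outside the topics of interest
        sentiment = sentiment_clf(text)[0]
        results.append({
            "text": text,
            "topic": topic["label"],
            "sentiment": sentiment["label"],
            "score": sentiment["score"],
        })
    return results


# Hypothetical stand-ins that mimic the pipeline output format,
# so the sketch runs without downloading either model.
def fake_topic_clf(text):
    label = "Economics" if "growth" in text.lower() else "Other"  # "Other" is made up
    return [{"label": label, "score": 1.0}]


def fake_sentiment_clf(text):
    return [{"label": "Positive", "score": 1.0}]


rows = financial_sentiment(
    [
        "Global Growth Surges as New Technologies Drive Innovation and Productivity!",
        "Local team wins the cup final",
    ],
    fake_topic_clf,
    fake_sentiment_clf,
)
print(rows)
```

With the real models, `topic_classifier` and `sentiment_classifier` from the example above can be passed in place of the stand-ins, and `keep_topics` should be set to the topic labels that matter for the use case.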

	Tested on `transformers` 4.30.1 and `torch` 2.0.0.