---
library_name: peft
base_model: cardiffnlp/twitter-roberta-base-sentiment-latest
license: mit
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- NHL
- Hockey
- Sports
- roberta
- sentiment analysis
---
# Chelberta
This is a fine-tuned version of [cardiffnlp/twitter-roberta-base-sentiment-latest](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest), trained on
5168 sentiment-labelled Reddit comments collected from NHL team subreddits in December 2023. The model is intended for English text.
<b>Labels</b>:
0 -> Negative;
1 -> Neutral;
2 -> Positive
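
The mapping above can also be applied by hand when working with raw model outputs instead of the pipeline. The following is a minimal sketch, assuming the three logits are ordered by label id; `ID2LABEL`, `softmax`, and `decode` are illustrative names and not part of the model's API.

```python
import math

# Label ids as listed above (lowercase names are an illustrative choice).
ID2LABEL = {0: "negative", 1: "neutral", 2: "positive"}

def softmax(logits):
    """Convert raw scores to probabilities in a numerically stable way."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

# Hypothetical logits for a single comment, ordered [negative, neutral, positive].
label, prob = decode([-1.2, 0.3, 2.5])
print(label, round(prob, 3))
```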
This model has been used for the [NHL Positivity Index](https://uais.dev/projects/nhl-positivity-index/).
The full dataset can be found [here](https://www.kaggle.com/datasets/jacobwinch/nhl-reddit-comments).
## Example Pipeline
```python
import torch
from peft import PeftModel
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model_id = 'cardiffnlp/twitter-roberta-base-sentiment-latest'
peft_model_id = 'UAlbertaUAIS/Chelberta'

# Load the base model and tokenizer, then apply the Chelberta adapter weights
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=3)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = PeftModel.from_pretrained(model, peft_model_id)
# Merge the adapter into the base model so it can be used like a plain transformers model
model = model.merge_and_unload()

# Use the first GPU if one is available, otherwise fall back to CPU
device = 0 if torch.cuda.is_available() else -1
classifier = pipeline("sentiment-analysis", model=model, tokenizer=tokenizer, max_length=512, truncation=True, device=device)
classifier("Connor McDavid is good at hockey!")
```
```
[{'label': 'positive', 'score': 0.9888942837715149}]
```
- **Developed by:** The University of Alberta Undergraduate Artificial Intelligence Society Student Group
- **Model type:** RoBERTa-based
- **Language(s) (NLP):** English
- **License:** MIT
- **Fine-tuned from model:** [cardiffnlp/twitter-roberta-base-sentiment-latest](https://huggingface.co/cardiffnlp/twitter-roberta-base-sentiment-latest)
- **Repository:** https://github.com/UndergraduateArtificialIntelligenceClub/NHL-Positivity-Index
## Uses
Chelberta is intended for analyzing the sentiment of sports fans on social media.
## Evaluation
Chelberta was evaluated on a test set of 1000 human-labelled NHL Reddit comments from December 2023. The test set can be found [here](https://github.com/UndergraduateArtificialIntelligenceClub/NHL-Positivity-Index/blob/main/data/training_data/NHL-SentiComments-1K-TEST.json).
The model achieved an accuracy of 81.4%.
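
Accuracy here is the fraction of comments whose predicted label matches the human label, so 81.4% on 1000 comments corresponds to 814 correct predictions. The sketch below shows one way such a score could be computed. It is not the project's evaluation script: the `(text, gold_label)` pair layout is an assumption rather than the schema of the test file, and `keyword_classifier` is a stand-in so the example runs without downloading the model. Any callable that returns pipeline-style `{'label', 'score'}` dicts for a list of texts can be passed in instead.

```python
def accuracy(classifier, examples):
    """Fraction of examples whose predicted label equals the gold label.

    classifier: callable taking a list of texts and returning one
                {'label': str, 'score': float} dict per text.
    examples:   list of (text, gold_label) pairs (assumed layout).
    """
    if not examples:
        raise ValueError("examples must not be empty")
    texts = [text for text, _ in examples]
    predictions = classifier(texts)
    correct = sum(
        pred["label"] == gold
        for pred, (_, gold) in zip(predictions, examples)
    )
    return correct / len(examples)

def keyword_classifier(texts):
    """Toy stand-in for the real pipeline; for demonstration only."""
    results = []
    for text in texts:
        lowered = text.lower()
        if "great" in lowered:
            results.append({"label": "positive", "score": 1.0})
        elif "terrible" in lowered:
            results.append({"label": "negative", "score": 1.0})
        else:
            results.append({"label": "neutral", "score": 1.0})
    return results

# Made-up examples, not taken from the test set.
examples = [
    ("Great win tonight", "positive"),
    ("That was a terrible penalty kill", "negative"),
    ("Game starts at 7", "neutral"),
    ("What a great save", "neutral"),  # deliberately mislabelled by the toy classifier
]
print(accuracy(keyword_classifier, examples))  # 0.75
```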
### References
```
@inproceedings{camacho-collados-etal-2022-tweetnlp,
title = "{T}weet{NLP}: Cutting-Edge Natural Language Processing for Social Media",
author = "Camacho-collados, Jose and
Rezaee, Kiamehr and
Riahi, Talayeh and
Ushio, Asahi and
Loureiro, Daniel and
Antypas, Dimosthenis and
Boisson, Joanne and
Espinosa Anke, Luis and
Liu, Fangyu and
Mart{\'\i}nez C{\'a}mara, Eugenio and others",
booktitle = "Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
month = dec,
year = "2022",
address = "Abu Dhabi, UAE",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.emnlp-demos.5",
pages = "38--49"
}
```
```
@inproceedings{loureiro-etal-2022-timelms,
title = "{T}ime{LM}s: Diachronic Language Models from {T}witter",
author = "Loureiro, Daniel and
Barbieri, Francesco and
Neves, Leonardo and
Espinosa Anke, Luis and
Camacho-collados, Jose",
booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics: System Demonstrations",
month = may,
year = "2022",
address = "Dublin, Ireland",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.acl-demo.25",
doi = "10.18653/v1/2022.acl-demo.25",
pages = "251--260"
}
```
## Citation
**APA:**
Winch, J., Munjal, T., Lau, H., Bradley, A., Monaghan, A., & Subedi, Y. (2023). NHL Positivity Index. Undergraduate Artificial Intelligence Society. https://uais.dev/projects/nhl-positivity-index/
### Framework versions
- PEFT 0.9.0