
	---
	pipeline_tag: text-classification
	language: multilingual
	license: apache-2.0
	tags:
	- "sentiment-analysis"
	- "multilingual"
	widget:
	- text: "I am very happy."
	  example_title: "English"
	- text: "Heute bin ich schlecht drauf."
	  example_title: "Deutsch"
	- text: "Quel cauchemard!"
	  example_title: "Francais"
	- text: "ฉันรักฤดูใบไม้ผลิ"
	  example_title: "ภาษาไทย"
	---

	# Multi-lingual sentiment prediction trained on COVID19-related tweets

	Repository: [https://github.com/clampert/multilingual-sentiment-analysis/](https://github.com/clampert/multilingual-sentiment-analysis/)

	The model was trained on a large-scale dataset (18,437,530 examples) of
	multi-lingual tweets collected between March 2020
	and November 2021 using Twitter’s Streaming API with varying
	COVID19-related keywords. Labels were generated automatically based on
	the presence of positive and negative emoticons. For details
	on the dataset, see our IEEE BigData 2021 publication.
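
The emoticon-based labeling idea can be illustrated with a minimal sketch. The emoticon sets and the tie-breaking rule below are hypothetical and chosen only for illustration; the lists and rules actually used to build the dataset are described in the publication and may differ.

```python
from typing import Optional

# Hypothetical emoticon sets, for illustration only.
# They are NOT the lists used to build the training dataset.
POSITIVE_EMOTICONS = (":)", ":-)", ":D", "😀", "😊")
NEGATIVE_EMOTICONS = (":(", ":-(", "😢", "😠")


def emoticon_label(tweet: str) -> Optional[str]:
    """Return 'positive', 'negative', or None if no clear label can be derived."""
    has_positive = any(e in tweet for e in POSITIVE_EMOTICONS)
    has_negative = any(e in tweet for e in NEGATIVE_EMOTICONS)
    if has_positive and not has_negative:
        return "positive"
    if has_negative and not has_positive:
        return "negative"
    # No emoticon at all, or mixed signals: leave the tweet unlabeled (assumed rule).
    return None
```

In this sketch, tweets without emoticons or with conflicting emoticons are simply skipped, which is one plausible way to keep the automatically generated labels clean.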

	The base model is [sentence-transformers/stsb-xlm-r-multilingual](https://huggingface.co/sentence-transformers/stsb-xlm-r-multilingual).
	It was fine-tuned for sequence classification with `positive`
	and `negative` labels for two epochs (48 hours on 8xP100 GPUs).
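
A minimal usage sketch follows, assuming the model is available on the Hugging Face Hub under the id `clampert/multilingual-sentiment-covid19` and that the `transformers` library is installed. The helper names (`load_classifier`, `classify`, `demo`) are hypothetical, and the exact label strings returned depend on the model's configuration.

```python
from typing import Callable, List, Tuple

# Assumed Hub id, taken from the repository name of this model card.
MODEL_ID = "clampert/multilingual-sentiment-covid19"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so that classify() can be used without transformers installed.
    from transformers import pipeline
    return pipeline("text-classification", model=model_id)


def classify(texts: List[str], classifier: Callable) -> List[Tuple[str, str, float]]:
    """Run the classifier on a batch and return (text, label, score) triples."""
    results = classifier(texts)  # one {"label": ..., "score": ...} dict per input
    return [(t, r["label"], float(r["score"])) for t, r in zip(texts, results)]


def demo() -> None:
    """Classify the widget examples from this model card."""
    classifier = load_classifier()
    examples = ["I am very happy.", "Heute bin ich schlecht drauf."]
    for text, label, score in classify(examples, classifier):
        print(f"{label:>8}  {score:.3f}  {text}")
```

Calling `demo()` downloads the model and prints one prediction per example sentence.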

	## Citation

	If you use our model in your work, please cite:

	```
	@inproceedings{lampert2021overcoming,
	title={Overcoming Rare-Language Discrimination in Multi-Lingual Sentiment Analysis},
	author={Jasmin Lampert and Christoph H. Lampert},
	booktitle={IEEE International Conference on Big Data (BigData)},
	year={2021},
	note={Special Session: Machine Learning on Big Data},
	}
	```

	Enjoy!