Twitter-roBERTa-base fine-tuned using masked language modelling

This is a RoBERTa-base model fine-tuned (domain adaptation) on ~2M tweets from Jin 2009 (sentiment140). It is the first step of a two-step approach to fine-tuning for sentiment analysis (ULMFit). The model is suitable for English.

Main characteristics:

  • pretrained model and tokenizer: distilroberta-base
  • no cleaning/processing applied to the data
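Since the model was adapted with masked language modelling, one way to probe it is to mask a word in a tweet and inspect the suggested fillers. The following is a minimal sketch, not the author's code. It assumes the `transformers` library is installed, that the checkpoint published as `andrea-t94/roberta-fine-tuned-twitter` exposes a masked-LM head, and that the tokenizer uses the RoBERTa-style `<mask>` token. The helper names `mask_word` and `top_predictions` are hypothetical.

```python
# Repo name as shown on the model page; assumed to load with a masked-LM head.
MODEL_ID = "andrea-t94/roberta-fine-tuned-twitter"
MASK_TOKEN = "<mask>"  # RoBERTa-style mask token (assumed for this tokenizer)


def mask_word(text: str, word: str) -> str:
    """Replace the first occurrence of `word` in `text` with the mask token."""
    if word not in text:
        raise ValueError(f"{word!r} not found in text")
    return text.replace(word, MASK_TOKEN, 1)


def top_predictions(masked_text: str, k: int = 5):
    """Return (token, score) pairs for the k most likely fillers.

    The model is downloaded from the Hugging Face Hub on first call.
    """
    # Imported lazily because transformers is a heavy dependency.
    from transformers import pipeline

    fill = pipeline("fill-mask", model=MODEL_ID)
    return [(p["token_str"], p["score"]) for p in fill(masked_text, top_k=k)]
```

For example, `top_predictions(mask_word("I love this phone", "love"))` would return the five highest-scoring candidates for the masked position. No text cleaning is applied in the sketch, in line with the note above that the training data was left unprocessed.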

  • Reference paper: ULMFit
  • Reference dataset: Sentiment140
  • Git repo: TBD
  • Labels: 0 -> Negative; 1 -> Positive
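The label mapping above presumably applies to the sentiment classifier trained in the second step, not to the masked-LM checkpoint itself. As a rough illustration of how a two-class output could be decoded with that mapping, here is a small sketch in plain Python. The names `ID2LABEL`, `softmax`, and `decode` are hypothetical, and the input is assumed to be a pair of raw logits ordered as [Negative, Positive].

```python
import math

# Mapping taken from the card: 0 -> Negative; 1 -> Positive.
ID2LABEL = {0: "Negative", 1: "Positive"}


def softmax(logits):
    """Convert raw logits into probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (label, probability) for the highest-scoring class."""
    probs = softmax(logits)
    idx = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[idx], probs[idx]
```

Calling `decode([-1.0, 2.0])` picks class 1 and returns the label "Positive" together with its probability.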


Dataset used to train andrea-t94/roberta-fine-tuned-twitter