What is the Roberta-Earning-Call-Transcript-Classification Model?
Roberta-Earning-Call-Transcript-Classification is a multi-label classification model trained on annotated earnings call transcript data. It was obtained by fine-tuning the roberta-base model on that data. The model flags whether a sentence is Negative, Positive, Litigious, Constraining, or Uncertain, which can help when analyzing a company's profit warnings.
What is RoBERTa?
RoBERTa builds on BERT’s language masking strategy and modifies key hyperparameters in BERT, including removing BERT’s next-sentence pretraining objective, and training with much larger mini-batches and learning rates. RoBERTa was also trained on an order of magnitude more data than BERT, for a longer amount of time. This allows RoBERTa representations to generalize even better to downstream tasks compared to BERT.
What is an Earnings Call Transcript?
An earnings call is a teleconference, or webcast, in which a public company discusses the financial results of a reporting period. The name comes from earnings per share, the bottom line number in the income statement divided by the number of shares outstanding.
Example of an earnings call transcript: https://www.fool.com/earnings/call-transcripts/2022/04/29/apple-aapl-q2-2022-earnings-call-transcript
Ten years of earnings call transcripts were scraped for 10 companies, including Apple, Google, Microsoft, Nvidia, Amazon, Intel, and Cisco. The sentences were then annotated with the categories Negative, Positive, Litigious, Constraining, and Uncertainty.
The annotation used the Loughran-McDonald sentiment lexicon and FinancialPhraseBank [Malo, P., Sinha, A., Korhonen, P., Wallenius, J., & Takala, P. (2014). Good debt or bad debt: Detecting semantic orientations in economic texts. Journal of the Association for Information Science and Technology, 65(4), 782-796.].
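The card does not describe the exact annotation procedure, so the following is only a minimal sketch of how lexicon-based multi-label tagging could work. The word lists in `TOY_LEXICON` and the function `label_sentence` are hypothetical and for illustration only; they are not taken from the Loughran-McDonald lexicon or from the authors' pipeline.

```python
import re

# Hypothetical, tiny word lists for illustration only.
# The real Loughran-McDonald lexicon is far larger and is not reproduced here.
TOY_LEXICON = {
    "Negative": {"loss", "decline", "weak"},
    "Positive": {"growth", "strong", "improved"},
    "Litigious": {"lawsuit", "litigation", "plaintiff"},
    "Constraining": {"required", "restricted", "covenant"},
    "Uncertainty": {"may", "uncertain", "approximately"},
}


def label_sentence(sentence, lexicon=TOY_LEXICON):
    """Return every category whose word list overlaps the sentence's tokens.

    A sentence can receive zero, one, or several categories, which is what
    makes the task multi-label rather than multi-class.
    """
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return [category for category, words in lexicon.items() if tokens & words]


if __name__ == "__main__":
    print(label_sentence("Revenue growth was strong, but the lawsuit may cause a loss."))
```

A sentence with no lexicon hits gets an empty list; how such sentences were handled in the actual dataset is not stated in the card.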
Hyperparameters
Parameter | Value |
---|---|
Learning rate | 1e-5 |
Epochs | 12 |
Max Seq Length | 240 |
Batch size | 128 |
Results
Best micro F1: 82.8%
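For readers unfamiliar with the metric, micro F1 in a multi-label setting pools true positives, false positives, and false negatives over all labels before computing F1. The sketch below is a plain-Python illustration with made-up toy data; it is not the authors' evaluation script, and `micro_f1` is a hypothetical helper name.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for multi-label data.

    y_true and y_pred are lists of equal-length 0/1 rows, one row per
    sentence and one column per label. Counts are pooled over all labels.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t and p:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


if __name__ == "__main__":
    # Toy data: columns could stand for the five categories in any fixed order.
    y_true = [[1, 0, 0, 0, 1], [0, 1, 0, 0, 0], [0, 0, 1, 1, 0]]
    y_pred = [[1, 0, 0, 0, 0], [0, 1, 0, 0, 1], [0, 0, 1, 1, 0]]
    print(micro_f1(y_true, y_pred))
```

In the toy data there are 4 true positives, 1 false positive, and 1 false negative, so the pooled score is 2·4 / (2·4 + 1 + 1) = 0.8.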
Usage
from transformers import AutoModelForSequenceClassification, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("NLPScholars/Roberta-Earning-Call-Transcript-Classification")
model = AutoModelForSequenceClassification.from_pretrained("NLPScholars/Roberta-Earning-Call-Transcript-Classification")
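Once the tokenizer and model are loaded, a sentence could be scored roughly as follows. This is a sketch under stated assumptions: that per-label sigmoid probabilities with a 0.5 threshold are appropriate for this multi-label model, that the label names are available in `model.config.id2label`, and that truncating to 240 tokens matches the training setup listed under Hyperparameters. The helpers `select_labels` and `classify` are hypothetical names, not part of the released model.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def select_labels(logits, id2label, threshold=0.5):
    """Keep every label whose sigmoid probability reaches the threshold.

    Each label is scored independently, so several labels (or none) may
    be returned for one sentence. The 0.5 threshold is an assumption.
    """
    scored = [(id2label[i], sigmoid(z)) for i, z in enumerate(logits)]
    return [(name, prob) for name, prob in scored if prob >= threshold]


def classify(sentence, threshold=0.5):
    # Imported lazily so the helpers above work without torch installed.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    name = "NLPScholars/Roberta-Earning-Call-Transcript-Classification"
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name)
    model.eval()

    # max_length=240 mirrors the "Max Seq Length" hyperparameter.
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True, max_length=240)
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return select_labels(logits, model.config.id2label, threshold)


if __name__ == "__main__":
    print(classify("We may face litigation that could reduce our margins."))
```

If the model's configuration does not carry meaningful label names, `id2label` will contain generic entries such as `LABEL_0`, and the mapping to the five categories would need to be confirmed with the authors.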
Contributors
Sumit Ranjan,
Aanchal Varma,
Akshul Mittal