from classifier import classify
from PIL import Image
import streamlit as st

st.title("Twitter Sentiment Analysis using BERT model")

st.subheader("Motivation")
st.markdown("""
Cyberbullying is a serious problem in today's world. It is a form of bullying that takes place using electronic technology. This model acts as a tool for detecting abusive content in tweets. Social media platforms can use it to flag abusive tweets and take the necessary action.

Hugging Face provides an easy interface for testing models before use.
""")

st.subheader("Play with the model")

text = st.text_input(
    "Enter a tweet to classify it as either Normal or Abusive. (Press Enter to submit)",
    value="I love DCNM course",
    max_chars=512,
    key=None,
    type="default",
    help=None,
    autocomplete=None,
)
st.markdown(f"The tweet is classified as: **{classify(text)}**")

st.markdown("To see an abusive example, try: _Giving and taking dowry is crappy thing_")

st.subheader("About the model")
st.markdown("""
The model was trained on the ENCASEH2020 Twitter dataset from Founta, A.M. et al. (2018) [3]. The BERT Tiny model [1][2][5] was chosen for this project because, empirically, it gave better results with the fewest parameters. The model was trained for 10 epochs with a batch size of 32, using the AdamW optimizer with a learning rate of 1e-2 and cross-entropy loss.
""")

st.image("./images/train_val_accuracy.png", caption="Train and Validation Accuracy", use_column_width=True)
st.image("./images/train_test_scores.png", caption="Classification Report", use_column_width=True)
st.image("./images/confusion_matrix.png", caption="Confusion Matrix", use_column_width=True)

st.subheader("References")
st.markdown("1. [BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding](https://arxiv.org/abs/1810.04805)")
st.markdown("2. [TinyBERT: Distilling BERT for Natural Language Understanding](https://arxiv.org/abs/1909.10351)")
st.markdown("3. [Founta, A.M., Djouvas, C., Chatzakou, D., Leontiadis, I., Blackburn, J., Stringhini, G., Vakali, A., Sirivianos, M., & Kourtellis, N. (2018). Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior. In 11th International Conference on Web and Social Media, ICWSM 2018.](https://arxiv.org/abs/1802.00393)")
st.markdown("4. [Ajay S, Ram, Kowsik N D, Navaneeth D, Amarnath C N, Cyberbullying Detection using Bidirectional Encoder Representation from Transformers 2022](https://github.com/Cubemet/bert-models)")
st.markdown("5. [Base Model from nreimers](https://huggingface.co/nreimers/BERT-Tiny_L-2_H-128_A-2)")