import streamlit as st # Custom CSS for better styling st.markdown(""" """, unsafe_allow_html=True) # Introduction st.markdown('

Sentiment Analysis with Spark NLP

', unsafe_allow_html=True) st.markdown("""

Welcome to the Spark NLP Sentiment Analysis Demo App! Sentiment analysis is an automated process capable of understanding the feelings or opinions that underlie a text. This process is considered a text classification and is one of the most interesting subfields of NLP. Using Spark NLP, it is possible to analyze the sentiment in a text with high accuracy.

This app demonstrates how to use Spark NLP's SentimentDetector to perform sentiment analysis using a rule-based approach.

""", unsafe_allow_html=True) st.image('images/Sentiment-Analysis.jpg',caption="Difference between rule-based and machine learning based sentiment analysis applications", use_column_width='auto') # About Sentiment Analysis st.markdown('

About Sentiment Analysis

', unsafe_allow_html=True) st.markdown("""

Sentiment analysis studies the subjective information in an expression, such as opinions, appraisals, emotions, or attitudes towards a topic, person, or entity. Expressions can be classified as positive, negative, or neutral — in some cases, even more detailed.

Some popular sentiment analysis applications include social media monitoring, customer support management, and analyzing customer feedback.

""", unsafe_allow_html=True) # Using SentimentDetector in Spark NLP st.markdown('

Using SentimentDetector in Spark NLP

', unsafe_allow_html=True) st.markdown("""

The SentimentDetector annotator in Spark NLP uses a rule-based approach to analyze the sentiment in text data. This method involves using a set of predefined rules or patterns to classify text as positive, negative, or neutral.

Spark NLP also provides Machine Learning (ML) and Deep Learning (DL) solutions for sentiment analysis. If you are interested in those approaches, please check the ViveknSentiment and SentimentDL annotators of Spark NLP.

""", unsafe_allow_html=True) st.markdown('

Example Usage in Python

', unsafe_allow_html=True) st.markdown('

Here’s how you can implement sentiment analysis using the SentimentDetector annotator in Spark NLP:

', unsafe_allow_html=True) # Setup Instructions st.markdown('

Setup

', unsafe_allow_html=True) st.markdown('

To install Spark NLP in Python, use your favorite package manager (conda, pip, etc.). For example:

', unsafe_allow_html=True) st.code(""" pip install spark-nlp pip install pyspark """, language="bash") st.markdown("

Then, import Spark NLP and start a Spark session:

", unsafe_allow_html=True) st.code(""" import sparknlp # Start Spark Session spark = sparknlp.start() """, language='python') # load data st.markdown('

Start by loading the Dataset, Lemmas and the Sentiment Dictionary.

', unsafe_allow_html=True) st.code(""" !wget -N https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/resources/en/lemma-corpus-small/lemmas_small.txt -P /tmp !wget -N https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/resources/en/sentiment-corpus/default-sentiment-dict.txt -P /tmp """, language="bash") st.image('images/dataset.png', caption="First few lines of the lemmas and sentiment dictionary", use_column_width='auto') # Sentiment Analysis Example st.markdown('

Example Usage: Sentiment Analysis with SentimentDetector

', unsafe_allow_html=True) st.code(''' from sparknlp.base import DocumentAssembler, Pipeline, Finisher from sparknlp.annotator import ( SentenceDetector, Tokenizer, Lemmatizer, SentimentDetector ) import pyspark.sql.functions as F # Step 1: Transforms raw texts to document annotation document_assembler = ( DocumentAssembler() .setInputCol("text") .setOutputCol("document") ) # Step 2: Sentence Detection sentence_detector = SentenceDetector().setInputCols(["document"]).setOutputCol("sentence") # Step 3: Tokenization tokenizer = Tokenizer().setInputCols(["sentence"]).setOutputCol("token") # Step 4: Lemmatization lemmatizer = ( Lemmatizer() .setInputCols("token") .setOutputCol("lemma") .setDictionary("/tmp/lemmas_small.txt", key_delimiter="->", value_delimiter="\\t") ) # Step 5: Sentiment Detection sentiment_detector = ( SentimentDetector() .setInputCols(["lemma", "sentence"]) .setOutputCol("sentiment_score") .setDictionary("/tmp/default-sentiment-dict.txt", ",") ) # Step 6: Finisher finisher = ( Finisher() .setInputCols(["sentiment_score"]) .setOutputCols(["sentiment"]) ) # Define the pipeline pipeline = Pipeline( stages=[ document_assembler, sentence_detector, tokenizer, lemmatizer, sentiment_detector, finisher, ] ) # Create a spark Data Frame with an example sentence data = spark.createDataFrame( [ ["The restaurant staff is really nice"] ] ).toDF("text") # use the column name `text` defined in the pipeline as input # Fit-transform to get predictions result = pipeline.fit(data).transform(data).show(truncate=50) ''', language='python') st.text(""" +-----------------------------------+----------+ | text| sentiment| +-----------------------------------+----------+ |The restaurant staff is really nice|[positive]| +-----------------------------------+----------+ """) st.markdown("""

The code snippet demonstrates how to set up a pipeline in Spark NLP to perform sentiment analysis on text data using the SentimentDetector annotator. The resulting DataFrame contains the sentiment predictions.

""", unsafe_allow_html=True) # One-liner Alternative st.markdown('

One-liner Alternative

', unsafe_allow_html=True) st.markdown("""

In October 2022, John Snow Labs released the open-source johnsnowlabs library that contains all the company products, open-source and licensed, under one common library. This simplified the workflow, especially for users working with more than one of the libraries (e.g., Spark NLP + Healthcare NLP). This new library is a wrapper on all of John Snow Lab’s libraries and can be installed with pip:

pip install johnsnowlabs

""", unsafe_allow_html=True) st.markdown('

To run sentiment analysis with one line of code, we can simply:

', unsafe_allow_html=True) st.code(""" # Import the NLP module which contains Spark NLP and NLU libraries from johnsnowlabs import nlp sample_text = "The restaurant staff is really nice" # Returns a pandas DataFrame, we select the desired columns nlp.load('en.sentiment').predict(sample_text, output_level='sentence') """, language='python') st.image('images/johnsnowlabs-sentiment-output.png', use_column_width='auto') st.markdown("""

This approach demonstrates how to use the johnsnowlabs library to perform sentiment analysis with a single line of code. The resulting DataFrame contains the sentiment predictions.

""", unsafe_allow_html=True) # Conclusion st.markdown("""

Conclusion

In this app, we demonstrated how to use Spark NLP's SentimentDetector annotator to perform sentiment analysis on text data. These powerful tools enable users to efficiently process large datasets and identify sentiment, providing deeper insights for various applications. By integrating these annotators into your NLP pipelines, you can enhance text understanding, information extraction, and customer sentiment analysis.

""", unsafe_allow_html=True) # References and Additional Information st.markdown('

For additional information, please check the following references.

', unsafe_allow_html=True) st.markdown("""

Documentation : SentimentDetector
Python Docs : SentimentDetector
Scala Docs : SentimentDetector
Example Notebook : Sentiment Analysis

""", unsafe_allow_html=True)