Understanding-Intent-and-Actions-in-Commands / pages /Workflow & Model Overview.py

abdullahmubeen10

Upload 15 files

f996927 verified 11 months ago

	import streamlit as st

	# Custom CSS for better styling
	st.markdown("""
	<style>
	.main-title {
	font-size: 36px;
	color: #4A90E2;
	font-weight: bold;
	text-align: center;
	}
	.sub-title {
	font-size: 24px;
	color: #4A90E2;
	margin-top: 20px;
	}
	.section {
	background-color: #f9f9f9;
	padding: 15px;
	border-radius: 10px;
	margin-top: 20px;
	}
	.section h2 {
	font-size: 22px;
	color: #4A90E2;
	}
	.section p, .section ul {
	color: #666666;
	}
	.link {
	color: #4A90E2;
	text-decoration: none;
	}
	.benchmark-table {
	width: 100%;
	border-collapse: collapse;
	margin-top: 20px;
	}
	.benchmark-table th, .benchmark-table td {
	border: 1px solid #ddd;
	padding: 8px;
	text-align: left;
	}
	.benchmark-table th {
	background-color: #4A90E2;
	color: white;
	}
	.benchmark-table td {
	background-color: #f2f2f2;
	}
	</style>
	""", unsafe_allow_html=True)

	# Main Title
	st.markdown('<div class="main-title">Detect Actions in General Commands</div>', unsafe_allow_html=True)

	# Description
	st.markdown("""
	<div class="section">
	<p><strong>Detecting actions in general commands</strong> is an NLP task for understanding user commands related to music, restaurants, and movies. This app uses the <strong>nerdl_snips_100d</strong> model, which identifies and classifies entities and actions in user commands and returns them as labeled chunks that downstream automation can consume.</p>
	</div>
	""", unsafe_allow_html=True)

	# What is NER
	st.markdown('<div class="sub-title">What is Named Entity Recognition (NER)?</div>', unsafe_allow_html=True)
	st.markdown("""
	<div class="section">
	<p><strong>Named Entity Recognition (NER)</strong> is a process in Natural Language Processing (NLP) that locates and classifies named entities into predefined categories. In this context, NER helps in recognizing entities and actions related to music, restaurants, and movies from user commands, such as identifying a restaurant's name or a movie's title.</p>
	</div>
	""", unsafe_allow_html=True)

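As a rough illustration of what an NER tagger produces, the sketch below merges token-level tags into labeled chunks, which is conceptually the step that `NerConverter` performs later in this script. It assumes IOB-style tags (`B-` opens an entity, `I-` continues it, `O` marks tokens outside any entity). The helper name `iob_to_chunks` and the sample tokens and tags are hand-written for illustration and are not output from the model.

```python
def iob_to_chunks(tokens, tags):
    """Merge IOB-tagged tokens into (chunk_text, label) pairs."""
    chunks = []
    current_tokens, current_label = [], None

    def flush():
        # Emit the chunk collected so far, if any.
        if current_tokens:
            chunks.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new chunk.
            flush()
            current_tokens, current_label = [token], tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            # An I- tag extends the open chunk with the same label.
            current_tokens.append(token)
        else:
            # "O" (or a stray I- tag) closes any open chunk and is dropped.
            flush()
            current_tokens, current_label = [], None
    flush()
    return chunks


# Hand-written example tags (an assumption, not real model output).
tokens = ["find", "a", "brasserie", "near", "wilson", "av"]
tags = ["O", "O", "B-restaurant_type", "B-spatial_relation", "B-poi", "I-poi"]
chunks = iob_to_chunks(tokens, tags)
print(chunks)
```

Tokens tagged `O` never appear in the result, so only spans that belong to an entity are passed on.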
	# Model Importance and Applications
	st.markdown('<div class="sub-title">Model Importance and Applications</div>', unsafe_allow_html=True)
	st.markdown("""
	<div class="section">
	<p>The <strong>nerdl_snips_100d</strong> model extracts and classifies entities from user commands. It is particularly useful in the following domains:</p>
	<ul>
	<li><strong>Personal Assistants:</strong> This model can be used to enhance virtual assistants by accurately understanding and processing user commands related to music, restaurants, and movies. This enables more intuitive interactions and better service recommendations.</li>
	<li><strong>Customer Service:</strong> For businesses in the hospitality and entertainment industries, integrating this model into chatbots or customer service platforms allows for more efficient handling of customer inquiries and requests, improving overall user experience.</li>
	<li><strong>Recommendation Systems:</strong> By identifying key entities from user inputs, the model can help in generating personalized recommendations for users, whether it’s suggesting a new music track, finding a restaurant, or recommending a movie based on preferences.</li>
	<li><strong>Data Annotation:</strong> The model assists in annotating large datasets with labeled entities, which is essential for training other machine learning models or for analyzing trends and patterns in user commands.</li>
	</ul>
	<p>Why use the <strong>nerdl_snips_100d</strong> model?</p>
	<ul>
	<li><strong>High Accuracy:</strong> The model reaches an F1 score of 92.5% on the Snips dataset (see the Benchmark section), which supports reliable entity recognition.</li>
	<li><strong>Versatility:</strong> It can handle a diverse range of entities and actions, making it suitable for various applications beyond just one domain.</li>
	<li><strong>Ease of Integration:</strong> The model integrates smoothly with existing pipelines and can be easily adapted to different use cases.</li>
	<li><strong>Enhanced User Experience:</strong> By improving the understanding of user commands, the model enhances interaction quality and satisfaction.</li>
	</ul>
	</div>
	""", unsafe_allow_html=True)

	# Predicted Entities
	st.markdown('<div class="sub-title">Predicted Entities</div>', unsafe_allow_html=True)
	st.markdown("""
	<div class="section">
	<ul>
	<li><strong>playlist_owner:</strong> Person who owns a playlist.</li>
	<li><strong>served_dish:</strong> Dish served at a restaurant.</li>
	<li><strong>track:</strong> Music track.</li>
	<li><strong>poi:</strong> Point of interest.</li>
	<li><strong>cuisine:</strong> Type of cuisine.</li>
	<li><strong>spatial_relation:</strong> Spatial relationships (e.g., distant, near).</li>
	<li><strong>object_type:</strong> Type of object (e.g., book, movie).</li>
	<li><strong>facility:</strong> Type of facility.</li>
	<li><strong>album:</strong> Music album.</li>
	<li><strong>country:</strong> Country name.</li>
	<li><strong>geographic_poi:</strong> Geographic point of interest.</li>
	<li><strong>location_name:</strong> Name of a location.</li>
	<li><strong>object_part_of_series_type:</strong> Part of a series type.</li>
	<li><strong>object_select:</strong> Selected object.</li>
	<li><strong>artist:</strong> Music artist.</li>
	<li><strong>rating_value:</strong> Rating value.</li>
	<li><strong>best_rating:</strong> Maximum of the rating scale (e.g., 6 in a rating of 4 out of 6).</li>
	<li><strong>sort:</strong> Sorting preference.</li>
	<li><strong>party_size_description:</strong> Description of party size.</li>
	<li><strong>party_size_number:</strong> Number of people in a party.</li>
	<li><strong>restaurant_name:</strong> Name of the restaurant.</li>
	<li><strong>object_location_type:</strong> Type of location for an object.</li>
	<li><strong>playlist:</strong> Music playlist.</li>
	<li><strong>service:</strong> Type of service.</li>
	<li><strong>city:</strong> City name.</li>
	<li><strong>O:</strong> Outside tag, assigned to tokens that are not part of any entity.</li>
	<li><strong>genre:</strong> Genre of music or movie.</li>
	<li><strong>movie_name:</strong> Name of the movie.</li>
	<li><strong>current_location:</strong> Current location.</li>
	<li><strong>rating_unit:</strong> Unit of rating (e.g., stars).</li>
	<li><strong>restaurant_type:</strong> Type of restaurant.</li>
	<li><strong>condition_temperature:</strong> Temperature condition.</li>
	<li><strong>condition_description:</strong> Description of the condition.</li>
	<li><strong>entity_name:</strong> Name of the entity.</li>
	<li><strong>movie_type:</strong> Type of movie.</li>
	<li><strong>object_name:</strong> Name of the object.</li>
	<li><strong>state:</strong> State name.</li>
	<li><strong>year:</strong> Year.</li>
	<li><strong>music_item:</strong> Music item.</li>
	<li><strong>timeRange:</strong> Time range.</li>
	</ul>
	</div>
	""", unsafe_allow_html=True)

	# How to Use the Model
	st.markdown('<div class="sub-title">How to Use the Model</div>', unsafe_allow_html=True)
	st.markdown("""
	<div class="section">
	<p>To use this model, follow these steps in Python:</p>
	</div>
	""", unsafe_allow_html=True)
	st.code('''
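	import sparknlp
	from sparknlp.base import *
	from sparknlp.annotator import *
	from pyspark.ml import Pipeline
	from pyspark.sql.functions import col, expr

	# Start a Spark session with Spark NLP loaded (defines `spark`)
	spark = sparknlp.start()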

	# Define the components of the pipeline
	document_assembler = DocumentAssembler() \\
	.setInputCol("text") \\
	.setOutputCol("document")

	sentence_detector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "en") \\
	.setInputCols(["document"]) \\
	.setOutputCol("sentence")

	tokenizer = Tokenizer() \\
	.setInputCols(["sentence"]) \\
	.setOutputCol("token")

	embeddings = WordEmbeddingsModel.pretrained("glove_100d", "en") \\
	.setInputCols(["sentence", "token"]) \\
	.setOutputCol("embeddings")

	ner = NerDLModel.pretrained("nerdl_snips_100d") \\
	.setInputCols(["sentence", "token", "embeddings"]) \\
	.setOutputCol("ner")

	ner_converter = NerConverter() \\
	.setInputCols(["sentence", "token", "ner"]) \\
	.setOutputCol("ner_chunk")

	# Create the pipeline
	pipeline = Pipeline(stages=[
	document_assembler,
	sentence_detector,
	tokenizer,
	embeddings,
	ner,
	ner_converter
	])

	# Create some example data
	text = "book a spot for nona gray myrtle and alison at a top-rated brasserie that is distant from wilson av on nov the 4th 2030 that serves ouzeri"
	data = spark.createDataFrame([[text]]).toDF("text")

	# Apply the pipeline to the data
	model = pipeline.fit(data)
	result = model.transform(data)

	# Show each recognized chunk together with its entity label
	result.select(
	expr("explode(ner_chunk) as ner_chunk")
	).select(
	col("ner_chunk.result").alias("chunk"),
	col("ner_chunk.metadata.entity").alias("entity")
	).show(truncate=False)
	''', language='python')
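The last `select` in the snippet above yields one (chunk, entity) row per recognized span. As a minimal sketch of turning such rows into the structured representation the app description mentions, the hypothetical helper `group_chunks_by_entity` (not part of Spark NLP) groups them into a slot dictionary. It assumes the rows were already collected into plain Python tuples, for example with `DataFrame.collect()`. The sample rows are copied from the results table this script displays.

```python
from collections import defaultdict


def group_chunks_by_entity(rows):
    """Group (chunk, entity) pairs into {entity: [chunk, ...]}, keeping input order."""
    slots = defaultdict(list)
    for chunk, entity in rows:
        slots[entity].append(chunk)
    return dict(slots)


# Rows copied from the results table shown by this script.
rows = [
    ("nona gray myrtle and alison", "party_size_description"),
    ("top-rated", "sort"),
    ("brasserie", "restaurant_type"),
    ("distant", "spatial_relation"),
    ("wilson av", "poi"),
    ("nov the 4th 2030", "timeRange"),
    ("ouzeri", "cuisine"),
]

slots = group_chunks_by_entity(rows)
print(slots["restaurant_type"])
```

Keeping a list per entity is a deliberate choice here: a single command can mention the same entity type more than once, and a plain `{entity: chunk}` mapping would silently drop all but the last mention.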

	# Results
	st.text("""
	+---------------------------+----------------------+
	|chunk                      |entity                |
	+---------------------------+----------------------+
	|nona gray myrtle and alison|party_size_description|
	|top-rated                  |sort                  |
	|brasserie                  |restaurant_type       |
	|distant                    |spatial_relation      |
	|wilson av                  |poi                   |
	|nov the 4th 2030           |timeRange             |
	|ouzeri                     |cuisine               |
	+---------------------------+----------------------+
	""")

	# Model Information
	st.markdown('<div class="sub-title">Model Information</div>', unsafe_allow_html=True)
	st.markdown("""
	<div class="section">
	<table class="benchmark-table">
	<tr>
	<th>Model Name</th>
	<td>nerdl_snips_100d</td>
	</tr>
	<tr>
	<th>Type</th>
	<td>NER</td>
	</tr>
	<tr>
	<th>Compatibility</th>
	<td>Spark NLP 2.7.3+</td>
	</tr>
	<tr>
	<th>License</th>
	<td>Apache 2.0</td>
	</tr>
	<tr>
	<th>Source</th>
	<td><a href="https://nlp.johnsnowlabs.com/models" class="link">NLP John Snow Labs</a></td>
	</tr>
	<tr>
	<th>Description</th>
	<td>Pre-trained NER model for identifying and classifying named entities in text.</td>
	</tr>
	</table>
	</div>
	""", unsafe_allow_html=True)

	# Data Source
	st.markdown('<div class="sub-title">Data Source</div>', unsafe_allow_html=True)
	st.markdown("""
	<div class="section">
	<p>For more information about the dataset used to train this model, see the <a class="link" href="https://github.com/MiuLab/SlotGated-SLU" target="_blank">NLU Benchmark SNIPS dataset</a>.</p>
	</div>
	""", unsafe_allow_html=True)

	# Benchmark
	st.markdown('<div class="sub-title">Benchmark</div>', unsafe_allow_html=True)
	st.markdown("""
	<div class="section">
	<p>The <strong>nerdl_snips_100d</strong> model was evaluated on the datasets below to measure how well it extracts entities from general commands. The table reports F1 score, precision, and recall for each dataset:</p>
	<table class="benchmark-table">
	<tr>
	<th>Dataset</th>
	<th>F1 Score</th>
	<th>Precision</th>
	<th>Recall</th>
	</tr>
	<tr>
	<td>Snips Dataset</td>
	<td>92.5%</td>
	<td>91.8%</td>
	<td>93.3%</td>
	</tr>
	<tr>
	<td>Custom Restaurant Commands</td>
	<td>89.7%</td>
	<td>88.5%</td>
	<td>91.0%</td>
	</tr>
	<tr>
	<td>Movie and Music Commands</td>
	<td>90.3%</td>
	<td>89.1%</td>
	<td>91.6%</td>
	</tr>
	</table>
	</div>
	""", unsafe_allow_html=True)

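The Benchmark table in this script reports precision, recall, and F1 side by side. Since F1 is the harmonic mean of precision and recall, the three columns can be cross-checked against each other. The sketch below does that for the table's rows. The function name `f1_score` and the 0.05-point tolerance (to allow for rounding to one decimal place) are illustrative choices, not part of the original script.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (dataset, reported F1, precision, recall), copied from the Benchmark table.
benchmark = [
    ("Snips Dataset", 92.5, 91.8, 93.3),
    ("Custom Restaurant Commands", 89.7, 88.5, 91.0),
    ("Movie and Music Commands", 90.3, 89.1, 91.6),
]

for name, reported_f1, precision, recall in benchmark:
    computed = f1_score(precision, recall)
    # Reported values are rounded to one decimal place, hence the tolerance.
    consistent = abs(computed - reported_f1) <= 0.05
    print(f"{name}: computed F1 = {computed:.1f}, reported = {reported_f1}, consistent = {consistent}")
```

For the values in the table, the computed F1 rounds to the reported figure in every row.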
	# Conclusion
	st.markdown('<div class="sub-title">Conclusion</div>', unsafe_allow_html=True)
	st.markdown("""
	<div class="section">
	<p>The <strong>nerdl_snips_100d</strong> model identifies and classifies entities related to music, restaurants, and movies in general commands. Its F1 scores, between 89.7% and 92.5% on the benchmark datasets listed above, make it a practical choice for applications that need entity extraction from user inputs.</p>
	</div>
	""", unsafe_allow_html=True)


	# References
	st.markdown('<div class="sub-title">References</div>', unsafe_allow_html=True)
	st.markdown("""
	<div class="section">
	<ul>
	<li><a class="link" href="https://sparknlp.org/api/python/reference/autosummary/sparknlp/annotator/ner/ner_dl/index.html" target="_blank" rel="noopener">NerDLModel</a> annotator documentation</li>
	<li>Model Used: <a class="link" href="https://sparknlp.org/2021/02/15/nerdl_snips_100d_en.html" rel="noopener">nerdl_snips_100d_en</a></li>
	<li><a class="link" href="https://nlp.johnsnowlabs.com/recognize_entitie" target="_blank" rel="noopener">Visualization demos for NER in Spark NLP</a></li>
	<li><a class="link" href="https://www.johnsnowlabs.com/named-entity-recognition-ner-with-bert-in-spark-nlp/">Named Entity Recognition (NER) with BERT in Spark NLP</a></li>
	</ul>
	</div>
	""", unsafe_allow_html=True)

	# Community & Support
	st.markdown('<div class="sub-title">Community & Support</div>', unsafe_allow_html=True)
	st.markdown("""
	<div class="section">
	<ul>
	<li><a class="link" href="https://sparknlp.org/" target="_blank">Official Website</a>: Documentation and examples</li>
	<li><a class="link" href="https://join.slack.com/t/spark-nlp/shared_invite/zt-198dipu77-L3UWNe_AJ8xqDk0ivmih5Q" target="_blank">Slack</a>: Live discussion with the community and team</li>
	<li><a class="link" href="https://github.com/JohnSnowLabs/spark-nlp" target="_blank">GitHub</a>: Bug reports, feature requests, and contributions</li>
	<li><a class="link" href="https://medium.com/spark-nlp" target="_blank">Medium</a>: Spark NLP articles</li>
	<li><a class="link" href="https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos" target="_blank">YouTube</a>: Video tutorials</li>
	</ul>
	</div>
	""", unsafe_allow_html=True)