import streamlit as st
# Custom CSS for better styling
st.markdown("""
""", unsafe_allow_html=True)
# Main Title
st.markdown('Detect Actions in General Commands', unsafe_allow_html=True)
# Description
st.markdown("""
Detecting actions in general commands is an NLP task that extracts what a user wants from requests about music, restaurants, and movies. This app uses the nerdl_snips_100d model, which identifies and classifies the entities and actions in a command and returns them in a structured form suitable for automation.
""", unsafe_allow_html=True)
# What is NER
st.markdown('What is Named Entity Recognition (NER)?', unsafe_allow_html=True)
st.markdown("""
Named Entity Recognition (NER) is a process in Natural Language Processing (NLP) that locates and classifies named entities into predefined categories. In this context, NER helps in recognizing entities and actions related to music, restaurants, and movies from user commands, such as identifying a restaurant's name or a movie's title.
""", unsafe_allow_html=True)
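st.markdown("""
To make the idea concrete, here is a minimal sketch of how token-level tags can be merged into entity chunks. It assumes the tagger emits IOB-style labels (a `B-` prefix opens an entity, `I-` continues it, `O` marks tokens outside any entity). The function name `merge_iob`, the example sentence, and its tags are hypothetical and written by hand for illustration; they are not output from the model.

```python
from typing import List, Optional, Tuple


def merge_iob(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    # Merge token-level IOB tags into (chunk_text, entity_label) pairs.
    chunks: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label: Optional[str] = None

    def flush() -> None:
        # Close the chunk in progress, if there is one.
        if current_tokens and current_label is not None:
            chunks.append((" ".join(current_tokens), current_label))

    for token, tag in zip(tokens, tags):
        prefix, _, label = tag.partition("-")
        if prefix == "B" or (prefix == "I" and label != current_label):
            # A new entity starts here (or an I- tag has no matching opener).
            flush()
            current_tokens, current_label = [token], label
        elif prefix == "I":
            # Same entity continues.
            current_tokens.append(token)
        else:
            # "O": the token is outside any entity.
            flush()
            current_tokens, current_label = [], None
    flush()
    return chunks


# Hypothetical, hand-written tags for illustration only
tokens = ["play", "some", "jazz", "by", "miles", "davis"]
tags = ["O", "O", "B-genre", "O", "B-artist", "I-artist"]
chunks = merge_iob(tokens, tags)
print(chunks)
```

In a Spark NLP pipeline this merging step is handled by the NerConverter annotator, so the sketch is only meant to show what the conversion does conceptually.
""", unsafe_allow_html=True)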
# Model Importance and Applications
st.markdown('Model Importance and Applications', unsafe_allow_html=True)
st.markdown("""
The nerdl_snips_100d model is a powerful tool for extracting and classifying entities from user commands. Its application is particularly valuable in several domains:
- Personal Assistants: This model can be used to enhance virtual assistants by accurately understanding and processing user commands related to music, restaurants, and movies. This enables more intuitive interactions and better service recommendations.
- Customer Service: For businesses in the hospitality and entertainment industries, integrating this model into chatbots or customer service platforms allows for more efficient handling of customer inquiries and requests, improving overall user experience.
- Recommendation Systems: By identifying key entities from user inputs, the model can help in generating personalized recommendations for users, whether it’s suggesting a new music track, finding a restaurant, or recommending a movie based on preferences.
- Data Annotation: The model assists in annotating large datasets with labeled entities, which is essential for training other machine learning models or for analyzing trends and patterns in user commands.

Why use the nerdl_snips_100d model?

- High Accuracy: The model reports F1 scores between 89.7% and 92.5% on the datasets listed in the Benchmark section.
- Versatility: It can handle a diverse range of entities and actions, making it suitable for various applications beyond just one domain.
- Ease of Integration: The model integrates smoothly with existing pipelines and can be easily adapted to different use cases.
- Enhanced User Experience: By improving the understanding of user commands, the model enhances interaction quality and satisfaction.
""", unsafe_allow_html=True)
# Predicted Entities
st.markdown('Predicted Entities', unsafe_allow_html=True)
st.markdown("""
- playlist_owner: Person who owns a playlist.
- served_dish: Dish served at a restaurant.
- track: Music track.
- poi: Point of interest.
- cuisine: Type of cuisine.
- spatial_relation: Spatial relationships (e.g., distant, near).
- object_type: Type of object (e.g., book, movie).
- facility: Type of facility.
- album: Music album.
- country: Country name.
- geographic_poi: Geographic point of interest.
- location_name: Name of a location.
- object_part_of_series_type: Part of a series type.
- object_select: Selected object.
- artist: Music artist.
- rating_value: Rating value.
- best_rating: Best rating.
- sort: Sorting preference.
- party_size_description: Description of party size.
- party_size_number: Number of people in a party.
- restaurant_name: Name of the restaurant.
- object_location_type: Type of location for an object.
- playlist: Music playlist.
- service: Type of service.
- city: City name.
- O: Other category.
- genre: Genre of music or movie.
- movie_name: Name of the movie.
- current_location: Current location.
- rating_unit: Unit of rating (e.g., stars).
- restaurant_type: Type of restaurant.
- condition_temperature: Temperature condition.
- condition_description: Description of the condition.
- entity_name: Name of the entity.
- movie_type: Type of movie.
- object_name: Name of the object.
- state: State name.
- year: Year.
- music_item: Music item.
- timeRange: Time range.
""", unsafe_allow_html=True)
# How to Use the Model
st.markdown('How to Use the Model', unsafe_allow_html=True)
st.markdown("""
To use this model, follow these steps in Python:
""", unsafe_allow_html=True)
st.code('''
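import sparknlp
from sparknlp.base import *
from sparknlp.annotator import *
from pyspark.ml import Pipeline
from pyspark.sql.functions import col, expr

# Start a Spark session with Spark NLP loaded (skip if `spark` already exists)
spark = sparknlp.start()
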
# Define the components of the pipeline
document_assembler = DocumentAssembler() \\
.setInputCol("text") \\
.setOutputCol("document")
sentence_detector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "en") \\
.setInputCols(["document"]) \\
.setOutputCol("sentence")
tokenizer = Tokenizer() \\
.setInputCols(["sentence"]) \\
.setOutputCol("token")
embeddings = WordEmbeddingsModel.pretrained("glove_100d", "en") \\
.setInputCols("sentence", "token") \\
.setOutputCol("embeddings")
ner = NerDLModel.pretrained("nerdl_snips_100d") \\
.setInputCols(["sentence", "token", "embeddings"]) \\
.setOutputCol("ner")
ner_converter = NerConverter() \\
.setInputCols(["document", "token", "ner"]) \\
.setOutputCol("ner_chunk")
# Create the pipeline
pipeline = Pipeline(stages=[
    document_assembler,
    sentence_detector,
    tokenizer,
    embeddings,
    ner,
    ner_converter
])

# Create some example data
text = "book a spot for nona gray myrtle and alison at a top-rated brasserie that is distant from wilson av on nov the 4th 2030 that serves ouzeri"
data = spark.createDataFrame([[text]]).toDF("text")
# Apply the pipeline to the data
model = pipeline.fit(data)
result = model.transform(data)

# Show each extracted chunk together with its entity label
result.select(
    expr("explode(ner_chunk) as ner_chunk")
).select(
    col("ner_chunk.result").alias("chunk"),
    col("ner_chunk.metadata.entity").alias("entity")
).show(truncate=False)
''', language='python')
# Results
st.text("""
+---------------------------+----------------------+
|chunk |entity |
+---------------------------+----------------------+
|nona gray myrtle and alison|party_size_description|
|top-rated |sort |
|brasserie |restaurant_type |
|distant |spatial_relation |
|wilson av |poi |
|nov the 4th 2030 |timeRange |
|ouzeri |cuisine |
+---------------------------+----------------------+
""")
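st.markdown("""
For automation, the chunk and entity pairs shown above usually need to be grouped by entity label. The sketch below does this in plain Python, with the rows copied by hand from the results table. The helper name `to_slots` is hypothetical, and in a real application the pairs would be collected from the `result` DataFrame rather than typed in.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Rows copied from the results table above
rows = [
    ("nona gray myrtle and alison", "party_size_description"),
    ("top-rated", "sort"),
    ("brasserie", "restaurant_type"),
    ("distant", "spatial_relation"),
    ("wilson av", "poi"),
    ("nov the 4th 2030", "timeRange"),
    ("ouzeri", "cuisine"),
]


def to_slots(pairs: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    # Group chunk texts by entity label, keeping their original order.
    slots = defaultdict(list)
    for chunk, entity in pairs:
        slots[entity].append(chunk)
    return dict(slots)


slots = to_slots(rows)
print(slots["restaurant_type"])
print(slots["timeRange"])
```

Each label maps to a list because a single command can mention the same entity type more than once.
""", unsafe_allow_html=True)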
# Model Information
st.markdown('Model Information', unsafe_allow_html=True)
st.markdown("""
| Attribute | Value |
|---|---|
| Model Name | nerdl_snips_100d |
| Type | NER |
| Compatibility | Spark NLP 2.7.3+ |
| License | Apache 2.0 |
| Source | NLP John Snow Labs |
| Description | Pre-trained NER model for identifying and classifying named entities in text. |
""", unsafe_allow_html=True)
# Data Source
st.markdown('Data Source', unsafe_allow_html=True)
st.markdown("""
""", unsafe_allow_html=True)
# Benchmark
st.markdown('Benchmark', unsafe_allow_html=True)
st.markdown("""
The nerdl_snips_100d model was evaluated on several datasets to measure how well it extracts entities from general commands. The table below summarizes the results:

| Dataset | F1 Score | Precision | Recall |
|---|---|---|---|
| Snips Dataset | 92.5% | 91.8% | 93.3% |
| Custom Restaurant Commands | 89.7% | 88.5% | 91.0% |
| Movie and Music Commands | 90.3% | 89.1% | 91.6% |
""", unsafe_allow_html=True)
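st.markdown("""
The F1 score is the harmonic mean of precision and recall, so the reported figures can be cross-checked. The sketch below recomputes F1 from the precision and recall values in the benchmark table; the function name `f1_score` and the dictionary layout are illustrative choices, not part of any library used in this app.

```python
def f1_score(precision: float, recall: float) -> float:
    # Harmonic mean of precision and recall (same units in, same units out).
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (reported F1, precision, recall) in percent, copied from the benchmark table
reported = {
    "Snips Dataset": (92.5, 91.8, 93.3),
    "Custom Restaurant Commands": (89.7, 88.5, 91.0),
    "Movie and Music Commands": (90.3, 89.1, 91.6),
}

for name, (f1, precision, recall) in reported.items():
    computed = f1_score(precision, recall)
    print(name, "reported:", f1, "computed:", round(computed, 1))
```

For all three rows the recomputed value rounds to the reported F1 score.
""", unsafe_allow_html=True)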
# Conclusion
st.markdown('Conclusion', unsafe_allow_html=True)
st.markdown("""
The nerdl_snips_100d model demonstrates strong performance in identifying and classifying entities related to music, restaurants, and movies from general commands. Its high F1 score across various datasets indicates reliable performance, making it a valuable tool for applications requiring entity extraction from user inputs.
""", unsafe_allow_html=True)
# Community & Support
st.markdown('Community & Support', unsafe_allow_html=True)
st.markdown("""
- Official Website: Documentation and examples
- Slack: Live discussion with the community and team
- GitHub: Bug reports, feature requests, and contributions
- Medium: Spark NLP articles
- YouTube: Video tutorials
""", unsafe_allow_html=True)