import streamlit as st
# Custom CSS for better styling
st.markdown("""
<style>
.main-title {
font-size: 36px;
color: #4A90E2;
font-weight: bold;
text-align: center;
}
.sub-title {
font-size: 24px;
color: #4A90E2;
margin-top: 20px;
}
.section {
background-color: #f9f9f9;
padding: 15px;
border-radius: 10px;
margin-top: 20px;
}
.section h2 {
font-size: 22px;
color: #4A90E2;
}
.section p, .section ul {
color: #666666;
}
.link {
color: #4A90E2;
text-decoration: none;
}
.benchmark-table {
width: 100%;
border-collapse: collapse;
margin-top: 20px;
}
.benchmark-table th, .benchmark-table td {
border: 1px solid #ddd;
padding: 8px;
text-align: left;
}
.benchmark-table th {
background-color: #4A90E2;
color: white;
}
.benchmark-table td {
background-color: #f2f2f2;
}
</style>
""", unsafe_allow_html=True)
# Main Title
st.markdown('<div class="main-title">Detect Actions in General Commands</div>', unsafe_allow_html=True)
# Description
st.markdown("""
<div class="section">
<p><strong>Detect Actions in General Commands</strong> is a key NLP task for understanding user commands related to music, restaurants, and movies. This app uses the <strong>nerdl_snips_100d</strong> model, which identifies and classifies entities and actions in user commands and returns a structured representation suitable for automation.</p>
</div>
""", unsafe_allow_html=True)
# What is NER
st.markdown('<div class="sub-title">What is Named Entity Recognition (NER)?</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<p><strong>Named Entity Recognition (NER)</strong> is a process in Natural Language Processing (NLP) that locates and classifies named entities into predefined categories. In this context, NER helps in recognizing entities and actions related to music, restaurants, and movies from user commands, such as identifying a restaurant's name or a movie's title.</p>
</div>
""", unsafe_allow_html=True)
# Model Importance and Applications
st.markdown('<div class="sub-title">Model Importance and Applications</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<p>The <strong>nerdl_snips_100d</strong> model is a powerful tool for extracting and classifying entities from user commands. Its application is particularly valuable in several domains:</p>
<ul>
<li><strong>Personal Assistants:</strong> This model can be used to enhance virtual assistants by accurately understanding and processing user commands related to music, restaurants, and movies. This enables more intuitive interactions and better service recommendations.</li>
<li><strong>Customer Service:</strong> For businesses in the hospitality and entertainment industries, integrating this model into chatbots or customer service platforms allows for more efficient handling of customer inquiries and requests, improving overall user experience.</li>
<li><strong>Recommendation Systems:</strong> By identifying key entities from user inputs, the model can help in generating personalized recommendations for users, whether it’s suggesting a new music track, finding a restaurant, or recommending a movie based on preferences.</li>
<li><strong>Data Annotation:</strong> The model assists in annotating large datasets with labeled entities, which is essential for training other machine learning models or for analyzing trends and patterns in user commands.</li>
</ul>
<p>Why use the <strong>nerdl_snips_100d</strong> model?</p>
<ul>
<li><strong>High Accuracy:</strong> The model reports an F1 score of 92.5% on the Snips dataset (see the Benchmark section), which supports reliable entity recognition for commands of that kind.</li>
<li><strong>Versatility:</strong> It can handle a diverse range of entities and actions, making it suitable for various applications beyond just one domain.</li>
<li><strong>Ease of Integration:</strong> The model integrates smoothly with existing pipelines and can be easily adapted to different use cases.</li>
<li><strong>Enhanced User Experience:</strong> By improving the understanding of user commands, the model enhances interaction quality and satisfaction.</li>
</ul>
</div>
""", unsafe_allow_html=True)
# Predicted Entities
st.markdown('<div class="sub-title">Predicted Entities</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<ul>
<li><strong>playlist_owner:</strong> Person who owns a playlist.</li>
<li><strong>served_dish:</strong> Dish served at a restaurant.</li>
<li><strong>track:</strong> Music track.</li>
<li><strong>poi:</strong> Point of interest.</li>
<li><strong>cuisine:</strong> Type of cuisine.</li>
<li><strong>spatial_relation:</strong> Spatial relationships (e.g., distant, near).</li>
<li><strong>object_type:</strong> Type of object (e.g., book, movie).</li>
<li><strong>facility:</strong> Type of facility.</li>
<li><strong>album:</strong> Music album.</li>
<li><strong>country:</strong> Country name.</li>
<li><strong>geographic_poi:</strong> Geographic point of interest.</li>
<li><strong>location_name:</strong> Name of a location.</li>
<li><strong>object_part_of_series_type:</strong> Part of a series type.</li>
<li><strong>object_select:</strong> Selected object.</li>
<li><strong>artist:</strong> Music artist.</li>
<li><strong>rating_value:</strong> Rating value.</li>
<li><strong>best_rating:</strong> Best rating.</li>
<li><strong>sort:</strong> Sorting preference.</li>
<li><strong>party_size_description:</strong> Description of party size.</li>
<li><strong>party_size_number:</strong> Number of people in a party.</li>
<li><strong>restaurant_name:</strong> Name of the restaurant.</li>
<li><strong>object_location_type:</strong> Type of location for an object.</li>
<li><strong>playlist:</strong> Music playlist.</li>
<li><strong>service:</strong> Type of service.</li>
<li><strong>city:</strong> City name.</li>
<li><strong>O:</strong> Outside tag, assigned to tokens that are not part of any entity.</li>
<li><strong>genre:</strong> Genre of music or movie.</li>
<li><strong>movie_name:</strong> Name of the movie.</li>
<li><strong>current_location:</strong> Current location.</li>
<li><strong>rating_unit:</strong> Unit of rating (e.g., stars).</li>
<li><strong>restaurant_type:</strong> Type of restaurant.</li>
<li><strong>condition_temperature:</strong> Temperature condition.</li>
<li><strong>condition_description:</strong> Description of the condition.</li>
<li><strong>entity_name:</strong> Name of the entity.</li>
<li><strong>movie_type:</strong> Type of movie.</li>
<li><strong>object_name:</strong> Name of the object.</li>
<li><strong>state:</strong> State name.</li>
<li><strong>year:</strong> Year.</li>
<li><strong>music_item:</strong> Music item.</li>
<li><strong>timeRange:</strong> Time range.</li>
</ul>
</div>
""", unsafe_allow_html=True)
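# Worked example: from token tags to labeled chunks
st.markdown(r"""
<div class="section">
<p>The labels above attach to whole chunks, while NER models of this kind typically tag individual tokens. The sketch below assumes IOB-style tags (<code>B-</code> opens an entity, <code>I-</code> continues it, <code>O</code> marks a token outside any entity) and shows in plain Python how such tags can be merged into chunks. The helper <code>merge_iob</code> and the sample command are hypothetical illustrations, not part of Spark NLP.</p>
</div>

```python
def merge_iob(tokens, tags):
    # Group IOB-tagged tokens into (chunk_text, entity) pairs.
    chunks = []
    current_tokens, current_entity = [], None

    def flush():
        # Close the chunk that is currently open, if any.
        if current_tokens:
            chunks.append((" ".join(current_tokens), current_entity))

    for token, tag in zip(tokens, tags):
        prefix, _, entity = tag.partition("-")
        if prefix == "I" and entity == current_entity:
            # Continuation of the open chunk.
            current_tokens.append(token)
        elif prefix in ("B", "I"):
            # A new entity starts (an orphan I- tag is treated like B-).
            flush()
            current_tokens, current_entity = [token], entity
        else:
            # "O": the token is outside any entity.
            flush()
            current_tokens, current_entity = [], None
    flush()
    return chunks


# Hypothetical command and tags, chosen only to illustrate the merge.
tokens = ["play", "thriller", "by", "michael", "jackson"]
tags = ["O", "B-track", "O", "B-artist", "I-artist"]
print(merge_iob(tokens, tags))
# [('thriller', 'track'), ('michael jackson', 'artist')]
```
""", unsafe_allow_html=True)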
# How to Use the Model
st.markdown('<div class="sub-title">How to Use the Model</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<p>To use this model, follow these steps in Python:</p>
</div>
""", unsafe_allow_html=True)
st.code('''
import sparknlp
from sparknlp.base import *
from sparknlp.annotator import *
from pyspark.ml import Pipeline
from pyspark.sql.functions import col, expr
# Start a Spark session with Spark NLP available
spark = sparknlp.start()
# Define the components of the pipeline
document_assembler = DocumentAssembler() \\
.setInputCol("text") \\
.setOutputCol("document")
sentence_detector = SentenceDetectorDLModel.pretrained("sentence_detector_dl", "en") \\
.setInputCols(["document"]) \\
.setOutputCol("sentence")
tokenizer = Tokenizer() \\
.setInputCols(["sentence"]) \\
.setOutputCol("token")
embeddings = WordEmbeddingsModel.pretrained("glove_100d", "en") \\
.setInputCols(["sentence", "token"]) \\
.setOutputCol("embeddings")
ner = NerDLModel.pretrained("nerdl_snips_100d") \\
.setInputCols(["sentence", "token", "embeddings"]) \\
.setOutputCol("ner")
ner_converter = NerConverter() \\
.setInputCols(["document", "token", "ner"]) \\
.setOutputCol("ner_chunk")
# Create the pipeline
pipeline = Pipeline(stages=[
document_assembler,
sentence_detector,
tokenizer,
embeddings,
ner,
ner_converter
])
# Create some example data
text = "book a spot for nona gray myrtle and alison at a top-rated brasserie that is distant from wilson av on nov the 4th 2030 that serves ouzeri"
data = spark.createDataFrame([[text]]).toDF("text")
# Apply the pipeline to the data
model = pipeline.fit(data)
result = model.transform(data)
# Show each extracted chunk alongside its entity label
result.select(
expr("explode(ner_chunk) as ner_chunk")
).select(
col("ner_chunk.result").alias("chunk"),
col("ner_chunk.metadata.entity").alias("entity")
).show(truncate=False)
''', language='python')
# Results
st.text("""
+---------------------------+----------------------+
|chunk |entity |
+---------------------------+----------------------+
|nona gray myrtle and alison|party_size_description|
|top-rated |sort |
|brasserie |restaurant_type |
|distant |spatial_relation |
|wilson av |poi |
|nov the 4th 2030 |timeRange |
|ouzeri |cuisine |
+---------------------------+----------------------+
""")
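# Worked example: turning the result rows into a slot dictionary
st.markdown(r"""
<div class="section">
<p>One way to turn the rows above into a structured representation for automation is to group the chunks by entity label. The sketch below is a minimal illustration in plain Python: the rows are copied by hand from the table above, and the helper <code>to_slots</code> is a hypothetical name, not a Spark NLP function.</p>
</div>

```python
from collections import defaultdict

# (chunk, entity) pairs copied from the result table above.
rows = [
    ("nona gray myrtle and alison", "party_size_description"),
    ("top-rated", "sort"),
    ("brasserie", "restaurant_type"),
    ("distant", "spatial_relation"),
    ("wilson av", "poi"),
    ("nov the 4th 2030", "timeRange"),
    ("ouzeri", "cuisine"),
]


def to_slots(pairs):
    # Group chunk texts by entity label; a label may occur more than once.
    slots = defaultdict(list)
    for chunk, entity in pairs:
        slots[entity].append(chunk)
    return dict(slots)


slots = to_slots(rows)
print(slots["restaurant_type"])  # ['brasserie']
print(slots["timeRange"])        # ['nov the 4th 2030']
```

<div class="section">
<p>Keeping a list per label, rather than a single value, avoids losing information when a command mentions the same kind of entity twice.</p>
</div>
""", unsafe_allow_html=True)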
# Model Information
st.markdown('<div class="sub-title">Model Information</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<table class="benchmark-table">
<tr>
<th>Model Name</th>
<td>nerdl_snips_100d</td>
</tr>
<tr>
<th>Type</th>
<td>NER</td>
</tr>
<tr>
<th>Compatibility</th>
<td>Spark NLP 2.7.3+</td>
</tr>
<tr>
<th>License</th>
<td>Apache 2.0</td>
</tr>
<tr>
<th>Source</th>
<td><a href="https://nlp.johnsnowlabs.com/models" class="link">NLP John Snow Labs</a></td>
</tr>
<tr>
<th>Description</th>
<td>Pre-trained NER model for identifying and classifying named entities in text.</td>
</tr>
</table>
</div>
""", unsafe_allow_html=True)
# Data Source
st.markdown('<div class="sub-title">Data Source</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<p>For more information about the dataset used to train this model, visit the <a class="link" href="https://github.com/MiuLab/SlotGated-SLU" target="_blank">NLU Benchmark SNIPS dataset</a>.</p>
</div>
""", unsafe_allow_html=True)
# Benchmark
st.markdown('<div class="sub-title">Benchmark</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<p>The performance of the <strong>nerdl_snips_100d</strong> model was evaluated on various benchmarks to ensure its effectiveness in extracting relevant entities from general commands. The following table summarizes the model's performance on different datasets:</p>
<table class="benchmark-table">
<tr>
<th>Dataset</th>
<th>F1 Score</th>
<th>Precision</th>
<th>Recall</th>
</tr>
<tr>
<td>Snips Dataset</td>
<td>92.5%</td>
<td>91.8%</td>
<td>93.3%</td>
</tr>
<tr>
<td>Custom Restaurant Commands</td>
<td>89.7%</td>
<td>88.5%</td>
<td>91.0%</td>
</tr>
<tr>
<td>Movie and Music Commands</td>
<td>90.3%</td>
<td>89.1%</td>
<td>91.6%</td>
</tr>
</table>
</div>
""", unsafe_allow_html=True)
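# Worked example: relating F1 to precision and recall
st.markdown(r"""
<div class="section">
<p>To read the table, it helps to recall that the F1 score is conventionally the harmonic mean of precision and recall. The sketch below assumes the table follows that convention and recomputes F1 from the precision and recall columns; the function name <code>f1_score</code> is our own.</p>
</div>

```python
def f1_score(precision, recall):
    # Harmonic mean of precision and recall (both in percent here).
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "Snips Dataset": (91.8, 93.3, 92.5),
    "Custom Restaurant Commands": (88.5, 91.0, 89.7),
    "Movie and Music Commands": (89.1, 91.6, 90.3),
}

for name, (precision, recall, f1) in reported.items():
    computed = f1_score(precision, recall)
    print(f"{name}: computed {computed:.1f}%, reported {f1}%")
```
""", unsafe_allow_html=True)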
# Conclusion
st.markdown('<div class="sub-title">Conclusion</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<p>The <strong>nerdl_snips_100d</strong> model demonstrates strong performance in identifying and classifying entities related to music, restaurants, and movies from general commands. Its high F1 score across various datasets indicates reliable performance, making it a valuable tool for applications requiring entity extraction from user inputs.</p>
</div>
""", unsafe_allow_html=True)
# References
st.markdown('<div class="sub-title">References</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<ul>
<li><a class="link" href="https://sparknlp.org/api/python/reference/autosummary/sparknlp/annotator/ner/ner_dl/index.html" target="_blank" rel="noopener">NerDLModel</a> annotator documentation</li>
<li>Model Used: <a class="link" href="https://sparknlp.org/2021/02/15/nerdl_snips_100d_en.html" rel="noopener">nerdl_snips_100d_en</a></li>
<li><a class="link" href="https://nlp.johnsnowlabs.com/recognize_entitie" target="_blank" rel="noopener">Visualization demos for NER in Spark NLP</a></li>
<li><a class="link" href="https://www.johnsnowlabs.com/named-entity-recognition-ner-with-bert-in-spark-nlp/">Named Entity Recognition (NER) with BERT in Spark NLP</a></li>
</ul>
</div>
""", unsafe_allow_html=True)
# Community & Support
st.markdown('<div class="sub-title">Community & Support</div>', unsafe_allow_html=True)
st.markdown("""
<div class="section">
<ul>
<li><a class="link" href="https://sparknlp.org/" target="_blank">Official Website</a>: Documentation and examples</li>
<li><a class="link" href="https://join.slack.com/t/spark-nlp/shared_invite/zt-198dipu77-L3UWNe_AJ8xqDk0ivmih5Q" target="_blank">Slack</a>: Live discussion with the community and team</li>
<li><a class="link" href="https://github.com/JohnSnowLabs/spark-nlp" target="_blank">GitHub</a>: Bug reports, feature requests, and contributions</li>
<li><a class="link" href="https://medium.com/spark-nlp" target="_blank">Medium</a>: Spark NLP articles</li>
<li><a class="link" href="https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos" target="_blank">YouTube</a>: Video tutorials</li>
</ul>
</div>
""", unsafe_allow_html=True)