import streamlit as st

st.set_page_config(
    layout="wide",
    initial_sidebar_state="auto"
)

st.markdown("""
<style>
.main-title {
    font-size: 36px;
    color: #4A90E2;
    font-weight: bold;
    text-align: center;
}
.sub-title {
    font-size: 24px;
    color: #4A90E2;
    margin-top: 20px;
}
.section {
    background-color: #f9f9f9;
    padding: 15px;
    border-radius: 10px;
    margin-top: 20px;
}
.section h2 {
    font-size: 22px;
    color: #4A90E2;
}
.section p, .section ul {
    color: #666666;
}
.link {
    color: #4A90E2;
    text-decoration: none;
}
.benchmark-table {
    width: 100%;
    border-collapse: collapse;
    margin-top: 20px;
}
.benchmark-table th, .benchmark-table td {
    border: 1px solid #ddd;
    padding: 8px;
    text-align: left;
}
.benchmark-table th {
    background-color: #4A90E2;
    color: white;
}
.benchmark-table td {
    background-color: #f2f2f2;
}
</style>
""", unsafe_allow_html=True)

st.markdown('<div class="main-title">Introduction to XLM-RoBERTa Annotators in Spark NLP</div>', unsafe_allow_html=True)

st.markdown("""
<div class="section">
<p>XLM-RoBERTa (Cross-lingual Robustly Optimized BERT Approach) is a multilingual transformer model that extends RoBERTa to 100 languages. Pre-trained on 2.5 TB of filtered CommonCrawl data, it is designed to handle a wide range of NLP tasks in a multilingual context, making it well suited to applications that require cross-lingual understanding. Below, we provide an overview of the XLM-RoBERTa annotators Spark NLP offers for these tasks:</p>
</div>
""", unsafe_allow_html=True)

st.markdown("""<div class="sub-title">Question Answering with XLM-RoBERTa</div>""", unsafe_allow_html=True)

st.markdown("""
<div class="section">
<p>Question answering (QA) is a crucial task in Natural Language Processing (NLP) where the goal is to extract an answer from a given context in response to a specific question.</p>
<p><strong>XLM-RoBERTa</strong> excels in question answering tasks across multiple languages, making it an invaluable tool for global applications. Below is an example of how to implement question answering using XLM-RoBERTa in Spark NLP.</p>
<p>Using XLM-RoBERTa for Question Answering enables:</p>
<ul>
<li><strong>Multilingual QA:</strong> Extract answers from text in various languages with a single model.</li>
<li><strong>Accurate Contextual Understanding:</strong> Leverage XLM-RoBERTa's deep understanding of context to provide precise answers.</li>
<li><strong>Cross-Domain Flexibility:</strong> Apply to different domains, from customer support to education, across languages.</li>
</ul>
<p>Advantages of using XLM-RoBERTa for Question Answering in Spark NLP include:</p>
<ul>
<li><strong>Scalability:</strong> Spark NLP is built on Apache Spark, ensuring efficient scaling for large datasets.</li>
<li><strong>Pretrained Excellence:</strong> Utilize state-of-the-art pretrained models to achieve high accuracy in question answering tasks.</li>
<li><strong>Multilingual Flexibility:</strong> XLM-RoBERTa's multilingual capabilities make it suitable for global applications, reducing the need for language-specific models.</li>
<li><strong>Seamless Integration:</strong> Easily incorporate XLM-RoBERTa into your existing Spark pipelines for streamlined NLP workflows.</li>
</ul>
</div>
""", unsafe_allow_html=True)

st.markdown("""<div class="sub-title">How to Use XLM-RoBERTa for Question Answering in Spark NLP</div>""", unsafe_allow_html=True)

st.markdown("""
<div class="section">
<p>Spark NLP provides a straightforward pipeline setup for question answering with XLM-RoBERTa. The following example shows how to extract an answer from a given context based on a specific question. Because of its multilingual pre-training, the same pipeline can answer questions across many languages.</p>
</div>
""", unsafe_allow_html=True)

st.code('''
import sparknlp
from sparknlp.base import *
from sparknlp.annotator import *
from pyspark.ml import Pipeline

# Start a Spark session with Spark NLP
spark = sparknlp.start()

# Assemble the question and context columns into annotation documents
document_assembler = MultiDocumentAssembler() \\
    .setInputCols(["question", "context"]) \\
    .setOutputCols(["document_question", "document_context"])

# Load a pretrained XLM-RoBERTa span classifier for question answering
spanClassifier = XlmRoBertaForQuestionAnswering.pretrained("xlm_roberta_qa_Part_1_XLM_Model_E1", "en") \\
    .setInputCols(["document_question", "document_context"]) \\
    .setOutputCol("answer") \\
    .setCaseSensitive(True)

pipeline = Pipeline().setStages([document_assembler, spanClassifier])

# Example question/context pair
example = spark.createDataFrame(
    [["What's my name?", "My name is Clara and I live in Berkeley."]]
).toDF("question", "context")

result = pipeline.fit(example).transform(example)
result.select("answer.result").show(truncate=False)
''', language='python')

st.text("""
+-------+
|result |
+-------+
|[Clara]|
+-------+
""")

st.markdown('<div class="sub-title">Choosing the Right Model</div>', unsafe_allow_html=True)

st.markdown("""
<div class="section">
<p>The model used here is an XLM-RoBERTa variant fine-tuned for extractive question answering. Other XLM-RoBERTa question answering models, covering different languages and domains, are available on the <a class="link" href="https://sparknlp.org/models" target="_blank">Spark NLP Models Hub</a>.</p>
<p>For background on the base model, see the <a class="link" href="https://huggingface.co/xlm-roberta-base" target="_blank">xlm-roberta-base model card</a> on Hugging Face.</p>
</div>
""", unsafe_allow_html=True)

st.markdown('<div class="sub-title">References</div>', unsafe_allow_html=True)

st.markdown("""
<div class="section">
<ul>
<li><a class="link" href="https://arxiv.org/abs/1911.02116" target="_blank">Conneau et al., 2020: Unsupervised Cross-lingual Representation Learning at Scale (XLM-R)</a></li>
<li><a class="link" href="https://huggingface.co/xlm-roberta-base" target="_blank">xlm-roberta-base model overview on Hugging Face</a></li>
</ul>
</div>
""", unsafe_allow_html=True)

st.markdown('<div class="sub-title">Community & Support</div>', unsafe_allow_html=True)

st.markdown("""
<div class="section">
<ul>
<li><a class="link" href="https://sparknlp.org/" target="_blank">Official Website</a>: Documentation and examples</li>
<li><a class="link" href="https://join.slack.com/t/spark-nlp/shared_invite/zt-198dipu77-L3UWNe_AJ8xqDk0ivmih5Q" target="_blank">Slack</a>: Live discussion with the community and team</li>
<li><a class="link" href="https://github.com/JohnSnowLabs/spark-nlp" target="_blank">GitHub</a>: Bug reports, feature requests, and contributions</li>
<li><a class="link" href="https://medium.com/spark-nlp" target="_blank">Medium</a>: Spark NLP articles</li>
<li><a class="link" href="https://www.youtube.com/channel/UCmFOjlpYEhxf_wJUDuz6xxQ/videos" target="_blank">YouTube</a>: Video tutorials</li>
</ul>
</div>
""", unsafe_allow_html=True)

st.markdown('<div class="sub-title">Quick Links</div>', unsafe_allow_html=True)

st.markdown("""
<div class="section">
<ul>
<li><a class="link" href="https://sparknlp.org/docs/en/quickstart" target="_blank">Getting Started</a></li>
<li><a class="link" href="https://nlp.johnsnowlabs.com/models" target="_blank">Pretrained Models</a></li>
<li><a class="link" href="https://github.com/JohnSnowLabs/spark-nlp/tree/master/examples/python/annotation/text/english" target="_blank">Example Notebooks</a></li>
<li><a class="link" href="https://sparknlp.org/docs/en/install" target="_blank">Installation Guide</a></li>
</ul>
</div>
""", unsafe_allow_html=True)