---
language: en
tags:
- chatbot
- natural language processing
license: apache-2.0
datasets:
- Custom Dataset (Dronealexa)
---

Model Card: NLP-Based Chatbot

Overview

The NLP-Based Chatbot answers questions on Science & Technology topics. It combines semantic search with summarization to return relevant, concise responses to user queries.

Model Details

- **Model Name:** NLP-Based Chatbot
- **Model Type:** Natural Language Processing (NLP) Chatbot
- **Frameworks:** Gradio (Blocks interface), spaCy, Hugging Face Transformers

Components

1. Semantic Search

The chatbot retrieves relevant entries from a preprocessed dataset (Dronealexa.csv). The query and the dataset entries are vectorized with TF-IDF and ranked by cosine similarity.

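
The retrieval step can be sketched from scratch with the standard library. This is a minimal illustration of TF-IDF weighting and cosine-similarity ranking, not the chatbot's actual code: the tokenizer, the smoothed IDF formula, the function names, and the three-sentence corpus standing in for Dronealexa.csv are all assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer."""
    return text.lower().split()


def to_vector(tokens, idf):
    """Map tokens to a sparse {term: tf * idf} dict; unknown terms are dropped."""
    counts = Counter(t for t in tokens if t in idf)
    return {term: count * idf[term] for term, count in counts.items()}


def build_index(documents):
    """Compute smoothed IDF weights and one TF-IDF vector per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothing keeps weights positive even for terms found in every document.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1 for t, df in doc_freq.items()}
    vectors = [to_vector(tokens, idf) for tokens in tokenized]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, documents, idf, vectors, top_k=1):
    """Return the top_k (score, document) pairs, best match first."""
    query_vec = to_vector(tokenize(query), idf)
    scored = [(cosine(query_vec, vec), doc) for vec, doc in zip(vectors, documents)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Placeholder corpus standing in for rows of Dronealexa.csv (hypothetical content).
documents = [
    "Drones use propellers and onboard sensors to fly",
    "Voice assistants convert speech to text before answering",
    "Solar panels convert sunlight into electricity",
]
idf, vectors = build_index(documents)
results = search("how do drones fly", documents, idf, vectors, top_k=1)
```

Here `results` holds the best-scoring `(score, document)` pair for the query. In practice a library implementation such as scikit-learn's `TfidfVectorizer` with `cosine_similarity` would likely replace the hand-written functions.
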
2. Summarization

A summarization pipeline from the Hugging Face Transformers library generates concise summaries of the retrieved information.

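
A summarization call might look like the sketch below. The card does not name a checkpoint or length limits, so the default `pipeline("summarization")` model and the `max_length` / `min_length` values are assumptions, and the `summarize` wrapper is a hypothetical helper. The import is deferred so the function can also accept an already loaded pipeline.

```python
def summarize(text, summarizer=None, max_length=60, min_length=15):
    """Return a short summary of text, or an empty string for empty input."""
    if not text or not text.strip():
        return ""
    if summarizer is None:
        # Deferred import: loading the default checkpoint is slow and needs a download.
        from transformers import pipeline
        summarizer = pipeline("summarization")
    result = summarizer(
        text,
        max_length=max_length,
        min_length=min_length,
        do_sample=False,  # deterministic decoding
    )
    # The pipeline returns a list with one dict per input text.
    return result[0]["summary_text"].strip()
```

Passing a pipeline created once at startup as `summarizer` avoids reloading the model on every query.
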
3. Custom Embeddings

The chatbot builds custom text embeddings with spaCy and pre-trained word embeddings. These embeddings help represent user queries and contribute to the semantic search.

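
As a rough sketch of this component: for spaCy models that ship word vectors, `Doc.vector` is by default the average of the token vectors, which gives one embedding per text. The model name `en_core_web_md` and the helper names below are assumptions, since the card does not say which vectors are used.

```python
import math


def load_nlp(model_name="en_core_web_md"):
    """Load a spaCy model with word vectors (the model name is an assumption)."""
    import spacy  # deferred so the helpers below work without spaCy installed
    return spacy.load(model_name)


def embed(text, nlp):
    """Embed text as a list of floats taken from the document vector."""
    return [float(x) for x in nlp(text).vector]


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)
```

With a model loaded through `load_nlp()`, `cosine(embed(query, nlp), embed(entry, nlp))` scores how close a query is to a dataset entry.
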
4. Gradio Blocks Interface

The chatbot's frontend is built with the Gradio Blocks interface, which gives users an interactive page for entering queries and reading responses.

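
A layout of this kind could be wired up roughly as follows. The component labels, the handler names `respond` and `record_feedback`, and the in-memory `feedback_log` are hypothetical; only the "Send" and "Submit Feedback" button captions come from this card. The handler body is a placeholder for the search and summarization steps.

```python
feedback_log = []  # hypothetical in-memory store; a real app would persist feedback


def respond(query):
    """Answer a query; the body is a placeholder for search plus summarization."""
    query = (query or "").strip()
    if not query:
        return "Please enter a query."
    return f"(placeholder response for: {query})"


def record_feedback(feedback):
    """Store non-empty feedback and return a status message."""
    feedback = (feedback or "").strip()
    if not feedback:
        return "No feedback entered."
    feedback_log.append(feedback)
    return "Thank you for your feedback."


def build_demo():
    """Assemble the Blocks layout and connect each button to its handler."""
    import gradio as gr  # deferred so the handlers can be used without Gradio

    with gr.Blocks() as demo:
        query_box = gr.Textbox(label="Query")
        send_button = gr.Button("Send")
        response_box = gr.Textbox(label="Response")
        feedback_box = gr.Textbox(label="Feedback")
        feedback_button = gr.Button("Submit Feedback")
        status_box = gr.Textbox(label="Status")

        send_button.click(fn=respond, inputs=query_box, outputs=response_box)
        feedback_button.click(fn=record_feedback, inputs=feedback_box, outputs=status_box)
    return demo
```

Calling `build_demo().launch()` starts the local web server.
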
5. Model Card Generation

Model card content is generated by constructing prompts from the search results and passing them to the summarization pipeline.

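
The card does not show the prompt format, so the following is only one plausible way to turn search results into a prompt. The `build_prompt` helper, its layout, and the `max_chars` cutoff are assumptions.

```python
def build_prompt(query, passages, max_chars=2000):
    """Combine a query and retrieved passages into a single prompt string."""
    cleaned = [p.strip() for p in passages if p and p.strip()]
    if not cleaned:
        return ""
    bullet_list = "\n".join(f"- {p}" for p in cleaned)
    prompt = f"Question: {query.strip()}\nRelevant passages:\n{bullet_list}"
    # Truncate so the prompt stays within a rough input budget for the summarizer.
    return prompt[:max_chars]


prompt = build_prompt(
    "What is a drone?",
    ["An unmanned aircraft.", "  ", "Often remotely piloted."],
)
```

Blank passages are dropped, and the resulting string is what would be handed to the summarization step.
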
Intended Use

The NLP-Based Chatbot is intended for users who want to explore Science & Technology topics. It answers from the provided dataset, and users are encouraged to submit feedback to support continued improvement.

Training Data

The model is trained on a custom dataset (Dronealexa.csv) of Science & Technology information. The dataset is preprocessed to handle missing values and to support efficient semantic search.

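
One way to handle missing values before indexing is sketched below with the standard `csv` module. The schema of Dronealexa.csv is not given in this card, so the `question` and `answer` column names and the inline sample are hypothetical.

```python
import csv
import io


def load_rows(handle, text_columns):
    """Read a CSV and return one cleaned text string per usable row."""
    rows = []
    for row in csv.DictReader(handle):
        # Missing cells arrive as "" or None; normalize both to an empty string.
        parts = [(row.get(col) or "").strip() for col in text_columns]
        text = " ".join(part for part in parts if part)
        if text:  # drop rows with no text at all
            rows.append(text)
    return rows


# Inline sample with a missing answer and a fully empty row (hypothetical data).
sample = io.StringIO(
    "question,answer\n"
    "What is a drone?,An unmanned aircraft\n"
    "What is AI?,\n"
    ",\n"
)
rows = load_rows(sample, ["question", "answer"])
```

For the real file, the same function could be called on `open("Dronealexa.csv", newline="", encoding="utf-8")` with the actual column names.
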
Evaluation Metrics

- Semantic Search: TF-IDF Vectorizer, Cosine Similarity
- Summarization: Hugging Face Transformers Pipeline

Ethical Considerations

The chatbot aims to provide accurate and relevant information. However, users should critically evaluate its responses, because the model's knowledge is limited to its training data.

Usage Instructions

1. Input your query in the provided textbox.
2. Click the "Send" button to receive a response.
3. Optionally, submit feedback using the "Submit Feedback" button.

License

This model is released under the Apache 2.0 License.

Contact Information

For inquiries or issues, please contact [email protected].