Asaad Almutareb committed · Commit ea36e00 · 1 Parent(s): 76d4a7e

changed proj name
README.md
CHANGED
@@ -1,5 +1,5 @@
 ---
-title:
+title: LC Gradio DocsAI
 emoji: π
 colorFrom: gray
 colorTo: gray
@@ -9,13 +9,13 @@ app_file: app.py
 pinned: false
 ---

-#
+# LC Gradio DocsAI π

 ## Overview
-
+LC-Gradio-DocsAI is a demo project showcasing a privately hosted, advanced documentation AI helper, demonstrating a fine-tuned 7B model's capabilities in aiding users with software documentation. The application integrates Retrieval-Augmented Generation (RAG) via LangChain, a vector store backed by Chroma DB or FAISS, and Gradio for the model UI to offer insightful documentation assistance. It is designed to help users navigate and utilize software tools efficiently by retrieving relevant documentation pages and maintaining conversational flow.

 ## Key Features
-- **AI-Powered Documentation Retrieval:** Utilizes various fine-tuned
+- **AI-Powered Documentation Retrieval:** Utilizes various fine-tuned 7B models for precise and context-aware responses.
 - **Rich User Interface:** Features a user-friendly interface built with Gradio.
 - **Advanced Language Understanding:** Employs LangChain for implementing RAG setups and sophisticated natural language processing.
 - **Efficient Data Handling:** Leverages Chroma DB and FAISS for optimized data storage and retrieval.
@@ -34,7 +34,7 @@ This setup is tested with the following models:

 ## Prerequisites
 - Python 3.8 or later
-- [Additional prerequisites
+- [Additional prerequisites...]

 ## Installation
 1. Clone the repository:
@@ -68,14 +68,20 @@ python app.py
 [Include additional usage instructions and examples]

 ## Contributing
-Contributions to
+Contributions to LC-Gradio-DocsAI are welcome. Here's how you can contribute:
+
+1. Fork the repository.
+2. Create a new branch (git checkout -b feature/YourFeature).
+3. Make changes and commit (git commit -m 'Add some feature').
+4. Push to the branch (git push origin feature/YourFeature).
+5. Create a new Pull Request.

 ## Support
-For support,
+For support, please open an issue here on GitHub.

 ## Authors and Acknowledgement
 - [Name]
--
+- Thanks to the contributors of all the awesome open-source LLMs, LangChain, HuggingFace, Chroma Vector Store, FAISS, and the Gradio UI.

 ## License
 This project is licensed under the [License] - see the LICENSE file for details.
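The stack the new Overview names (a LangChain RAG chain over a Chroma or FAISS vector store, fronted by Gradio) is small enough to sketch. Below is a minimal, illustrative version using the classic LangChain imports and the same model and embedding repo ids that app.py uses; the `persist_directory` path and the single-question `gr.Interface` wrapper are hypothetical stand-ins, not code from this repo:

```python
# Minimal sketch of the Overview's stack: a LangChain RAG chain over a
# Chroma vector store, served through Gradio. Paths are illustrative.
import gradio as gr
from langchain.llms import HuggingFaceHub
from langchain.embeddings import HuggingFaceHubEmbeddings
from langchain.vectorstores import Chroma
from langchain.chains import RetrievalQA

llm = HuggingFaceHub(repo_id="HuggingFaceH4/zephyr-7b-beta",
                     model_kwargs={"temperature": 0.1, "max_new_tokens": 1024})
embeddings = HuggingFaceHubEmbeddings(
    repo_id="sentence-transformers/multi-qa-mpnet-base-dot-v1")

# Reuse a persisted Chroma collection; FAISS.load_local(...) works the same way.
db = Chroma(persist_directory="./vectorstore", embedding_function=embeddings)
qa = RetrievalQA.from_chain_type(llm=llm, retriever=db.as_retriever(search_type="mmr"))

def answer(question: str) -> str:
    # RetrievalQA fetches the top documents and stuffs them into the prompt.
    return qa.run(question)

gr.Interface(fn=answer, inputs="text", outputs="text").launch()
```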
app.py
CHANGED
@@ -1,7 +1,7 @@
 # gradio
 import gradio as gr
-import random
-import time
+#import random
+#import time
 #boto3 for S3 access
 import boto3
 from botocore import UNSIGNED
@@ -18,12 +18,22 @@ from langchain.vectorstores import Chroma
 from langchain.vectorstores import FAISS
 # retrieval chain
 from langchain.chains import RetrievalQA
+from langchain.chains import RetrievalQAWithSourcesChain
 # prompt template
 from langchain.prompts import PromptTemplate
 from langchain.memory import ConversationBufferMemory
 # logging
-
+import logging
 import zipfile
+#contextual retriever
+from langchain.retrievers import ContextualCompressionRetriever
+from langchain.retrievers.document_compressors import LLMChainExtractor
+from langchain.retrievers.document_compressors import EmbeddingsFilter
+from langchain.retrievers.multi_query import MultiQueryRetriever
+# streaming
+from threading import Thread
+from transformers import TextIteratorStreamer
+

 # load .env variables
 config = load_dotenv(".env")
@@ -32,6 +42,7 @@ AWS_S3_LOCATION=os.getenv('AWS_S3_LOCATION')
 AWS_S3_FILE=os.getenv('AWS_S3_FILE')
 VS_DESTINATION=os.getenv('VS_DESTINATION')

+# initialize Model config
 model_id = HuggingFaceHub(repo_id="HuggingFaceH4/zephyr-7b-beta", model_kwargs={
     "temperature":0.1,
     "max_new_tokens":1024,
@@ -43,7 +54,7 @@ model_id = HuggingFaceHub(repo_id="HuggingFaceH4/zephyr-7b-beta", model_kwargs={
 model_name = "sentence-transformers/multi-qa-mpnet-base-dot-v1"
 embeddings = HuggingFaceHubEmbeddings(repo_id=model_name)

-
+# retrieve vectorstore
 s3 = boto3.client('s3', config=Config(signature_version=UNSIGNED))

 ## Chroma DB
@@ -60,6 +71,12 @@ db.get()
 # db = FAISS.load_local(FAISS_INDEX_PATH, embeddings)

 retriever = db.as_retriever(search_type = "mmr")#, search_kwargs={'k': 5, 'fetch_k': 25})
+
+compressor = LLMChainExtractor.from_llm(model_id)
+compression_retriever = ContextualCompressionRetriever(base_compressor=compressor, base_retriever=retriever)
+# embeddings_filter = EmbeddingsFilter(embeddings=embeddings, similarity_threshold=0.76)
+# compression_retriever = ContextualCompressionRetriever(base_compressor=embeddings_filter, base_retriever=retriever)
+
 global qa
 template = """
 You are the friendly documentation buddy Arti, who helps the Human in using RAY, the open-source unified framework for scaling AI and Python applications.\
@@ -81,13 +98,28 @@ prompt = PromptTemplate(
     template=template,
 )
 memory = ConversationBufferMemory(memory_key="history", input_key="question")
-qa = RetrievalQA.from_chain_type(llm=model_id, chain_type="stuff", retriever=retriever, verbose=True, return_source_documents=True, chain_type_kwargs={
+
+# logging for the chain
+logging.basicConfig()
+logging.getLogger("langchain.chains").setLevel(logging.INFO)
+
+
+# qa = RetrievalQA.from_chain_type(llm=model_id, chain_type="stuff", retriever=compression_retriever, verbose=True, return_source_documents=True, chain_type_kwargs={
+#     "verbose": True,
+#     "memory": memory,
+#     "prompt": prompt
+#     }
+# )
+qa = RetrievalQAWithSourcesChain.from_chain_type(llm=model_id, retriever=compression_retriever, verbose=True, chain_type_kwargs={
     "verbose": True,
     "memory": memory,
-    "prompt": prompt
+    "prompt": prompt,
+    "document_variable_name": "context"
     }
 )

+def pretty_print_docs(docs):
+    print(f"\n{'-' * 100}\n".join([f"Document {i+1}:\n\n" + d.page_content for i, d in enumerate(docs)]))

 def add_text(history, text):
     history = history + [(text, None)]
@@ -95,18 +127,20 @@ def add_text(history, text):

 def bot(history):
     response = infer(history[-1][0], history)
+    print(*response)
     print(*memory)
-    sources = [doc.metadata.get("source") for doc in response['
+    sources = [doc.metadata.get("source") for doc in response['sources']]
     src_list = '\n'.join(sources)
-    print_this = response['
+    print_this = response['answer'] + "\n\n\n Sources: \n\n\n" + src_list
+    #sources = f"`Sources:`\n\n' + response['sources']"

     #history[-1][1] = ""
     #for character in response['result']: #print_this:
     #    history[-1][1] += character
     #    time.sleep(0.05)
     #    yield history
-    history[-1][1] =
-    return history
+    history[-1][1] = response['answer']
+    return history #, sources

 def infer(question, history):
     query = question
@@ -137,5 +171,4 @@ with gr.Blocks(css=css) as demo:
     )
     clear.click(lambda: None, None, chatbot, queue=False)

-demo.queue()
-demo.launch()
+demo.queue().launch()
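The substantive change to app.py is twofold: the live chain becomes RetrievalQAWithSourcesChain, whose response dict carries an `answer` plus source information, and the plain MMR retriever is wrapped in a ContextualCompressionRetriever driven by LLMChainExtractor, which asks the LLM to trim each retrieved document down to the passages relevant to the query. Here is a minimal sketch of that compression step in isolation, assuming the same classic LangChain APIs the file imports; the two-sentence FAISS corpus is a made-up stand-in for the Space's real Chroma store:

```python
# Sketch of the contextual-compression retrieval this commit introduces.
# The toy corpus below is illustrative only.
from langchain.llms import HuggingFaceHub
from langchain.embeddings import HuggingFaceHubEmbeddings
from langchain.vectorstores import FAISS
from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import LLMChainExtractor

llm = HuggingFaceHub(repo_id="HuggingFaceH4/zephyr-7b-beta")
embeddings = HuggingFaceHubEmbeddings(
    repo_id="sentence-transformers/multi-qa-mpnet-base-dot-v1")
db = FAISS.from_texts(
    ["Ray clusters scale Python workloads across nodes.",
     "Gradio builds quick web UIs for ML demos."],
    embeddings,
)

# LLMChainExtractor prompts the LLM to extract only the query-relevant
# spans from each document returned by the base retriever.
compressor = LLMChainExtractor.from_llm(llm)
retriever = ContextualCompressionRetriever(
    base_compressor=compressor,
    base_retriever=db.as_retriever(search_type="mmr"),
)

for doc in retriever.get_relevant_documents("How do I scale Python with Ray?"):
    print(doc.page_content)
```

One trade-off worth noting: LLMChainExtractor spends an extra LLM call per retrieved document, which is presumably why the commit keeps the cheaper EmbeddingsFilter variant nearby as a commented-out alternative.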