Spaces: nikhiljais/Phi2-QLoRa-OASST


nikhiljais committed on Mar 7

Commit ea6cbcb (verified) · 1 parent: bd1714b

Upload folder using huggingface_hub

Files changed (3)

README.md +35 -12
app.py +103 -0
requirements.txt +6 -0

README.md CHANGED

@@ -1,12 +1,35 @@
----
-title: Phi2 QLoRa OASST
-emoji: 🌍
-colorFrom: pink
-colorTo: blue
-sdk: gradio
-sdk_version: 5.20.1
-app_file: app.py
-pinned: false
----
-Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

+# Phi-2 Fine-tuned Chat Assistant
+This Space hosts a fine-tuned version of Microsoft's Phi-2 model using QLoRA (Quantized Low-Rank Adaptation). The model has been trained on the OpenAssistant dataset to improve its conversational abilities.
+## Model Details
+- Base Model: Microsoft Phi-2
+- Training Method: QLoRA (4-bit quantization)
+- Dataset: OpenAssistant Conversations Dataset
+- Fine-tuning Parameters:
+  - LoRA rank: 16
+  - LoRA alpha: 32
+  - Dropout: 0.1
+  - Target modules: q_proj, v_proj
+## Usage
+Simply type your message in the input box and press Enter. The model will generate a response based on your input. You can also try the example prompts provided below the chat interface.
+## Features
+- Interactive chat interface
+- Real-time response generation
+- Example prompts for quick testing
+- Configurable generation parameters (temperature, top-p)
+## Limitations
+- The model may occasionally generate incorrect or inconsistent responses
+- Response generation time may vary depending on the input length and server load
+- The model's knowledge is limited to its training data
+## License
+This Space uses the Microsoft Phi-2 model which is subject to its original license. The fine-tuning additions are provided under [Your License].
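
The LoRA settings listed under Model Details (rank 16, alpha 32, target modules q_proj and v_proj) imply a scaling factor and a rough trainable-parameter count. The sketch below works through that arithmetic. It assumes Phi-2 has 32 decoder layers with a hidden size of 2560 and that both projections are square; the commit does not state these figures, so treat the result as an estimate rather than a measured number.

```python
# Rough count of trainable LoRA parameters for the settings listed in the README.
# Assumptions (not stated in the commit): Phi-2 has 32 decoder layers with a
# hidden size of 2560, and q_proj / v_proj are square 2560 x 2560 projections.

LORA_RANK = 16
LORA_ALPHA = 32
TARGET_MODULES = ("q_proj", "v_proj")

HIDDEN_SIZE = 2560  # assumed Phi-2 hidden size
NUM_LAYERS = 32     # assumed Phi-2 layer count


def lora_params_per_module(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds A (rank x d_in) and B (d_out x rank) next to a frozen weight."""
    return rank * d_in + d_out * rank


def total_lora_params() -> int:
    """Sum the adapter parameters over every targeted module in every layer."""
    per_module = lora_params_per_module(HIDDEN_SIZE, HIDDEN_SIZE, LORA_RANK)
    return per_module * len(TARGET_MODULES) * NUM_LAYERS


# Factor applied to the low-rank update B @ A before it is added to the output.
scaling = LORA_ALPHA / LORA_RANK

print(f"scaling = {scaling}")
print(f"trainable LoRA parameters = {total_lora_params():,}")
```

Under these assumptions the adapters add about 5.2 million trainable parameters, which is a small fraction of the base model and is the main reason the adapter can be shipped separately from the `microsoft/phi-2` weights.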

app.py ADDED

	@@ -0,0 +1,103 @@

+import gradio as gr
+from transformers import AutoModelForCausalLM, AutoTokenizer
+from peft import PeftModel
+import torch
+# Model configuration
+MODEL_PATH = "YOUR_HF_USERNAME/YOUR_MODEL_NAME"  # Replace with your model path
+BASE_MODEL = "microsoft/phi-2"
+class Phi2Chat:
+    def __init__(self):
+        print("Loading tokenizer...")
+        self.tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
+        print("Loading base model...")
+        base_model = AutoModelForCausalLM.from_pretrained(
+            BASE_MODEL,
+            device_map="auto",
+            torch_dtype=torch.float16
+        )
+        print("Loading fine-tuned model...")
+        self.model = PeftModel.from_pretrained(base_model, MODEL_PATH)
+        self.model.eval()
+        self.chat_template = """<|im_start|>user
+{prompt}\n<|im_end|>
+<|im_start|>assistant
+"""
+    def generate_response(
+        self,
+        prompt: str,
+        max_new_tokens: int = 300,
+        temperature: float = 0.7,
+        top_p: float = 0.9
+    ) -> str:
+        formatted_prompt = self.chat_template.format(prompt=prompt)
+        inputs = self.tokenizer(formatted_prompt, return_tensors="pt").to(self.model.device)
+        with torch.no_grad():
+            output = self.model.generate(
+                **inputs,
+                max_new_tokens=max_new_tokens,
+                temperature=temperature,
+                top_p=top_p,
+                do_sample=True
+            )
+        # Decode only the newly generated tokens; the prompt occupies the first
+        # input_ids.shape[1] positions of the output sequence.
+        prompt_length = inputs["input_ids"].shape[1]
+        response = self.tokenizer.decode(output[0][prompt_length:], skip_special_tokens=True)
+        # Trim anything after an end-of-turn marker that survived decoding
+        response = response.split("<|im_end|>")[0].strip()
+        return response
+# Initialize model
+phi2_chat = Phi2Chat()
+def chat_response(message, history):
+    response = phi2_chat.generate_response(message)
+    return response
+# Create Gradio interface
+css = """
+.gradio-container {
+    font-family: 'IBM Plex Sans', sans-serif;
+}
+.chat-message {
+    padding: 1rem;
+    border-radius: 0.5rem;
+    margin-bottom: 1rem;
+    background: #f7f7f7;
+}
+"""
+with gr.Blocks(css=css) as demo:
+    gr.Markdown("# Phi-2 Fine-tuned Chat Assistant")
+    gr.Markdown("""
+    This is a fine-tuned version of Microsoft's Phi-2 model using QLoRA.
+    The model has been trained on the OpenAssistant dataset to improve its conversational abilities.
+    """)
+    chatbot = gr.ChatInterface(
+        chat_response,
+        chatbot=gr.Chatbot(height=400),
+        textbox=gr.Textbox(placeholder="Type your message here...", container=False, scale=7),
+        title="Chat with Phi-2",
+        description="Have a conversation with the fine-tuned Phi-2 model",
+        theme="soft",
+        examples=[
+            "What is quantum computing?",
+            "Write a Python function to find prime numbers",
+            "Explain the concept of machine learning in simple terms"
+        ],
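+        # Gradio 5 removed the retry_btn, undo_btn and clear_btn arguments from
+        # gr.ChatInterface; passing them raises a TypeError at startup, so they
+        # are omitted here.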
+    )
+demo.launch()
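
In app.py, `chat_response` receives `history` but passes only `message` to the model, so each reply is generated without earlier turns. The sketch below shows one way the history could be folded into a prompt using the same ChatML-style markers as `chat_template`. It assumes `history` arrives as (user, assistant) pairs; the `build_prompt` name and the `max_turns` cap are hypothetical and not part of the commit.

```python
from typing import List, Sequence, Tuple

# ChatML-style markers copied from the chat_template in app.py.
USER_TURN = "<|im_start|>user\n{content}\n<|im_end|>\n"
ASSISTANT_TURN = "<|im_start|>assistant\n{content}\n<|im_end|>\n"
ASSISTANT_PREFIX = "<|im_start|>assistant\n"


def build_prompt(
    message: str,
    history: Sequence[Tuple[str, str]],
    max_turns: int = 3,  # hypothetical cap that keeps the prompt short
) -> str:
    """Fold the most recent (user, assistant) pairs and the new message into one prompt."""
    parts: List[str] = []
    # Guard against max_turns == 0, where a [-0:] slice would keep everything.
    recent = list(history)[-max_turns:] if max_turns > 0 else []
    for user_text, assistant_text in recent:
        parts.append(USER_TURN.format(content=user_text))
        parts.append(ASSISTANT_TURN.format(content=assistant_text))
    parts.append(USER_TURN.format(content=message))
    # End with an open assistant turn so the model continues from there.
    parts.append(ASSISTANT_PREFIX)
    return "".join(parts)


if __name__ == "__main__":
    demo_history = [("Hi", "Hello! How can I help?")]
    print(build_prompt("What is QLoRA?", demo_history))
```

With an empty history this produces the same string as the original single-turn template. If the chatbot is configured to deliver history as role/content dictionaries instead of pairs, the loop would need to be adapted accordingly.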

requirements.txt ADDED

	@@ -0,0 +1,6 @@

+transformers>=4.36.0
+torch>=2.0.0
+peft>=0.7.0
+accelerate>=0.25.0
+bitsandbytes>=0.41.0
+gradio>=4.0.0
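
Because `gradio>=4.0.0` admits more than one major version, a keyword argument accepted by one release may be rejected by another. A defensive pattern is to pass only the arguments the installed constructor actually declares. The sketch below illustrates the idea with two hypothetical stand-in functions (`old_style_interface` and `new_style_interface`) rather than Gradio itself; `supported_kwargs` is also a name introduced here for illustration.

```python
import inspect
from typing import Any, Callable, Dict


def supported_kwargs(func: Callable[..., Any], candidates: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the keyword arguments that `func` declares in its signature."""
    params = inspect.signature(func).parameters
    # A callable that takes **kwargs accepts any keyword, so keep everything.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(candidates)
    return {name: value for name, value in candidates.items() if name in params}


# Hypothetical stand-ins for two releases of the same constructor.
def old_style_interface(fn, retry_btn=None, undo_btn=None, clear_btn=None):
    return {"retry_btn": retry_btn, "undo_btn": undo_btn, "clear_btn": clear_btn}


def new_style_interface(fn):
    return {}


optional = {"retry_btn": "Retry", "undo_btn": "Undo", "clear_btn": "Clear"}

print(supported_kwargs(old_style_interface, optional))
print(supported_kwargs(new_style_interface, optional))
```

Pinning an exact version in requirements.txt is the simpler alternative when the Space only needs to run on one Gradio release.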