# Fine-Tuned LLM API

This is a FastAPI-based API service for the fine-tuned model "ManojINaik/Strength_weakness". The model is optimized for text generation with 4-bit quantization for efficient inference.

## API Endpoints

### GET /

Health check endpoint that confirms the API is running.

### POST /generate/

Generates text from a prompt, with optional parameters.

#### Request Body

```json
{
  "prompt": "What are the strengths of Python?",
  "history": [],                                            // Optional: list of previous conversation messages
  "system_prompt": "You are a very powerful AI assistant.", // Optional
  "max_length": 200,                                        // Optional: maximum length of the generated text
  "temperature": 0.7                                        // Optional: controls randomness (0.0 to 1.0)
}
```

#### Response

```json
{
  "response": "Generated text response..."
}
```

## Model Details

- Base Model: ManojINaik/Strength_weakness
- Quantization: 4-bit quantization using bitsandbytes
- Device: automatically uses a GPU if available, falling back to CPU
- Memory Efficient: uses device mapping for optimal resource utilization

## Technical Details

- Framework: FastAPI
- Python Version: 3.9+
- Key Dependencies:
  - transformers
  - torch
  - bitsandbytes
  - accelerate
  - peft

## Example Usage

```python
import requests

# Use your deployment's URL; the trailing slash matches the route above.
url = "https://your-space-name.hf.space/generate/"

payload = {
    "prompt": "What are the strengths of Python?",
    "temperature": 0.7,
    "max_length": 200
}

response = requests.post(url, json=payload)
response.raise_for_status()  # surface HTTP errors instead of failing on the JSON
print(response.json()["response"])
```
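The optional `history` and `system_prompt` fields suggest the server assembles a chat-style message list before generation. A minimal sketch of that assembly; the `build_messages` helper and the role/content message format are assumptions for illustration, not part of the documented API:

```python
def build_messages(prompt, history=None, system_prompt=None):
    """Hypothetical helper: merge the optional system prompt, any prior
    conversation turns, and the new user prompt into one message list."""
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.extend(history or [])  # prior turns, assumed role/content dicts
    messages.append({"role": "user", "content": prompt})
    return messages


# Mirrors the example request body above: empty history plus a system prompt.
msgs = build_messages(
    "What are the strengths of Python?",
    history=[],
    system_prompt="You are a very powerful AI assistant.",
)
```

With an empty history this yields a two-message list: the system prompt followed by the user prompt.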
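The Model Details above (4-bit quantization via bitsandbytes, automatic device mapping) correspond to a standard `transformers` loading pattern. The following is a configuration sketch under the assumption that the service uses `BitsAndBytesConfig` and `device_map="auto"`; the exact settings used by this Space are not documented here:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization config (bitsandbytes); the compute dtype is an assumption.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model_id = "ManojINaik/Strength_weakness"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # places layers on GPU when available, else CPU
)
```

`device_map="auto"` (via accelerate) is what gives the GPU-with-CPU-fallback behavior listed under Model Details.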