Customer Support Chatbot with LLaMA 3.1
An end-to-end customer support chatbot solution powered by a fine-tuned LLaMA 3.1 8B model, deployed with Flask, Docker, and AWS ECS.
Overview
This project implements a customer support chatbot built on the LLaMA 3.1 8B model, fine-tuned on customer support conversations. The solution uses LoRA fine-tuning and quantization (4-bit, 8-bit, and 16-bit) for efficient inference, and is deployed as a containerized application on AWS ECS with Fargate.
Features
- Fine-tuned LLaMA 3.1 Model: Customized for customer support using the Bitext customer support dataset
- Optimized Inference: Implements 4-bit, 8-bit, and 16-bit quantization (see the loading sketch after this list)
- Containerized Deployment: Docker-based deployment for consistency and scalability
- Cloud Infrastructure: Hosted on AWS ECS with Fargate for serverless container management
- CI/CD Pipeline: Automated deployment using AWS CodePipeline
- Monitoring: Comprehensive logging and monitoring via AWS CloudWatch
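As a rough illustration of the quantized-inference idea, the snippet below loads the Hub checkpoint in 4-bit with bitsandbytes. It assumes the repo contains merged weights (if it only holds a LoRA adapter, load the base model first and attach the adapter with peft); the prompt and generation settings are illustrative, not the repo's actual inference code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit NF4 quantization; swap in load_in_8bit=True or plain bf16 loading
# to reproduce the 8-bit / 16-bit variants mentioned above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model_id = "praneethposina/customer_support_bot"  # assumes merged weights on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

prompt = "My order hasn't arrived yet. What should I do?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```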
Model Details
The fine-tuned model is hosted on Hugging Face:
- Model Repository: praneethposina/customer_support_bot
- GitHub Repository: github.com/praneethposina/Customer_Support_Chatbot
- Base Model: LLaMA 3.1 8B
- Training Dataset: Bitext Customer Support Dataset
- Optimization: LoRA fine-tuning with quantization (a minimal configuration sketch follows this list)
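The training code is not shown in this README; as a hedged sketch, LoRA fine-tuning on top of the 4-bit base model typically looks like the following. The rank, alpha, and target modules here are illustrative defaults, not the project's actual hyperparameters.

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_id = "unsloth/llama-3-8b-bnb-4bit"  # 4-bit base listed on the model card
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Low-rank adapters on the attention projections; only these adapter weights
# are trained, so the 8B base stays frozen in 4-bit.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

From here the adapter is trained on the Bitext conversations with a standard supervised fine-tuning loop and the resulting weights pushed to the Hub repository above.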
Tech Stack
- Backend: Flask API
- Model Serving: Ollama (see the Flask-to-Ollama sketch after this list)
- Containerization: Docker
- Cloud Services:
  - AWS ECS (Fargate)
  - AWS CodePipeline
  - AWS CloudWatch
- Model Training: LoRA, Quantization
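A minimal sketch of how the Flask API and Ollama fit together is below; the route name, port, and "support-bot" model tag are assumptions, not values taken from the repo.

```python
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

@app.route("/chat", methods=["POST"])
def chat():
    user_message = request.get_json(force=True).get("message", "")
    # Forward the message to the locally running Ollama server and return
    # the model's reply; "support-bot" is a placeholder model tag.
    resp = requests.post(
        OLLAMA_URL,
        json={"model": "support-bot", "prompt": user_message, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return jsonify({"reply": resp.json()["response"]})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

A client then POSTs JSON such as {"message": "Where is my order?"} to /chat and receives the generated reply.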
Screenshots
Chatbot Interface
AWS CloudWatch Monitoring
Docker Logs
AWS Deployment
- Push Docker image to Amazon ECR
- Configure AWS ECS Task Definition (a boto3 sketch follows these steps)
- Set up AWS CodePipeline for CI/CD
- Configure CloudWatch monitoring
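For illustration, the ECS and CloudWatch pieces of these steps can be scripted with boto3 roughly as below; the region, role ARN, ECR image URI, and log group are placeholders, not this project's actual values.

```python
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

# Register a Fargate task definition whose container logs stream to CloudWatch.
response = ecs.register_task_definition(
    family="customer-support-chatbot",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="1024",      # 1 vCPU
    memory="4096",   # 4 GB
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",  # placeholder
    containerDefinitions=[
        {
            "name": "chatbot",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/customer-support-chatbot:latest",
            "portMappings": [{"containerPort": 5000, "protocol": "tcp"}],
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/customer-support-chatbot",
                    "awslogs-region": "us-east-1",
                    "awslogs-stream-prefix": "chatbot",
                },
            },
        }
    ],
)
print(response["taskDefinition"]["taskDefinitionArn"])
```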
Uploaded model
- Developed by: praneethposina
- License: apache-2.0
- Finetuned from model: unsloth/llama-3-8b-bnb-4bit
Model tree for praneethposina/customer_support_bot
- Base model: meta-llama/Meta-Llama-3-8B
- Quantized: unsloth/llama-3-8b-bnb-4bit