Customer Support Chatbot with LLaMA 3.1

An end-to-end customer support chatbot powered by a fine-tuned LLaMA 3.1 8B model, deployed using Flask, Docker, and AWS ECS.

Overview

This project implements a customer support chatbot built on the LLaMA 3.1 8B model, fine-tuned on customer support conversations. The solution uses LoRA fine-tuning and 4-, 8-, and 16-bit quantization for efficient inference, and is deployed as a containerized application on AWS ECS with Fargate.
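
The fine-tuning step can be pictured as a standard LoRA setup. The sketch below uses the transformers and peft libraries with the base checkpoint named in the model card further down; the rank, alpha, dropout, and target modules are illustrative assumptions, not the project's exact training configuration.

```python
# Minimal LoRA fine-tuning setup sketch (assumed hyperparameters).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = "unsloth/llama-3-8b-bnb-4bit"  # base checkpoint named in the model card

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    device_map="auto",
    torch_dtype=torch.bfloat16,
)

# Attach low-rank adapters to the attention projections; only these small
# adapter matrices are trained, which keeps GPU memory requirements modest.
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

The adapter weights produced this way can then be merged and exported (here, to GGUF) for serving.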

Features

  • Fine-tuned LLaMA 3.1 Model: Customized for customer support using the Bitext customer support dataset
  • Optimized Inference: Supports 4-bit, 8-bit, and 16-bit quantization (see the loading sketch after this list)
  • Containerized Deployment: Docker-based deployment for consistency and scalability
  • Cloud Infrastructure: Hosted on AWS ECS with Fargate for serverless container management
  • CI/CD Pipeline: Automated deployment using AWS CodePipeline
  • Monitoring: Comprehensive logging and monitoring via AWS CloudWatch
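
As context for the quantization levels listed above, here is a rough sketch of how 4-bit versus 8-bit loading looks with transformers and bitsandbytes. The checkpoint name is a placeholder for a merged (non-GGUF) export of the fine-tuned model; the published Hub repo ships GGUF files that are served through Ollama instead, as described under Tech Stack.

```python
# Hedged sketch of 4-bit / 8-bit / 16-bit loading with bitsandbytes.
# "merged-customer-support-llama" is a placeholder path, not the real repo.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_4bit = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
quant_8bit = BitsAndBytesConfig(load_in_8bit=True)

model = AutoModelForCausalLM.from_pretrained(
    "merged-customer-support-llama",   # placeholder checkpoint name
    quantization_config=quant_4bit,    # or quant_8bit; omit for plain 16-bit
    torch_dtype=torch.bfloat16,        # compute / full-precision dtype
    device_map="auto",
)
```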

Model Details

The fine-tuned model is hosted on Hugging Face as praneethposina/customer_support_bot; see the Uploaded Model section below for details.

Tech Stack

  • Backend: Flask API
  • Model Serving: Ollama (see the serving sketch after this list)
  • Containerization: Docker
  • Cloud Services:
    • AWS ECS (Fargate)
    • AWS CodePipeline
    • AWS CloudWatch
  • Model Training: LoRA, Quantization
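
To illustrate how the Flask backend and Ollama model server fit together, here is a minimal serving sketch. The /chat route, the local model tag customer_support_bot, and the ports are assumptions for illustration; generation goes through Ollama's default HTTP API on port 11434.

```python
# Minimal Flask endpoint that forwards a user message to a local Ollama server.
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

@app.route("/chat", methods=["POST"])
def chat():
    user_message = request.json.get("message", "")
    resp = requests.post(
        OLLAMA_URL,
        json={
            "model": "customer_support_bot",  # assumed local Ollama model tag
            "prompt": user_message,
            "stream": False,
        },
        timeout=120,
    )
    resp.raise_for_status()
    # Ollama returns the generated text in the "response" field.
    return jsonify({"reply": resp.json().get("response", "")})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
```

A POST to /chat with a JSON body such as {"message": "Where is my order?"} would return the model's reply as JSON.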

Screenshots

Chatbot Interface

(Screenshots of the chatbot interface)

AWS CloudWatch Monitoring

(Screenshot of the CloudWatch monitoring dashboard)

Docker Logs

(Screenshots of Docker container logs)

AWS Deployment

  1. Push Docker image to Amazon ECR
  2. Configure the AWS ECS task definition (see the sketch after this list)
  3. Set up AWS CodePipeline for CI/CD
  4. Configure CloudWatch monitoring
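
Steps 2 and 4 can be handled together when registering the task definition, since the CloudWatch log group is wired in through the container's log configuration. The boto3 sketch below is a rough illustration; the account ID, role ARN, image URI, log group name, and CPU/memory sizes are placeholders, not the project's actual values.

```python
# Register a Fargate task definition for the chatbot container (placeholder values).
import boto3

ecs = boto3.client("ecs", region_name="us-east-1")

ecs.register_task_definition(
    family="customer-support-chatbot",
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="1024",
    memory="4096",
    executionRoleArn="arn:aws:iam::123456789012:role/ecsTaskExecutionRole",
    containerDefinitions=[
        {
            "name": "chatbot",
            "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/customer-support-chatbot:latest",
            "portMappings": [{"containerPort": 5000, "protocol": "tcp"}],
            "essential": True,
            # Ship container logs to CloudWatch (step 4).
            "logConfiguration": {
                "logDriver": "awslogs",
                "options": {
                    "awslogs-group": "/ecs/customer-support-chatbot",
                    "awslogs-region": "us-east-1",
                    "awslogs-stream-prefix": "ecs",
                },
            },
        }
    ],
)
```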

Uploaded Model

  • Developed by: praneethposina
  • License: apache-2.0
  • Fine-tuned from: unsloth/llama-3-8b-bnb-4bit
  • Format: GGUF
  • Model size: 8.03B params
  • Architecture: llama
  • Available quantizations: 4-bit, 5-bit, 8-bit, 16-bit
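
One way to fetch a specific quantization from the Hub is with huggingface_hub. The sketch below is illustrative only; the GGUF filename is a guess, so check the repository's file listing for the real names.

```python
# Download one of the GGUF quantizations listed above from the Hugging Face Hub.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="praneethposina/customer_support_bot",
    filename="customer_support_bot.Q4_K_M.gguf",  # hypothetical 4-bit file name
)
print(gguf_path)  # local cache path of the downloaded GGUF file
```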

