praneethposina committed · Commit e97ed61 · verified · 1 Parent(s): 290e614

Update README.md

Files changed (1):
  1. README.md +62 -21
README.md CHANGED
@@ -1,21 +1,62 @@
- ---
- base_model: unsloth/llama-3-8b-bnb-4bit
- language:
- - en
- license: apache-2.0
- tags:
- - text-generation-inference
- - transformers
- - unsloth
- - llama
- - gguf
- datasets:
- - bitext/Bitext-customer-support-llm-chatbot-training-dataset
- pipeline_tag: text-generation
- ---
-
- # Uploaded model
-
- - **Developed by:** praneethposina
- - **License:** apache-2.0
- - **Finetuned from model :** unsloth/llama-3-8b-bnb-4bit
+ # Customer Support Chatbot with LLaMA 3.1
+
+ > An end-to-end customer support chatbot powered by a fine-tuned LLaMA 3.1 8B model, deployed using Flask, Docker, and AWS ECS.
+
+ ## Overview
+
+ This project implements a customer support chatbot built on the LLaMA 3.1 8B model, fine-tuned on customer support conversations. The solution uses LoRA fine-tuning and several quantization levels for optimized inference, and is deployed as a containerized application on AWS ECS with Fargate.
+
+ ## Features
+
+ - **Fine-tuned LLaMA 3.1 Model**: Customized for customer support using the [Bitext customer support dataset](https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset)
+ - **Optimized Inference**: Implements 4-bit, 8-bit, and 16-bit quantization
+ - **Containerized Deployment**: Docker-based deployment for consistency and scalability
+ - **Cloud Infrastructure**: Hosted on AWS ECS with Fargate for serverless container management
+ - **CI/CD Pipeline**: Automated deployment using AWS CodePipeline
+ - **Monitoring**: Logging and monitoring via AWS CloudWatch
+
+ ## Model Details
+
+ The fine-tuned model is hosted on Hugging Face:
+ - Model Repository: [praneethposina/customer_support_bot](https://huggingface.co/praneethposina/customer_support_bot)
+ - GitHub Repository: [praneethposina/Customer_Support_Chatbot](https://github.com/praneethposina/Customer_Support_Chatbot)
+ - Base Model: LLaMA 3.1 8B
+ - Training Dataset: Bitext Customer Support Dataset
+ - Optimization: LoRA fine-tuning with quantization
+
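Since the model is served through Ollama (see Tech Stack below), it can be queried over Ollama's local REST API. A minimal sketch, assuming the model has been pulled into Ollama under the tag `customer_support_bot` and that the instruction-style prompt template matches the one used during fine-tuning (both are assumptions; adjust to your setup):

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_payload(question: str, model: str = "customer_support_bot") -> dict:
    """Wrap a customer question in an instruction-style prompt.

    The exact template is an assumption; it should mirror the format
    used when fine-tuning on the Bitext dataset.
    """
    return {
        "model": model,
        "prompt": f"### Instruction:\n{question}\n\n### Response:\n",
        "stream": False,  # return one complete JSON response instead of a stream
    }

def ask(question: str) -> str:
    """Send the prompt to Ollama and return the generated reply."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(question)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With an Ollama server running locally, `ask("How do I cancel my order?")` would return the model's generated answer.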
+ ## Tech Stack
+
+ - **Backend**: Flask API
+ - **Model Serving**: Ollama
+ - **Containerization**: Docker
+ - **Cloud Services**:
+   - AWS ECS (Fargate)
+   - AWS CodePipeline
+   - AWS CloudWatch
+ - **Model Training**: LoRA, Quantization
+
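The Flask backend in the stack above essentially exposes a chat endpoint that forwards user messages to the model server. A minimal sketch, assuming a `/chat` route and a `{"message": ...}` request shape (both are assumptions, not the project's actual API); the model call is stubbed out so the example stands alone:

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def generate_reply(question: str) -> str:
    # Placeholder for the model call: the real app would forward the
    # question to the Ollama server and return its generated response.
    return f"(model reply to: {question})"

@app.route("/chat", methods=["POST"])
def chat():
    """Accept a JSON message and return the chatbot's reply as JSON."""
    question = request.get_json(force=True).get("message", "")
    return jsonify({"reply": generate_reply(question)})
```

Running `app.run(host="0.0.0.0", port=5000)` inside the container exposes the endpoint that the chatbot interface talks to.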
+ ## Screenshots
+
+ ### Chatbot Interface
+
+ ![Chatbot SS](https://github.com/user-attachments/assets/220aea77-bb2b-4f50-b6a4-0541434d85ef)
+
+ ![Chatbot SS2](https://github.com/user-attachments/assets/da440735-59d7-4be7-a43d-d51de8983738)
+
+ ### AWS CloudWatch Monitoring
+
+ ![CloudWatch SS](https://github.com/user-attachments/assets/9794bc3e-4b9c-4626-9a7f-3936d4757328)
+
+ ### Docker Logs
+
+ <img width="1270" alt="Docker ss" src="https://github.com/user-attachments/assets/a72d1c35-8203-4a05-b944-743ea6c0a6b8" />
+ <img width="1268" alt="Docker ss2" src="https://github.com/user-attachments/assets/f1b0c0b1-2aad-462c-adf2-7a7ea9047a1a" />
+
+ ## AWS Deployment
+
+ 1. Push Docker image to Amazon ECR
+ 2. Configure AWS ECS Task Definition
+ 3. Set up AWS CodePipeline for CI/CD
+ 4. Configure CloudWatch monitoring
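Step 2 above boils down to a JSON task definition that ECS registers (e.g. via `aws ecs register-task-definition --cli-input-json file://task.json`). A minimal sketch of building one for a Fargate service; the CPU/memory sizes, container port, region, and log-group name are illustrative assumptions, not the project's actual values:

```python
import json

def fargate_task_definition(image_uri: str, family: str = "chatbot-task") -> dict:
    """Build a minimal Fargate task definition for the chatbot container.

    All sizing and naming values are illustrative assumptions; Fargate
    requires awsvpc networking and specific CPU/memory combinations.
    """
    return {
        "family": family,
        "requiresCompatibilities": ["FARGATE"],
        "networkMode": "awsvpc",            # mandatory for Fargate tasks
        "cpu": "1024",                       # 1 vCPU (assumed)
        "memory": "4096",                    # 4 GB (assumed)
        "containerDefinitions": [
            {
                "name": "chatbot",
                "image": image_uri,          # the ECR image pushed in step 1
                "portMappings": [{"containerPort": 5000, "protocol": "tcp"}],
                "logConfiguration": {        # ships container logs to CloudWatch (step 4)
                    "logDriver": "awslogs",
                    "options": {
                        "awslogs-group": "/ecs/chatbot",
                        "awslogs-region": "us-east-1",
                        "awslogs-stream-prefix": "ecs",
                    },
                },
            }
        ],
    }

if __name__ == "__main__":
    td = fargate_task_definition("123456789012.dkr.ecr.us-east-1.amazonaws.com/chatbot:latest")
    print(json.dumps(td, indent=2))
```

The `awslogs` log driver is what makes the container logs appear in CloudWatch, tying steps 2 and 4 together.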