BITShyd: AI-Enhanced Facial Expression & Mental Health Counseling
Project Summary
BITShyd is an innovative AI project developed at BITS Pilani, Hyderabad, designed to bridge the gap between facial expression recognition and mental health counseling. Using a combination of conversational AI fine-tuned on the Amod/mental_health_counseling_conversations dataset and real-time facial expression analysis, the model provides contextual responses tailored to the emotional state of the user. This empathetic AI supports mental health applications and interactive emotional support systems.
Key Features and Technology
- Facial Expression Recognition: Detects emotions (happy, sad, angry, surprised, etc.) using real-time video input, enhanced with
face-api.js
. - Conversational AI for Mental Health: The model generates empathetic, supportive responses in mental health contexts, using fine-tuning on the Amod dataset.
- Low-Rank Adaptation (LoRA) and Unsloth Fine-Tuning: These techniques streamline adaptation and latency, allowing use across hardware setups, from personal devices to cloud platforms.
Technical Overview
- Base Model: LLaMA 3.1 (8B parameters) on Hugging Face
- Dataset: Amod/mental_health_counseling_conversations
- Optimization: LoRA for lightweight tuning and Unsloth for improved inference latency
- Supported Use Cases: Virtual mental health counseling, empathetic AI interactions, real-time emotional analysis
Fine-Tuning Details
The model utilizes LoRA and Unsloth fine-tuning, which make it adaptable to a range of devices by reducing the resource load without compromising on performance:
- Parameter-Efficient Training: LoRA keeps the model’s core knowledge while adapting to new data, critical for sensitive applications like mental health.
- Latency Optimization: Unsloth helps reduce response time, essential for real-time interaction.
- Training Configurations:
Parameter | Value |
---|---|
Model Size | 8 Billion Parameters |
Epochs | 3 |
Learning Rate | 5e-5 |
Batch Size | 8 |
Dataset | Amod/mental_health_counseling_conversations |
Optimizations | LoRA and Unsloth |
Code for Facial Expression Recognition
Below is a snippet using face-api.js
for real-time expression detection and emoji overlay based on emotions detected:
const video = document.getElementById('video');
const expressionsToEmoji = { happy: '😊', sad: '😢', angry: '😠', surprised: '😮', fearful: '😨' };
async function loadModels() {
await faceapi.nets.tinyFaceDetector.loadFromUri('models');
await faceapi.nets.faceExpressionNet.loadFromUri('models');
await startVideo();
}
async function startVideo() {
const stream = await navigator.mediaDevices.getUserMedia({ video: true });
video.srcObject = stream;
}
async function detectFaces() {
const detections = await faceapi.detectAllFaces(video, new faceapi.TinyFaceDetectorOptions())
.withFaceExpressions();
// Render emoji based on emotion
detections.forEach(detection => {
const expressions = detection.expressions;
const topExpression = Object.keys(expressions).reduce((a, b) => expressions[a] > expressions[b] ? a : b);
const emoji = expressionsToEmoji[topExpression] || '🙂';
// Code to overlay emoji on video feed...
});
}
loadModels();
Pros & Cons of Fine-Tuning Techniques
Advantages
- Efficient Adaptability: LoRA enables fine-tuning with minimal additional parameters, making the model flexible for different datasets without extensive resources.
- Reduced Latency: Unsloth optimizes real-time response speed, vital for real-time interactions in mental health applications.
- Scalability: These techniques allow the model to run on a range of devices, expanding access to diverse user bases.
Disadvantages
- Limited Transfer to Drastically Different Domains: LoRA and Unsloth are efficient within the trained domain but may struggle with applications vastly different from the original training data.
- Computational Constraints: Although reduced, fine-tuning large models can still demand significant resources for high-quality performance in real-world applications.
Why Building Projects is Crucial Beyond DSA Skills
Building real-world projects like BITShyd is invaluable compared to simply mastering algorithms and data structures (DSA). While DSA provides foundational skills for coding efficiency, project-based learning fosters critical skills like problem-solving, innovation, and practical application. These skills are essential in building technology that interacts with the real world, like empathetic AI. In a 9-to-5 job, growth is often linear, but personal projects allow you to create, innovate, and potentially develop tech that impacts millions—especially in fields like AI, where emotional intelligence and empathy are groundbreaking.
BITShyd Project represents this mindset, and we encourage developers to focus on projects that break conventional boundaries and offer more than predefined answers to problems.I aim to encourage building, experimenting, and understanding the end-to-end process of creating impactful technology.
Future Enhancements
- Expanded Emotion Detection: Plan to integrate a broader set of emotional expressions and even body language for deeper analysis.
- Deployment Options: Considering integration with cloud platforms for broad accessibility and robust interaction.
- Error Handling: Addressing the recent challenge with the Ollama API, ensuring smooth integration for stable performance.
Canva Link to PPT https://www.canva.com/design/DAGWDV6SGIw/zCkKWQDUulSIS4acv5cKmg/view?utm_content=DAGWDV6SGIw&utm_campaign=designshare&utm_medium=link&utm_source=editor
- Downloads last month
- 9
Model tree for LOHAMEIT/BITShyd
Base model
meta-llama/Llama-3.1-8B