
🔧 AI Dataset Studio - Complete Troubleshooting Guide

🚨 Immediate Fix for Current Error

Error: "DatasetStudio is not defined"

NameError: name 'DatasetStudio' is not defined

✅ SOLUTION: Replace your current app.py with the complete version, the one that actually defines DatasetStudio.

Quick Fix Steps:

  1. Replace app.py - Use the complete version, which defines DatasetStudio
  2. Add missing files - Add every file marked ❌ in the checklist below
  3. Restart your Space - The NameError should no longer appear
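
Before restarting, it can help to confirm that the app.py you uploaded really contains the definition. The following is a minimal sketch, assuming DatasetStudio is a class and that app.py sits in the current directory; `defines_class` is a hypothetical helper, not part of the project.

```python
import ast
from pathlib import Path


def defines_class(source: str, class_name: str) -> bool:
    """Return True if a class with this name is defined anywhere in the source."""
    tree = ast.parse(source)
    return any(
        isinstance(node, ast.ClassDef) and node.name == class_name
        for node in ast.walk(tree)
    )


if __name__ == "__main__":
    app_path = Path("app.py")  # assumption: run from the Space's root directory
    if not app_path.exists():
        print("app.py not found in the current directory")
    elif defines_class(app_path.read_text(encoding="utf-8"), "DatasetStudio"):
        print("app.py defines DatasetStudio")
    else:
        print("app.py does not define DatasetStudio; replace it with the complete version")
```

The check parses the file without executing it, so it is safe to run even when app.py crashes on import.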

📁 Files You Need (Complete Checklist)

| File | Status | Purpose |
|---|---|---|
| ✅ app.py | Replace yours | Main application (complete version) |
| ❌ app_minimal.py | Missing | Fallback version (basic deps only) |
| ✅ requirements.txt | Have it | Dependencies |
| ✅ README.md | Have it | Documentation |
| ✅ config.py | Have it | Configuration |
| ❌ utils.py | Incomplete | Utility functions |
| ❌ startup.py | Missing | Smart launcher |
| ❌ TROUBLESHOOTING.md | Missing | This guide |

🚀 Quick Deployment Options

Option 1: Immediate Fix (Recommended)

# Replace app.py with the complete version (the one that defines DatasetStudio)
# This resolves the DatasetStudio NameError

Option 2: Minimal Version (Guaranteed to Work)

# Use app_minimal.py as your main app.py
# This version works with basic dependencies only

Option 3: Smart Startup (Auto-Detect)

# Use startup.py as your main app.py
# Automatically chooses the best version to run
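
The auto-detect idea can be sketched as follows. This is not the actual startup.py, which is not shown in this guide; it is a hypothetical illustration that assumes the full version needs the packages from the full installation command (transformers, torch, nltk, datasets) and that the two entry points are importable as `app` and `app_minimal`.

```python
import importlib.util

# Assumption: these are the heavy packages the full app.py depends on.
FULL_VERSION_MODULES = ("transformers", "torch", "nltk", "datasets")


def is_installed(module_name: str) -> bool:
    """Return True if the top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


def choose_app_module(required=FULL_VERSION_MODULES, available=is_installed) -> str:
    """Pick 'app' when every heavy dependency is present, else 'app_minimal'."""
    if all(available(name) for name in required):
        return "app"
    return "app_minimal"


if __name__ == "__main__":
    print(f"Selected module: {choose_app_module()}")
```

Using `find_spec` rather than a real import keeps the check fast, since torch and transformers are not loaded just to decide which version to start.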

🔍 Common Issues & Solutions

Issue 1: Missing Dependencies

ModuleNotFoundError: No module named 'transformers'
ModuleNotFoundError: No module named 'bs4'

✅ SOLUTIONS:

A. Minimal Installation (Fastest)

pip install gradio pandas requests beautifulsoup4
# Use app_minimal.py

B. Full Installation

pip install gradio pandas requests beautifulsoup4 transformers torch nltk datasets
# Use app.py (full version)

C. Update requirements.txt

gradio>=4.44.0
pandas>=2.0.0
requests>=2.31.0
beautifulsoup4>=4.12.0
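
To see whether the installed packages meet these minimums, a small check along the following lines may help. It is a sketch only: the version comparison is deliberately simplified (numeric major.minor.patch, ignoring pre-release tags), and `parse_version` and `check_requirement` are hypothetical helper names.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions copied from the requirements.txt listing above.
MINIMUMS = {
    "gradio": "4.44.0",
    "pandas": "2.0.0",
    "requests": "2.31.0",
    "beautifulsoup4": "4.12.0",
}


def parse_version(text: str, width: int = 3) -> tuple:
    """Turn '2.31.0' into (2, 31, 0); stop at the first non-numeric part, pad with zeros."""
    parts = []
    for piece in text.split(".")[:width]:
        if not piece.isdigit():
            break
        parts.append(int(piece))
    parts += [0] * (width - len(parts))
    return tuple(parts)


def check_requirement(package: str, minimum: str) -> str:
    """Return 'ok', 'too old', or 'missing' for one package."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        return "missing"
    return "ok" if parse_version(installed) >= parse_version(minimum) else "too old"


if __name__ == "__main__":
    for package, minimum in MINIMUMS.items():
        print(f"{package}>={minimum}: {check_requirement(package, minimum)}")
```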

Issue 2: Slow Loading

Application taking too long to start
Models downloading...

✅ SOLUTIONS:

  • Use CPU Basic hardware initially (loads faster)
  • Try minimal version first (no AI model downloads)
  • Upgrade to T4 Small for faster AI model loading
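
If the slow start comes from models loading at import time, one common mitigation is to defer the load until the first request. The sketch below illustrates the pattern with a stand-in loader; `load_model` and `get_model` are hypothetical names, and how the real app.py loads its models is not shown in this guide.

```python
import functools
import time


def load_model():
    """Placeholder for an expensive load, e.g. building a transformers pipeline.

    Hypothetical stand-in: replace the body with the real loading code.
    """
    time.sleep(0.1)  # simulate a slow download or initialization
    return {"name": "demo-model"}


@functools.lru_cache(maxsize=1)
def get_model():
    """Load the model on first use and return the same object afterwards."""
    return load_model()
```

Calling `get_model()` inside a request handler, rather than at module level, lets the interface come up before any model is downloaded; only the first AI request pays the loading cost.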

Issue 3: Memory Issues

CUDA out of memory
Application crashed

✅ SOLUTIONS:

  • Start with CPU Basic (free, lower memory)
  • Use minimal version (smaller memory footprint)
  • Upgrade gradually (CPU → T4 → A10G as needed)
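
When moving between CPU and GPU hardware, it helps if the code picks its device at runtime instead of assuming CUDA. A minimal sketch, assuming PyTorch is the backend (`pick_device` is a hypothetical helper):

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"  # minimal install without torch: stay on CPU
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Using device: {pick_device()}")
```

With this in place the same app.py can run on CPU Basic and on a T4 without edits.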

Issue 4: Import Errors

Failed to import DatasetStudio
Module not found errors

✅ SOLUTIONS:

  • Replace app.py with the complete version
  • Add all missing files from the checklist
  • Clear browser cache and refresh

🏥 Emergency Fixes

Nuclear Option: Start Completely Fresh

  1. Create new Space
  2. Use minimal files only:
    - app_minimal.py (rename to app.py)
    - requirements.txt (basic only)
    - README.md
    
  3. Set hardware to CPU Basic
  4. Test basic functionality first
  5. Gradually add features

Quick Test Commands

# Test basic imports
python -c "import gradio, pandas, requests; print('✅ Basic imports work')"

# Test BeautifulSoup
python -c "from bs4 import BeautifulSoup; print('✅ BeautifulSoup works')"

# Test full app (if using complete version)
python -c "from app import DatasetStudio; print('✅ DatasetStudio works')"

📊 Version Comparison

| Feature | Minimal | Full | Smart |
|---|---|---|---|
| Dependencies | 4 packages | 8+ packages | Auto-detect |
| Startup Time | 30 seconds | 2-5 minutes | Variable |
| Web Scraping | ✅ Basic | ✅ Advanced | ✅ Auto |
| AI Features | ❌ None | ✅ All | ✅ If available |
| Export Formats | JSON, CSV | All formats | Auto |
| Memory Usage | ~100MB | ~2GB | Variable |
| Reliability | 🟢 High | 🟡 Medium | 🟢 High |

🎯 Deployment Strategy

Step 1: Start Simple

Files: app_minimal.py → app.py, requirements.txt (minimal)
Hardware: CPU Basic
Goal: Verify basic functionality

Step 2: Add Features

Files: Add complete app.py, config.py, utils.py
Hardware: CPU Upgrade
Goal: Test advanced features

Step 3: Full Power

Files: All files
Hardware: T4 Small or higher
Goal: Production deployment

🔄 Troubleshooting Workflow

1. 🚨 ERROR OCCURS
   ↓
2. 🔍 CHECK THIS GUIDE
   ↓
3. 🛠️ APPLY QUICK FIX
   ↓
4. 🧪 TEST SOLUTION
   ↓
5. ✅ SUCCESS OR ⬆️ ESCALATE

Escalation Path:

  1. Try minimal version → app_minimal.py
  2. Check dependencies → Install missing packages
  3. Review logs → Look for specific errors
  4. Contact support → Provide error details

💡 Pro Tips

Development Best Practices

  • ✅ Start minimal, add complexity gradually
  • ✅ Test locally before deploying
  • ✅ Use version control for file management
  • ✅ Monitor Space logs for errors

Performance Optimization

  • ✅ CPU Basic for development/testing
  • ✅ T4 Small for production
  • ✅ Enable persistent storage for large datasets
  • ✅ Use minimal version when possible

Reliability Tips

  • ✅ Always have a fallback version ready
  • ✅ Test with sample URLs before large batches
  • ✅ Monitor Space analytics for usage patterns
  • ✅ Keep dependencies up to date
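
The tip about testing sample URLs before a large batch could look roughly like this. It is a sketch with hypothetical names: `sample_check` is not part of the project, and the `fetch` argument is any callable that raises on failure, so the helper stays independent of the HTTP library.

```python
import random


def sample_check(urls, fetch, sample_size=3, seed=0):
    """Fetch a small random sample of URLs and report which ones failed.

    `fetch` takes a URL and raises an exception when the request fails.
    Returns the sampled URLs and a dict mapping failed URLs to error text.
    """
    rng = random.Random(seed)
    url_list = list(urls)
    sample = rng.sample(url_list, min(sample_size, len(url_list)))
    failures = {}
    for url in sample:
        try:
            fetch(url)
        except Exception as exc:  # broad on purpose: any failure should be reported
            failures[url] = str(exc)
    return sample, failures
```

With requests installed, `fetch` could be `lambda url: requests.get(url, timeout=10).raise_for_status()`. If most of the sample fails, fix the scraper settings before starting the full batch.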

🆘 Getting Help

Information to Include When Asking for Help:

1. Exact error message
2. Files you're using (app.py vs app_minimal.py)
3. Hardware type (CPU Basic, T4 Small, etc.)
4. Dependencies installed
5. Space logs (if available)

Quick Health Check Script:

import sys
print(f"Python: {sys.version}")

# Check the core UI dependency
try:
    import gradio
    print(f"✅ Gradio: {gradio.__version__}")
except ImportError:
    print("❌ Gradio not available")

# Check the HTML parser used for scraping
try:
    from bs4 import BeautifulSoup
    print("✅ BeautifulSoup available")
except ImportError:
    print("❌ BeautifulSoup not available")

# Importing app runs its top-level code, so catch any exception:
# a NameError raised inside app.py is not an ImportError
try:
    from app import DatasetStudio
    print("✅ DatasetStudio available")
except Exception as e:
    print(f"❌ DatasetStudio error: {type(e).__name__}: {e}")

🎉 Success Indicators

You'll know everything is working when you see:

🚀 Starting AI Dataset Studio...
📊 Features: ✅ AI Models | ✅ Advanced NLP | ✅ HuggingFace Integration
✅ DatasetStudio initialized successfully
✅ Interface created successfully
Running on local URL: http://0.0.0.0:7860

If you see this, you're ready to create amazing datasets! 🎯


📞 Support Channels

  • 📖 Documentation: README.md in your Space
  • 💬 Community: HuggingFace Discussions
  • 🐛 Bug Reports: Include logs and error details
  • 📧 Direct Help: Describe your setup and error

Remember: Every issue has a solution - start with the minimal version and build up! 💪