# 🔧 AI Dataset Studio - Complete Troubleshooting Guide

## 🚨 **Immediate Fix for Current Error**

### **Error: "DatasetStudio is not defined"**

```
NameError: name 'DatasetStudio' is not defined
```

✅ **SOLUTION:** Replace your current `app.py` with the **complete fixed version** I provided above.

**Quick Fix Steps:**

1. **Replace app.py** - Use the complete version from the artifacts above
2. **Add missing files** - Download all the files I've provided
3. **Restart your Space** - The error will be resolved

---

## 📁 **Files You Need (Complete Checklist)**

| File | Status | Purpose |
|------|--------|---------|
| ✅ `app.py` | **Replace yours** | Main application (complete version) |
| ❌ `app_minimal.py` | **Missing** | Fallback version (basic deps only) |
| ✅ `requirements.txt` | **Have it** | Dependencies |
| ✅ `README.md` | **Have it** | Documentation |
| ✅ `config.py` | **Have it** | Configuration |
| ❌ `utils.py` | **Incomplete** | Utility functions |
| ❌ `startup.py` | **Missing** | Smart launcher |
| ❌ `TROUBLESHOOTING.md` | **Missing** | This guide |

---

## 🚀 **Quick Deployment Options**

### **Option 1: Immediate Fix (Recommended)**

```bash
# Use the complete app.py I provided above
# This fixes the DatasetStudio error immediately
```

### **Option 2: Minimal Version (Guaranteed to Work)**

```bash
# Use app_minimal.py as your main app.py
# This version works with basic dependencies only
```

### **Option 3: Smart Startup (Auto-Detect)**

```bash
# Use startup.py as your main app.py
# Automatically chooses the best version to run
```

---

## 🔍 **Common Issues & Solutions**

### **Issue 1: Missing Dependencies**

```
ModuleNotFoundError: No module named 'transformers'
ModuleNotFoundError: No module named 'bs4'
```

✅ **SOLUTIONS:**

#### **A. Minimal Installation (Fastest)**

```bash
pip install gradio pandas requests beautifulsoup4
# Use app_minimal.py
```

#### **B. Full Installation**

```bash
pip install gradio pandas requests beautifulsoup4 transformers torch nltk datasets
# Use app.py (full version)
```

#### **C. Update requirements.txt**

```txt
gradio>=4.44.0
pandas>=2.0.0
requests>=2.31.0
beautifulsoup4>=4.12.0
```

---

### **Issue 2: Slow Loading**

```
Application taking too long to start
Models downloading...
```

✅ **SOLUTIONS:**

- **Use CPU Basic hardware initially** (loads faster)
- **Try minimal version first** (no AI model downloads)
- **Upgrade to T4 Small** for faster AI model loading

---

### **Issue 3: Memory Issues**

```
CUDA out of memory
Application crashed
```

✅ **SOLUTIONS:**

- **Start with CPU Basic** (free, lower memory)
- **Use minimal version** (smaller memory footprint)
- **Upgrade gradually** (CPU → T4 → A10G as needed)

---

### **Issue 4: Import Errors**

```
Failed to import DatasetStudio
Module not found errors
```

✅ **SOLUTIONS:**

- **Replace app.py** with the complete version above
- **Add all missing files** from the artifacts
- **Clear browser cache** and refresh

---

## 🏥 **Emergency Fixes**

### **Nuclear Option: Start Completely Fresh**

1. **Create new Space**
2. **Use minimal files only:**
   ```
   - app_minimal.py (rename to app.py)
   - requirements.txt (basic only)
   - README.md
   ```
3. **Set hardware to CPU Basic**
4. **Test basic functionality first**
5. **Gradually add features**

### **Quick Test Commands**

```bash
# Test basic imports
python -c "import gradio, pandas, requests; print('✅ Basic imports work')"

# Test BeautifulSoup
python -c "from bs4 import BeautifulSoup; print('✅ BeautifulSoup works')"

# Test full app (if using complete version)
python -c "from app import DatasetStudio; print('✅ DatasetStudio works')"
```

---

## 📊 **Version Comparison**

| Feature | Minimal | Full | Smart |
|---------|---------|------|-------|
| **Dependencies** | 4 packages | 8+ packages | Auto-detect |
| **Startup Time** | 30 seconds | 2-5 minutes | Variable |
| **Web Scraping** | ✅ Basic | ✅ Advanced | ✅ Auto |
| **AI Features** | ❌ None | ✅ All | ✅ If available |
| **Export Formats** | JSON, CSV | All formats | Auto |
| **Memory Usage** | ~100MB | ~2GB | Variable |
| **Reliability** | 🟢 High | 🟡 Medium | 🟢 High |

---

## 🎯 **Deployment Strategy**

### **Step 1: Start Simple**

```yaml
Files: app_minimal.py → app.py, requirements.txt (minimal)
Hardware: CPU Basic
Goal: Verify basic functionality
```

### **Step 2: Add Features**

```yaml
Files: Add complete app.py, config.py, utils.py
Hardware: CPU Upgrade
Goal: Test advanced features
```

### **Step 3: Full Power**

```yaml
Files: All files
Hardware: T4 Small or higher
Goal: Production deployment
```

---

## 🔄 **Troubleshooting Workflow**

```
1. 🚨 ERROR OCCURS
   ↓
2. 🔍 CHECK THIS GUIDE
   ↓
3. 🛠️ APPLY QUICK FIX
   ↓
4. 🧪 TEST SOLUTION
   ↓
5. ✅ SUCCESS OR ⬆️ ESCALATE
```

### **Escalation Path:**

1. **Try minimal version** → `app_minimal.py`
2. **Check dependencies** → Install missing packages
3. **Review logs** → Look for specific errors
4. **Contact support** → Provide error details

---

## 💡 **Pro Tips**

### **Development Best Practices**

- ✅ **Start minimal, add complexity gradually**
- ✅ **Test locally before deploying**
- ✅ **Use version control for file management**
- ✅ **Monitor Space logs for errors**

### **Performance Optimization**

- ✅ **CPU Basic for development/testing**
- ✅ **T4 Small for production**
- ✅ **Enable persistent storage for large datasets**
- ✅ **Use minimal version when possible**

### **Reliability Tips**

- ✅ **Always have a fallback version ready**
- ✅ **Test with sample URLs before large batches**
- ✅ **Monitor Space analytics for usage patterns**
- ✅ **Keep dependencies up to date**

---

## 🆘 **Getting Help**

### **Information to Include When Asking for Help:**

```
1. Exact error message
2. Files you're using (app.py vs app_minimal.py)
3. Hardware type (CPU Basic, T4 Small, etc.)
4. Dependencies installed
5. Space logs (if available)
```

### **Quick Health Check Script:**

```python
import sys

print(f"Python: {sys.version}")

try:
    import gradio
    print(f"✅ Gradio: {gradio.__version__}")
except ImportError:
    print("❌ Gradio not available")

try:
    from bs4 import BeautifulSoup
    print("✅ BeautifulSoup available")
except ImportError:
    print("❌ BeautifulSoup not available")

try:
    from app import DatasetStudio
    print("✅ DatasetStudio available")
except ImportError as e:
    print(f"❌ DatasetStudio error: {e}")
```

---

## 🎉 **Success Indicators**

You'll know everything is working when you see:

```
🚀 Starting AI Dataset Studio...
📊 Features: ✅ AI Models | ✅ Advanced NLP | ✅ HuggingFace Integration
✅ DatasetStudio initialized successfully
✅ Interface created successfully
Running on local URL: http://0.0.0.0:7860
```

**If you see this, you're ready to create amazing datasets!** 🎯

---

## 📞 **Support Channels**

- 📖 **Documentation**: README.md in your Space
- 💬 **Community**: HuggingFace Discussions
- 🐛 **Bug Reports**: Include logs and error details
- 📧 **Direct Help**: Describe your setup and error

**Remember: Every issue has a solution - start with the minimal version and build up!** 💪
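
**Appendix: sketch of the auto-detect idea.** Option 3 describes a `startup.py` that chooses the best version to run. The real `startup.py` is not shown in this guide, so the following is only a minimal sketch of how such a check could work, assuming the module lists match the install commands under Issue 1. The function names `missing_modules` and `choose_version` and the "none / minimal / full" policy are hypothetical, introduced here for illustration.

```python
import importlib.util

# Module names derived from the pip commands in this guide.
# beautifulsoup4 is imported as "bs4".
BASIC_MODULES = ["gradio", "pandas", "requests", "bs4"]
FULL_EXTRA_MODULES = ["transformers", "torch", "nltk", "datasets"]


def missing_modules(names):
    """Return the top-level module names that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def choose_version(basic_missing, extra_missing):
    """Hypothetical policy: pick 'full', 'minimal', or 'none'.

    'none' means even the basic dependencies are incomplete.
    """
    if basic_missing:
        return "none"
    if extra_missing:
        return "minimal"
    return "full"


if __name__ == "__main__":
    basic_missing = missing_modules(BASIC_MODULES)
    extra_missing = missing_modules(FULL_EXTRA_MODULES)
    choice = choose_version(basic_missing, extra_missing)

    if choice == "none":
        print(f"Install the basic packages first; missing: {basic_missing}")
    elif choice == "minimal":
        print(f"Use app_minimal.py; missing for the full version: {extra_missing}")
    else:
        print("All dependencies found; use the full app.py")
```

Checking with `importlib.util.find_spec` avoids actually importing heavy packages such as `torch` just to see whether they are installed, which keeps the check fast on CPU Basic hardware.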