# AI Dataset Studio - Complete Troubleshooting Guide

## **Immediate Fix for Current Error**

### **Error: "DatasetStudio is not defined"**
```
NameError: name 'DatasetStudio' is not defined
```

✅ **SOLUTION:** Replace your current `app.py` with the **complete fixed version** supplied with this guide.

**Quick Fix Steps:**
1. **Replace app.py** - Use the complete version supplied with this guide
2. **Add missing files** - Add every file marked missing or incomplete in the checklist below
3. **Restart your Space** - The error should be resolved once the new `app.py` loads

---

## **Files You Need (Complete Checklist)**

| File | Status | Purpose |
|------|--------|---------|
| ✅ `app.py` | **Replace yours** | Main application (complete version) |
| ❌ `app_minimal.py` | **Missing** | Fallback version (basic deps only) |
| ✅ `requirements.txt` | **Have it** | Dependencies |
| ✅ `README.md` | **Have it** | Documentation |
| ✅ `config.py` | **Have it** | Configuration |
| ❌ `utils.py` | **Incomplete** | Utility functions |
| ❌ `startup.py` | **Missing** | Smart launcher |
| ❌ `TROUBLESHOOTING.md` | **Missing** | This guide |

---

## **Quick Deployment Options**

### **Option 1: Immediate Fix (Recommended)**
```bash
# Use the complete app.py supplied with this guide
# This fixes the DatasetStudio error immediately
```

### **Option 2: Minimal Version (Guaranteed to Work)**
```bash
# Use app_minimal.py as your main app.py
# This version works with basic dependencies only
```

### **Option 3: Smart Startup (Auto-Detect)**
```bash
# Use startup.py as your main app.py
# Automatically chooses the best version to run
```
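
To make the auto-detect idea in Option 3 more concrete, here is a minimal sketch of how a launcher might decide between the full and minimal versions. It is not the actual contents of `startup.py`; the names `CORE_DEPS`, `AI_DEPS`, `missing_modules` and `choose_app` are hypothetical, and the package lists are assumptions based on the install commands in this guide.

```python
import importlib.util

# Hypothetical dependency lists; adjust them to match your own requirements.txt.
CORE_DEPS = ("gradio", "pandas", "requests", "bs4")
AI_DEPS = ("transformers", "torch", "nltk", "datasets")


def missing_modules(names, find_spec=importlib.util.find_spec):
    """Return the names from `names` that cannot be located on this system."""
    return [name for name in names if find_spec(name) is None]


def choose_app(find_spec=importlib.util.find_spec):
    """Pick which module to launch: 'app' (full) or 'app_minimal' (fallback)."""
    missing_core = missing_modules(CORE_DEPS, find_spec)
    if missing_core:
        raise RuntimeError(f"Missing core packages: {', '.join(missing_core)}")
    # The full version only makes sense when every AI package is installed.
    return "app_minimal" if missing_modules(AI_DEPS, find_spec) else "app"
```

A launcher could then call `importlib.import_module(choose_app())`; whether that import alone starts the interface depends on how each module is written.
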

---

## **Common Issues & Solutions**

### **Issue 1: Missing Dependencies**
```
ModuleNotFoundError: No module named 'transformers'
ModuleNotFoundError: No module named 'bs4'
```

✅ **SOLUTIONS:**

#### **A. Minimal Installation (Fastest)**
```bash
pip install gradio pandas requests beautifulsoup4
# Use app_minimal.py
```

#### **B. Full Installation**
```bash
pip install gradio pandas requests beautifulsoup4 transformers torch nltk datasets
# Use app.py (full version)
```

#### **C. Update requirements.txt**
```txt
gradio>=4.44.0
pandas>=2.0.0
requests>=2.31.0
beautifulsoup4>=4.12.0
```
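
If it is unclear whether the installed packages satisfy those minimums, a short script can report what is missing or too old. The sketch below uses only the standard library; `MINIMUMS`, `as_tuple` and `check` are illustrative names, and the version comparison is deliberately simplified (it looks at the first three numeric components only, so `packaging.version` is the more robust choice if that package is available).

```python
from importlib import metadata

# Minimum versions copied from the requirements.txt example in this guide.
MINIMUMS = {
    "gradio": "4.44.0",
    "pandas": "2.0.0",
    "requests": "2.31.0",
    "beautifulsoup4": "4.12.0",
}


def as_tuple(version):
    """Turn '4.44.0' into (4, 44, 0); stops at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    # Pad short versions such as '2.1' so they compare as (2, 1, 0).
    return tuple(parts + [0] * (3 - len(parts)))


def check(minimums=MINIMUMS, get_version=metadata.version):
    """Return {package: problem} for anything missing or older than required."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if as_tuple(installed) < as_tuple(minimum):
            problems[name] = f"{installed} < {minimum}"
    return problems
```

Running `print(check())` prints an empty dictionary when every listed package meets its minimum.
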

---

### **Issue 2: Slow Loading**
```
Application taking too long to start
Models downloading...
```

✅ **SOLUTIONS:**
- **Use CPU Basic hardware initially** (loads faster)
- **Try minimal version first** (no AI model downloads)
- **Upgrade to T4 Small** for faster AI model loading
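
Since model downloads are what slow startup down, one common pattern (not necessarily what the full `app.py` does) is to defer loading until a model is first needed and then reuse it. The sketch below illustrates that with `functools.lru_cache`; `load_model` is a hypothetical stand-in for whatever expensive loader the application uses.

```python
from functools import lru_cache


def load_model(name):
    """Hypothetical stand-in for an expensive loader (e.g. a model download)."""
    print(f"Loading {name}...")
    return {"name": name}


@lru_cache(maxsize=None)
def get_model(name):
    """Load a model on first request, then reuse the cached instance."""
    return load_model(name)
```

With this arrangement the interface can start before any model is loaded, and the loading cost is paid once, on the first call to `get_model` for a given name.
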

---

### **Issue 3: Memory Issues**
```
CUDA out of memory
Application crashed
```

✅ **SOLUTIONS:**
- **Start with CPU Basic** (free, lower memory)
- **Use minimal version** (smaller memory footprint)
- **Upgrade gradually** (CPU → T4 → A10G as needed)
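
In code, the same "start on CPU, use the GPU only when it is there" approach might look like the sketch below. It assumes PyTorch is the backend; `pick_device` and `run_with_fallback` are hypothetical helpers, and `task` stands for any callable of yours that accepts a device string.

```python
def pick_device():
    """Return 'cuda' when PyTorch can see a GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        return "cpu"  # torch is not installed (e.g. the minimal setup)
    return "cuda" if torch.cuda.is_available() else "cpu"


def run_with_fallback(task, device):
    """Run task(device); on a GPU out-of-memory error, retry once on CPU."""
    try:
        return task(device)
    except RuntimeError as err:
        # PyTorch reports GPU memory exhaustion as a RuntimeError subclass.
        if device != "cpu" and "out of memory" in str(err).lower():
            return task("cpu")
        raise
```

Retrying on CPU trades speed for stability, so it suits occasional oversized inputs better than a model that never fits on the GPU.
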

---

### **Issue 4: Import Errors**
```
Failed to import DatasetStudio
Module not found errors
```

✅ **SOLUTIONS:**
- **Replace app.py** with the complete version supplied with this guide
- **Add all missing files** listed in the Files You Need checklist
- **Clear browser cache** and refresh

---

## **Emergency Fixes**

### **Nuclear Option: Start Completely Fresh**

1. **Create new Space**
2. **Use minimal files only:**
   ```
   - app_minimal.py (rename to app.py)
   - requirements.txt (basic only)
   - README.md
   ```
3. **Set hardware to CPU Basic**
4. **Test basic functionality first**
5. **Gradually add features**

### **Quick Test Commands**
```bash
# Test basic imports
python -c "import gradio, pandas, requests; print('✅ Basic imports work')"

# Test BeautifulSoup
python -c "from bs4 import BeautifulSoup; print('✅ BeautifulSoup works')"

# Test full app (if using complete version)
python -c "from app import DatasetStudio; print('✅ DatasetStudio works')"
```

---

## **Version Comparison**

| Feature | Minimal | Full | Smart |
|---------|---------|------|-------|
| **Dependencies** | 4 packages | 8+ packages | Auto-detect |
| **Startup Time** | 30 seconds | 2-5 minutes | Variable |
| **Web Scraping** | ✅ Basic | ✅ Advanced | ✅ Auto |
| **AI Features** | ❌ None | ✅ All | ✅ If available |
| **Export Formats** | JSON, CSV | All formats | Auto |
| **Memory Usage** | ~100MB | ~2GB | Variable |
| **Reliability** | 🟢 High | 🟡 Medium | 🟢 High |

---

## **Deployment Strategy**

### **Step 1: Start Simple**
```yaml
Files: app_minimal.py → app.py, requirements.txt (minimal)
Hardware: CPU Basic
Goal: Verify basic functionality
```

### **Step 2: Add Features**
```yaml
Files: Add complete app.py, config.py, utils.py
Hardware: CPU Upgrade
Goal: Test advanced features
```

### **Step 3: Full Power**
```yaml
Files: All files
Hardware: T4 Small or higher
Goal: Production deployment
```

---

## **Troubleshooting Workflow**

```
1. ERROR OCCURS
   ↓
2. CHECK THIS GUIDE
   ↓
3. APPLY QUICK FIX
   ↓
4. TEST SOLUTION
   ↓
5. SUCCESS OR ESCALATE
```

### **Escalation Path:**
1. **Try minimal version** → `app_minimal.py`
2. **Check dependencies** → Install missing packages
3. **Review logs** → Look for specific errors
4. **Contact support** → Provide error details
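
For the "review logs" step, a small helper can match log text against the error messages quoted in this guide and point to the relevant section. This is only a sketch: `KNOWN_ERRORS` and `triage` are hypothetical names, and the patterns cover just the three errors documented here.

```python
import re

# Patterns taken from the error messages quoted in this guide.
KNOWN_ERRORS = [
    (re.compile(r"name 'DatasetStudio' is not defined"),
     "Immediate Fix: replace app.py with the complete version"),
    (re.compile(r"No module named '[\w.]+'"),
     "Issue 1: install the missing package"),
    (re.compile(r"CUDA out of memory", re.IGNORECASE),
     "Issue 3: reduce memory use or change hardware"),
]


def triage(log_text):
    """Return the suggestions whose pattern appears anywhere in the log text."""
    return [hint for pattern, hint in KNOWN_ERRORS if pattern.search(log_text)]
```

An empty result means none of the documented errors were found, which is a reasonable point to move on to the last escalation step.
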

---

## **Pro Tips**

### **Development Best Practices**
- ✅ **Start minimal, add complexity gradually**
- ✅ **Test locally before deploying**
- ✅ **Use version control for file management**
- ✅ **Monitor Space logs for errors**

### **Performance Optimization**
- ✅ **CPU Basic for development/testing**
- ✅ **T4 Small for production**
- ✅ **Enable persistent storage for large datasets**
- ✅ **Use minimal version when possible**

### **Reliability Tips**
- ✅ **Always have a fallback version ready**
- ✅ **Test with sample URLs before large batches**
- ✅ **Monitor Space analytics for usage patterns**
- ✅ **Keep dependencies up to date**

---

## **Getting Help**

### **Information to Include When Asking for Help:**
```
1. Exact error message
2. Files you're using (app.py vs app_minimal.py)
3. Hardware type (CPU Basic, T4 Small, etc.)
4. Dependencies installed
5. Space logs (if available)
```

### **Quick Health Check Script:**
```python
import sys
print(f"Python: {sys.version}")

# Each check is independent, so one failure does not hide the others.
try:
    import gradio
    print(f"✅ Gradio: {gradio.__version__}")
except ImportError:
    print("❌ Gradio not available")

try:
    from bs4 import BeautifulSoup
    print("✅ BeautifulSoup available")
except ImportError:
    print("❌ BeautifulSoup not available")

try:
    from app import DatasetStudio
    print("✅ DatasetStudio available")
except Exception as e:  # importing app.py can fail with more than ImportError (e.g. NameError)
    print(f"❌ DatasetStudio error: {e}")
```

---

## **Success Indicators**

You'll know everything is working when you see:

```
Starting AI Dataset Studio...
Features: ✅ AI Models | ✅ Advanced NLP | ✅ HuggingFace Integration
✅ DatasetStudio initialized successfully
✅ Interface created successfully
Running on local URL: http://0.0.0.0:7860
```

**If you see this, you're ready to create amazing datasets!** 🎯

---

## **Support Channels**

- **Documentation**: README.md in your Space
- **Community**: HuggingFace Discussions
- **Bug Reports**: Include logs and error details
- **Direct Help**: Describe your setup and error

**Remember: Every issue has a solution - start with the minimal version and build up!** 💪