# 🔧 HuggingFace Spaces Configuration Guide

**Essential configuration options for your AI Dataset Studio Space**

---

## 📋 **Required README.md Header**

Every HuggingFace Space **must** have this YAML frontmatter at the very beginning of README.md:

### **Basic Configuration (Recommended)**
```yaml
---
title: AI Dataset Studio
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
---
```

### **Alternative Configurations**

#### **Professional/Business Version**
```yaml
---
title: Enterprise Dataset Studio
emoji: 🏢
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: true
license: mit
tags:
  - machine-learning
  - datasets
  - nlp
  - data-science
  - perplexity-ai
---
```

#### **Research/Academic Version**
```yaml
---
title: Research Dataset Creator
emoji: 🎓
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
license: apache-2.0
tags:
  - research
  - academic
  - datasets
  - nlp
  - ai
---
```

#### **Creative/Colorful Version**
```yaml
---
title: AI Dataset Magic ✨
emoji: 🎨
colorFrom: pink
colorTo: purple
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
tags:
  - datasets
  - creative
  - ai-tools
  - machine-learning
---
```

---

## 🎨 **Configuration Options Explained**

### **Required Fields**

| Field | Description | Example Values |
|-------|-------------|----------------|
| `title` | Space name displayed in UI | `AI Dataset Studio` |
| `emoji` | Icon shown next to title | `🚀`, `🤖`, `📊`, `🎯` |
| `colorFrom` | Gradient start color | One of `red`, `yellow`, `green`, `blue`, `indigo`, `purple`, `pink`, `gray` |
| `colorTo` | Gradient end color | Same eight values as `colorFrom` |
| `sdk` | Framework used | `gradio` (for our app) |
| `sdk_version` | SDK version | `"4.44.0"` |
| `app_file` | Main application file | `app.py` |

### **Optional Fields**

| Field | Description | Example Values |
|-------|-------------|----------------|
| `pinned` | Pin to your profile | `true`, `false` |
| `license` | Software license | `mit`, `apache-2.0`, `gpl-3.0` |
| `tags` | Searchable keywords | `machine-learning`, `nlp`, `datasets` |
| `models` | Referenced models | `facebook/bart-large-cnn` |
| `datasets` | Referenced datasets | `imdb`, `sentiment140` |

---

## 🎯 **Popular Color Combinations**

### **Professional Themes**
```yaml
# Corporate Blue
colorFrom: blue
colorTo: indigo

# Business Gray
colorFrom: gray
colorTo: blue

# Tech Green (teal is not an accepted color, so blue stands in)
colorFrom: green
colorTo: blue
```

### **Creative Themes**
```yaml
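# Sunset (orange is not an accepted color, so yellow stands in)
colorFrom: yellow
colorTo: red

# Ocean (cyan is not an accepted color, so indigo stands in)
colorFrom: blue
colorTo: indigo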

# Forest
colorFrom: green
colorTo: yellow

# Galaxy
colorFrom: purple
colorTo: pink
```

### **AI/Tech Themes**
```yaml
# Matrix (black is not an accepted color, so gray stands in)
colorFrom: green
colorTo: gray

# Cyberpunk
colorFrom: purple
colorTo: blue

# Neural
colorFrom: blue
colorTo: purple
```
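The thumbnail gradient accepts only a fixed set of color names, so a typo or an unsupported color is worth catching before you push. The snippet below is a minimal sketch of such a check; the function name `invalid_colors` is hypothetical and not part of any HuggingFace library.

```python
# Color names accepted for colorFrom / colorTo in a Space's frontmatter.
ALLOWED_COLORS = {"red", "yellow", "green", "blue", "indigo", "purple", "pink", "gray"}


def invalid_colors(color_from, color_to):
    """Return the values (in order) that are not accepted gradient colors."""
    return [color for color in (color_from, color_to) if color not in ALLOWED_COLORS]


if __name__ == "__main__":
    # An empty list means both colors are fine.
    print(invalid_colors("blue", "purple"))
    print(invalid_colors("green", "teal"))
```

An empty result means the pair can be used as-is; any returned name should be swapped for one of the eight allowed values.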

---

## 🏷️ **Recommended Tags**

### **For AI Dataset Studio**
```yaml
tags:
  - machine-learning
  - datasets
  - nlp
  - data-science
  - perplexity-ai
  - web-scraping
  - sentiment-analysis
  - text-classification
  - ai-tools
  - data-collection
```

### **By Use Case**

#### **Business/Enterprise**
```yaml
tags:
  - business-intelligence
  - enterprise
  - data-analytics
  - market-research
  - customer-insights
```

#### **Research/Academic**
```yaml
tags:
  - research
  - academic
  - scientific
  - literature-review
  - research-tools
```

#### **Developer Tools**
```yaml
tags:
  - developer-tools
  - api
  - automation
  - productivity
  - data-engineering
```

---

## 📊 **Hardware Configuration**

Hardware is chosen in the Space settings rather than in the README.md frontmatter:

### **Hardware Options**
```yaml
# In Space settings (not README.md).
# Prices change over time; confirm them in the Space's hardware settings.
# - CPU Basic (free)
# - CPU Upgrade ($0.03/hour)
# - T4 Small ($0.60/hour) ← Recommended
# - T4 Medium ($1.20/hour)
# - A10G Small ($1.05/hour)
# - A10G Large ($3.15/hour)
```

### **Memory Requirements**
```yaml
# Our application needs:
# - Base app: ~200MB
# - AI models: ~2-4GB
# - Processing: ~1-2GB
# Total: ~4-6GB recommended (T4 Small = 16GB)
```

---

## 🔐 **Environment Variables**

Set these in Space Settings → Repository secrets:

### **Required**
```bash
PERPLEXITY_API_KEY="your_perplexity_api_key_here"
```

### **Optional**
```bash
# HuggingFace integration
HF_TOKEN="your_huggingface_token"

# Performance tuning
MAX_SOURCES_PER_SEARCH="50"
REQUEST_TIMEOUT="30"
LOG_LEVEL="INFO"

# Feature flags
ENABLE_DEBUG_MODE="false"
ENABLE_CACHING="true"
```
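Secrets and variables set on a Space reach the running app as ordinary environment variables. The sketch below shows one way `app.py` could read the values listed above; the function name `load_settings`, the dictionary keys, and the fallback defaults are assumptions for illustration, not the app's actual code.

```python
import os


def load_settings(env=None):
    """Read the Space's settings from environment variables.

    `env` defaults to os.environ; passing a plain dict makes testing easier.
    """
    env = os.environ if env is None else env

    api_key = env.get("PERPLEXITY_API_KEY")
    if not api_key:
        # Fail early with a clear message instead of failing on the first API call.
        raise RuntimeError("PERPLEXITY_API_KEY is not set in the Space secrets")

    return {
        "perplexity_api_key": api_key,
        "hf_token": env.get("HF_TOKEN"),  # optional, None when unset
        "max_sources_per_search": int(env.get("MAX_SOURCES_PER_SEARCH", "50")),
        "request_timeout": int(env.get("REQUEST_TIMEOUT", "30")),
        "log_level": env.get("LOG_LEVEL", "INFO").upper(),
        # Feature flags arrive as strings, so compare against "true".
        "enable_debug_mode": env.get("ENABLE_DEBUG_MODE", "false").lower() == "true",
        "enable_caching": env.get("ENABLE_CACHING", "true").lower() == "true",
    }
```

Calling `load_settings()` once at startup keeps the parsing in one place and surfaces a missing API key in the build logs right away.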

---

## ✅ **Validation Checklist**

Before deploying, ensure:

- [ ] ✅ YAML frontmatter is at the very beginning of README.md
- [ ] ✅ No spaces before the opening `---`
- [ ] ✅ Proper YAML syntax (quotes around version numbers)
- [ ] ✅ `app_file: app.py` matches your main file name
- [ ] ✅ SDK version matches your requirements.txt
- [ ] ✅ Title and emoji are appropriate for your audience
- [ ] ✅ Tags are relevant and searchable
- [ ] ✅ PERPLEXITY_API_KEY is set in Space secrets
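Several of the checks above can be automated before you push. The following is a minimal sketch using only the standard library; it does simple line matching rather than full YAML parsing, and the function name `check_frontmatter` and the `REQUIRED_KEYS` list (taken from the fields this guide marks as required) are assumptions for illustration.

```python
import re
from pathlib import Path

# Fields this guide lists as required in the frontmatter.
REQUIRED_KEYS = ("title", "emoji", "colorFrom", "colorTo", "sdk", "sdk_version", "app_file")


def check_frontmatter(readme_text, repo_dir="."):
    """Return a list of problems found in a README.md frontmatter block."""
    lines = readme_text.splitlines()
    # The opening '---' must be the very first line, with no leading spaces.
    if not lines or lines[0] != "---":
        return ["README.md must start with '---' on the first line"]
    try:
        end = lines.index("---", 1)
    except ValueError:
        return ["frontmatter has no closing '---'"]

    # Collect top-level "key: value" pairs; indented list items are skipped.
    fields = {}
    for line in lines[1:end]:
        match = re.match(r"^([A-Za-z_]+):\s*(.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()

    problems = [f"missing field: {key}" for key in REQUIRED_KEYS if key not in fields]

    version = fields.get("sdk_version", "")
    if version and not re.fullmatch(r'"[^"]+"|\'[^\']+\'', version):
        problems.append("sdk_version should be quoted")

    app_file = fields.get("app_file")
    if app_file and not (Path(repo_dir) / app_file).is_file():
        problems.append(f"app_file not found: {app_file}")
    return problems
```

Run it from the repository root with `check_frontmatter(Path("README.md").read_text(encoding="utf-8"))`; an empty list means none of these particular checks failed.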

---

## 🚨 **Common Configuration Errors**

### **❌ Missing Frontmatter**
```markdown
# 🚀 AI Dataset Studio  ← ERROR: No YAML header
```

### **✅ Correct Format**
```markdown
---
title: AI Dataset Studio
emoji: 🚀
sdk: gradio
---

# 🚀 AI Dataset Studio  ← Correct: Content after YAML
```

### **❌ Wrong SDK Version Format**
```yaml
sdk_version: 4.44.0  # ← Risky: unquoted; a two-part version such as 4.0 would be read as a number
```

### **✅ Correct Format**
```yaml
sdk_version: "4.44.0"  # ← Correct: Quoted string
```

### **❌ Invalid App File**
```yaml
app_file: main.py  # ← ERROR: File doesn't exist
```

### **✅ Correct Format**
```yaml
app_file: app.py  # ← Correct: Matches actual filename
```

---

## 🔄 **Updating Configuration**

To change your Space configuration:

1. **Edit README.md**
   - Update the YAML frontmatter
   - Commit changes to git

2. **Space will automatically rebuild**
   - Changes take effect once the rebuild finishes
   - Monitor build logs for errors

3. **Hardware changes**
   - Go to Space Settings
   - Change hardware tier
   - Restart Space

---

## 🎉 **Example Complete README.md Start**

Here's how your README.md should begin:

```markdown
---
title: AI Dataset Studio
emoji: 🚀
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: "4.44.0"
app_file: app.py
pinned: false
license: mit
tags:
  - machine-learning
  - datasets
  - nlp
  - perplexity-ai
  - data-science
---

# 🚀 AI Dataset Studio

**Create high-quality training datasets with AI-powered source discovery**

A comprehensive platform for building ML datasets that combines web scraping, AI processing, and smart source discovery using Perplexity AI...
```

---

## 💡 **Pro Tips**

1. **Choose memorable titles** - They appear in search results
2. **Use relevant emojis** - They make your Space stand out
3. **Pick good color combinations** - They create visual appeal
4. **Add comprehensive tags** - They improve discoverability
5. **Pin important Spaces** - They appear prominently on your profile
6. **Use appropriate licenses** - MIT or Apache-2.0 for open source

---

**Your Space configuration is now properly set up for deployment! 🚀**