MagicMeWizard committed on
Commit 29eba28 · verified · 1 Parent(s): fdb51e4

Create SPACE_CONFIG.md

Files changed (1)
  1. SPACE_CONFIG.md +389 -0
SPACE_CONFIG.md ADDED
@@ -0,0 +1,389 @@
+ # 🔧 HuggingFace Spaces Configuration Guide
+
+ **Essential configuration options for your AI Dataset Studio Space**
+
+ ---
+
+ ## 📋 **Required README.md Header**
+
+ Every HuggingFace Space **must** have this YAML frontmatter at the very beginning of README.md:
+
+ ### **Basic Configuration (Recommended)**
+ ```yaml
+ ---
+ title: AI Dataset Studio
+ emoji: 🚀
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: "4.44.0"
+ app_file: app.py
+ pinned: false
+ ---
+ ```
+
+ ### **Alternative Configurations**
+
+ #### **Professional/Business Version**
+ ```yaml
+ ---
+ title: Enterprise Dataset Studio
+ emoji: 🏒
+ colorFrom: gray
+ colorTo: blue
+ sdk: gradio
+ sdk_version: "4.44.0"
+ app_file: app.py
+ pinned: true
+ license: mit
+ tags:
+ - machine-learning
+ - datasets
+ - nlp
+ - data-science
+ - perplexity-ai
+ ---
+ ```
+
+ #### **Research/Academic Version**
+ ```yaml
+ ---
+ title: Research Dataset Creator
+ emoji: 🎓
+ colorFrom: green
+ colorTo: blue
+ sdk: gradio
+ sdk_version: "4.44.0"
+ app_file: app.py
+ pinned: false
+ license: apache-2.0
+ tags:
+ - research
+ - academic
+ - datasets
+ - nlp
+ - ai
+ ---
+ ```
+
+ #### **Creative/Colorful Version**
+ ```yaml
+ ---
+ title: AI Dataset Magic ✨
+ emoji: 🎨
+ colorFrom: pink
+ colorTo: purple
+ sdk: gradio
+ sdk_version: "4.44.0"
+ app_file: app.py
+ pinned: false
+ tags:
+ - datasets
+ - creative
+ - ai-tools
+ - machine-learning
+ ---
+ ```
+
+ ---
+
+ ## 🎨 **Configuration Options Explained**
+
+ ### **Required Fields**
+
+ | Field | Description | Example Values |
+ |-------|-------------|----------------|
+ | `title` | Space name displayed in UI | `AI Dataset Studio` |
+ | `emoji` | Icon shown next to title | `🚀`, `🤖`, `📊`, `🎯` |
+ | `colorFrom` | Gradient start color (only these eight names are accepted) | `red`, `yellow`, `green`, `blue`, `indigo`, `purple`, `pink`, `gray` |
+ | `colorTo` | Gradient end color (same eight names) | `purple`, `pink`, `yellow`, `blue` |
+ | `sdk` | Framework used | `gradio` (for our app) |
+ | `sdk_version` | SDK version | `"4.44.0"` |
+ | `app_file` | Main application file | `app.py` |
+
+ ### **Optional Fields**
+
+ | Field | Description | Example Values |
+ |-------|-------------|----------------|
+ | `pinned` | Pin to your profile | `true`, `false` |
+ | `license` | Software license | `mit`, `apache-2.0`, `gpl-3.0` |
+ | `tags` | Searchable keywords | `machine-learning`, `nlp`, `datasets` |
+ | `models` | Referenced models | `facebook/bart-large-cnn` |
+ | `datasets` | Referenced datasets | `imdb`, `sentiment140` |
+
+ ---
+
+ ## 🎯 **Popular Color Combinations**
+
+ ### **Professional Themes**
+ ```yaml
+ # Corporate Blue
+ colorFrom: blue
+ colorTo: indigo
+
+ # Business Gray
+ colorFrom: gray
+ colorTo: blue
+
+ # Tech Green
+ colorFrom: green
+ colorTo: blue
+ ```
+
+ ### **Creative Themes**
+ ```yaml
+ # Sunset
+ colorFrom: yellow
+ colorTo: red
+
+ # Ocean
+ colorFrom: blue
+ colorTo: indigo
+
+ # Forest
+ colorFrom: green
+ colorTo: yellow
+
+ # Galaxy
+ colorFrom: purple
+ colorTo: pink
+ ```
+
+ ### **AI/Tech Themes**
+ ```yaml
+ # Matrix
+ colorFrom: green
+ colorTo: gray
+
+ # Cyberpunk
+ colorFrom: purple
+ colorTo: blue
+
+ # Neural
+ colorFrom: blue
+ colorTo: purple
+ ```
+
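A color pair can be checked before it is committed. The sketch below assumes the eight color names that the Spaces configuration reference lists for `colorFrom` and `colorTo`; the function name `invalid_colors` is hypothetical and not part of any Hugging Face library.

```python
# Color names accepted for colorFrom / colorTo (assumption: the set listed
# in the Spaces configuration reference).
ALLOWED_COLORS = {"red", "yellow", "green", "blue", "indigo", "purple", "pink", "gray"}


def invalid_colors(color_from, color_to):
    """Return the color names that are not in the accepted set, in order."""
    return [color for color in (color_from, color_to) if color not in ALLOWED_COLORS]


if __name__ == "__main__":
    print(invalid_colors("blue", "indigo"))  # both names are accepted
    print(invalid_colors("green", "teal"))   # "teal" is reported
```

An empty list means the pair is safe to use; any name returned should be swapped for one of the eight accepted values.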
+ ---
+
+ ## 🏷️ **Recommended Tags**
+
+ ### **For AI Dataset Studio**
+ ```yaml
+ tags:
+ - machine-learning
+ - datasets
+ - nlp
+ - data-science
+ - perplexity-ai
+ - web-scraping
+ - sentiment-analysis
+ - text-classification
+ - ai-tools
+ - data-collection
+ ```
+
+ ### **By Use Case**
+
+ #### **Business/Enterprise**
+ ```yaml
+ tags:
+ - business-intelligence
+ - enterprise
+ - data-analytics
+ - market-research
+ - customer-insights
+ ```
+
+ #### **Research/Academic**
+ ```yaml
+ tags:
+ - research
+ - academic
+ - scientific
+ - literature-review
+ - research-tools
+ ```
+
+ #### **Developer Tools**
+ ```yaml
+ tags:
+ - developer-tools
+ - api
+ - automation
+ - productivity
+ - data-engineering
+ ```
+
+ ---
+
+ ## 📊 **Hardware Configuration**
+
+ Hardware is selected in the Space settings, not in the README.md frontmatter:
+
+ ### **Hardware Options**
+ ```yaml
+ # In Space settings (not README.md); hourly prices change, so confirm them before upgrading:
+ # - CPU Basic (free)
+ # - CPU Upgrade ($0.03/hour)
+ # - T4 Small ($0.60/hour) ← Recommended
+ # - T4 Medium ($1.20/hour)
+ # - A10G Small ($1.05/hour)
+ # - A10G Large ($3.15/hour)
+ ```
+
+ ### **Memory Requirements**
+ ```yaml
+ # Our application needs:
+ # - Base app: ~200MB
+ # - AI models: ~2-4GB
+ # - Processing: ~1-2GB
+ # Total: ~4-6GB recommended (T4 Small = 16GB)
+ ```
+
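The memory estimate above can be reproduced with a few lines of arithmetic. This is only a sketch: the component figures are the approximate ones from the guide, 1 GB is treated as 1000 MB for simplicity, and the names `COMPONENTS_MB`, `total_range_mb`, and `fits` are hypothetical.

```python
# Approximate (low, high) memory use per component in MB, taken from the guide.
COMPONENTS_MB = {
    "base_app": (200, 200),
    "ai_models": (2000, 4000),
    "processing": (1000, 2000),
}


def total_range_mb(components):
    """Sum the low and high estimates across all components."""
    low = sum(lo for lo, _ in components.values())
    high = sum(hi for _, hi in components.values())
    return low, high


def fits(components, available_mb):
    """True if even the high estimate fits in the available memory."""
    return total_range_mb(components)[1] <= available_mb


if __name__ == "__main__":
    low, high = total_range_mb(COMPONENTS_MB)
    print(f"{low / 1000:.1f}-{high / 1000:.1f} GB")  # prints "3.2-6.2 GB"
    print(fits(COMPONENTS_MB, 16_000))               # prints "True"
```

The raw sum comes to roughly 3.2 to 6.2 GB, so the guide's ~4-6GB recommendation rounds the low end up for headroom, and a 16GB tier covers the high end comfortably.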
+ ---
+
+ ## 🔐 **Environment Variables**
+
+ Set these in Space Settings → Repository secrets:
+
+ ### **Required**
+ ```bash
+ PERPLEXITY_API_KEY="your_perplexity_api_key_here"
+ ```
+
+ ### **Optional**
+ ```bash
+ # HuggingFace integration
+ HF_TOKEN="your_huggingface_token"
+
+ # Performance tuning
+ MAX_SOURCES_PER_SEARCH="50"
+ REQUEST_TIMEOUT="30"
+ LOG_LEVEL="INFO"
+
+ # Feature flags
+ ENABLE_DEBUG_MODE="false"
+ ENABLE_CACHING="true"
+ ```
+
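Secrets set this way reach the app as ordinary environment variables. The following is a minimal sketch of how `app.py` might read them, not the app's actual loader: the `Settings` class and `load_settings` function are hypothetical, and the fallback values simply reuse the example values shown above as assumed defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    perplexity_api_key: str
    hf_token: Optional[str]
    max_sources_per_search: int
    request_timeout: int
    log_level: str
    enable_debug_mode: bool
    enable_caching: bool


def _flag(value: str) -> bool:
    """Interpret common truthy spellings such as "true", "1", "yes", "on"."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables, failing fast on the required key."""
    env = os.environ if env is None else env
    api_key = env.get("PERPLEXITY_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("PERPLEXITY_API_KEY is not set; add it in the Space settings")
    return Settings(
        perplexity_api_key=api_key,
        hf_token=env.get("HF_TOKEN") or None,
        # Assumed defaults: the example values from the guide.
        max_sources_per_search=int(env.get("MAX_SOURCES_PER_SEARCH", "50")),
        request_timeout=int(env.get("REQUEST_TIMEOUT", "30")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        enable_debug_mode=_flag(env.get("ENABLE_DEBUG_MODE", "false")),
        enable_caching=_flag(env.get("ENABLE_CACHING", "true")),
    )
```

Calling `load_settings()` once at startup surfaces a missing key in the build logs instead of on the first search request.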
+ ---
+
+ ## ✅ **Validation Checklist**
+
+ Before deploying, ensure:
+
+ - [ ] ✅ YAML frontmatter is at the very beginning of README.md
+ - [ ] ✅ No spaces before the opening `---`
+ - [ ] ✅ Proper YAML syntax (quotes around version numbers)
+ - [ ] ✅ `app_file: app.py` matches your main file name
+ - [ ] ✅ SDK version matches your requirements.txt
+ - [ ] ✅ Title and emoji are appropriate for your audience
+ - [ ] ✅ Tags are relevant and searchable
+ - [ ] ✅ PERPLEXITY_API_KEY is set in Space secrets
+
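Several of these checks are mechanical and could be scripted before pushing. The sketch below uses only the standard library and a deliberately simple line-based reader rather than a real YAML parser, so it only understands flat `key: value` lines; the function names and the `REQUIRED_KEYS` tuple (taken from the guide's required-fields table) are assumptions for illustration.

```python
import re
from pathlib import Path

# Fields the guide lists as required.
REQUIRED_KEYS = ("title", "emoji", "colorFrom", "colorTo", "sdk", "sdk_version", "app_file")


def parse_frontmatter(text):
    """Return raw top-level `key: value` pairs, or None if there is no frontmatter."""
    lines = text.splitlines()
    if not lines or lines[0] != "---":  # must be the very first line, no leading spaces
        return None
    try:
        end = lines.index("---", 1)
    except ValueError:
        return None
    fields = {}
    for line in lines[1:end]:
        match = re.match(r"^([A-Za-z_]+):\s*(.*)$", line)
        if match:
            fields[match.group(1)] = match.group(2).strip()
    return fields


def check_readme(text, space_dir="."):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    fields = parse_frontmatter(text)
    if fields is None:
        return ["README.md must start with a '---' line followed by a closing '---'"]
    problems = [f"missing field: {key}" for key in REQUIRED_KEYS if key not in fields]
    version = fields.get("sdk_version", "")
    if version and not re.fullmatch(r"""(["']).+\1""", version):
        problems.append("sdk_version should be quoted")
    app_file = fields.get("app_file", "")
    if app_file and not (Path(space_dir) / app_file).is_file():
        problems.append(f"app_file not found: {app_file}")
    return problems
```

For example, `check_readme(Path("README.md").read_text(encoding="utf-8"))` run from the Space's root directory returns `[]` when the header passes these checks. Audience fit, tag quality, and the secret itself still need a manual look.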
+ ---
+
+ ## 🚨 **Common Configuration Errors**
+
+ ### **❌ Missing Frontmatter**
+ ```markdown
+ # 🚀 AI Dataset Studio ← ERROR: No YAML header
+ ```
+
+ ### **✅ Correct Format**
+ ```markdown
+ ---
+ title: AI Dataset Studio
+ emoji: 🚀
+ sdk: gradio
+ ---
+
+ # 🚀 AI Dataset Studio ← Correct: Content after YAML
+ ```
+
+ ### **❌ Wrong SDK Version Format**
+ ```yaml
+ sdk_version: 4.44.0  # ← Avoid: unquoted (a two-part version such as 4.40 would be read as the number 4.4)
+ ```
+
+ ### **✅ Correct Format**
+ ```yaml
+ sdk_version: "4.44.0"  # ← Correct: quoted string
+ ```
+
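The reason for quoting is numeric coercion. The sketch below is a rough stand-in for how a YAML loader resolves an unquoted scalar (a number if it parses as one, otherwise a string); it uses Python's `float` rather than a real YAML parser, and `read_unquoted` is a hypothetical helper.

```python
def read_unquoted(raw: str):
    """Roughly mimic scalar resolution: return a float when the text parses as one."""
    try:
        return float(raw)
    except ValueError:
        return raw


if __name__ == "__main__":
    for raw in ("4.40", "4.44.0"):
        value = read_unquoted(raw)
        print(f"{raw!r} -> {value!r} ({type(value).__name__})")
    # '4.40' -> 4.4 (float)
    # '4.44.0' -> '4.44.0' (str)
```

A three-part version happens to survive unquoted, but a two-part one silently loses its trailing zero, so quoting every version string is the safer habit.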
+ ### **❌ Invalid App File**
+ ```yaml
+ app_file: main.py  # ← ERROR: file doesn't exist in the repository
+ ```
+
+ ### **✅ Correct Format**
+ ```yaml
+ app_file: app.py  # ← Correct: matches the actual filename
+ ```
+
+ ---
+
+ ## 🔄 **Updating Configuration**
+
+ To change your Space configuration:
+
+ 1. **Edit README.md**
+    - Update the YAML frontmatter
+    - Commit and push the change
+
+ 2. **Space will automatically rebuild**
+    - Changes take effect once the rebuild finishes
+    - Monitor build logs for errors
+
+ 3. **Hardware changes**
+    - Go to Space Settings
+    - Change hardware tier
+    - Restart Space
+
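Step 1 can also be scripted when the same field has to change across several Spaces. This is a minimal sketch that edits the README text directly and leaves committing to git as usual; `set_frontmatter_field` is a hypothetical helper that handles only flat `key: value` lines, not nested YAML.

```python
import re


def set_frontmatter_field(readme_text, key, value):
    """Return README text with `key: value` set inside the leading frontmatter."""
    lines = readme_text.split("\n")
    if not lines or lines[0] != "---" or "---" not in lines[1:]:
        raise ValueError("README.md has no YAML frontmatter")
    end = lines.index("---", 1)  # index of the closing delimiter
    new_line = f"{key}: {value}"
    for i in range(1, end):
        if re.match(rf"^{re.escape(key)}\s*:", lines[i]):
            lines[i] = new_line  # replace the existing field in place
            break
    else:
        lines.insert(end, new_line)  # append just before the closing '---'
    return "\n".join(lines)
```

For instance, `set_frontmatter_field(text, "pinned", "true")` flips the pin flag and leaves every other line of the README untouched.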
+ ---
+
+ ## 🎉 **Example Complete README.md Start**
+
+ Here's how your README.md should begin:
+
+ ```markdown
+ ---
+ title: AI Dataset Studio
+ emoji: 🚀
+ colorFrom: blue
+ colorTo: purple
+ sdk: gradio
+ sdk_version: "4.44.0"
+ app_file: app.py
+ pinned: false
+ license: mit
+ tags:
+ - machine-learning
+ - datasets
+ - nlp
+ - perplexity-ai
+ - data-science
+ ---
+
+ # 🚀 AI Dataset Studio
+
+ **Create high-quality training datasets with AI-powered source discovery**
+
+ A comprehensive platform for building ML datasets that combines web scraping, AI processing, and smart source discovery using Perplexity AI...
+ ```
+
+ ---
+
+ ## 💡 **Pro Tips**
+
+ 1. **Choose memorable titles** - They appear in search results
+ 2. **Use relevant emojis** - They make your Space stand out
+ 3. **Pick good color combinations** - They create visual appeal
+ 4. **Add comprehensive tags** - They improve discoverability
+ 5. **Pin important Spaces** - They appear prominently on your profile
+ 6. **Use appropriate licenses** - MIT or Apache-2.0 for open source
+
+ ---
+
+ **Your Space configuration is now properly set up for deployment! 🚀**