![Hugging Face x Scikit-learn](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hfxsklearn.png)
In this sprint, we will build interactive demos from the scikit-learn documentation and, afterwards, contribute the demos directly to the docs.
## Important Dates
🌅 Sprint Start Date: Apr 12, 2023
🌃 Sprint Finish Date: Apr 30, 2023
## To get started 🤩
1. Join our [Discord](https://huggingface.co/join/discord) and take the role #sklearn-sprint-participant by selecting "Sklearn Working Group" in the #role-assignment channel. Then, meet us in the #sklearn-sprint channel.
2. Head to [this page](https://scikit-learn.org/stable/auto_examples/) and pick an example you’d like to build on.
3. Leave a comment on [this spreadsheet](https://docs.google.com/spreadsheets/d/14EThtIyF4KfpU99Fm2EW3Rz9t6SSEqDyzV4jmw3fjyI/edit?usp=sharing) with your name under the Owner column to claim the example. The spreadsheet has a limited number of examples. If your example isn’t in the spreadsheet, feel free to add it with a comment.
4. Start building!
We will be hosting our applications in the [scikit-learn](https://huggingface.co/sklearn-docs) organization on Hugging Face.
For complete beginners: on the Hugging Face Hub, there are repositories for models, datasets, and [Spaces](https://huggingface.co/spaces). Spaces are a special type of repository that hosts ML applications, such as a demo showcasing a model. To write our apps, we will only be using Gradio. [Gradio](https://gradio.app/) is a library that lets you build a front-end application for your models entirely in Python, and it supports many libraries. In this sprint, we will mostly be using its visualization support (`matplotlib`, `plotly`, `altair` and more) and its [skops](https://skops.readthedocs.io/en/stable/) integration (which lets you launch an interface for a given classification or regression model with one line of code).
In Gradio, there are two ways to create a demo. One is to use `Interface`, which is a very simple abstraction. Let’s see an example.
```python
import gradio as gr

# implement your classifier here
# (clf, X_train and y_train are placeholders that you define yourself)
clf.fit(X_train, y_train)


def cancer_classifier(df):
    # simply infer and return predictions
    predictions = clf.predict(df)
    return predictions


gr.Interface(fn=cancer_classifier, inputs="dataframe",
             outputs="label").launch()
# save this in a file called app.py
# then run it with: python app.py
```
This will result in the following interface:
![Simple Interface](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/interface.png)
This is very customizable. You can specify rows and columns, add a title and description, an example input, and more. There’s a more detailed guide [here](https://gradio.app/using-gradio-for-tabular-workflows/).
Another way of creating an application is to use [Blocks](https://gradio.app/quickstart/#blocks-more-flexibility-and-control). You can see usage of Blocks in the example applications linked in this guide.
After we create our application, we will create a Space. Go to [hf.co](http://huggingface.co), click on your profile picture at the top right, and select “New Space”.
![New Space](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/new_space.png)
We can name our Space, pick a license and select Space SDK as “Gradio”. Free hardware is enough for our app, so no need to change it.
![Space Configuration](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/space_config.png)
After creating the Space, you have three options:
* You can clone the repository locally, add your files, and then push them to the Hub.
* You can do all your coding directly in the browser.
* (shown below) You can do the coding locally and then drag and drop your application file to the Hub.
![Space Config](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/space_config.png)
To upload your application file, pick “Add File” and drag and drop your file.
![New Space Landing](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/space_landing.png)
Lastly, if your application uses any library other than Gradio, create a file called `requirements.txt` and list the requirements as shown below:
```text
matplotlib==3.6.3
scikit-learn==1.2.1
```
And your app should be up and running!
**Example Submissions**
We have left a couple of examples below (there are more at the end of this page).
Documentation page for comparing linkage methods for hierarchical clustering and an example Space built on it 👇🏼
[Comparing different hierarchical linkage methods on toy datasets](https://scikit-learn.org/stable/auto_examples/cluster/plot_linkage_comparison.html#sphx-glr-auto-examples-cluster-plot-linkage-comparison-py)
[Hierarchical Clustering Linkage - a Hugging Face Space by scikit-learn](https://huggingface.co/spaces/scikit-learn/hierarchical-clustering-linkage)
Note: If you're training a model from scratch for your demo (e.g. training an image classifier), you can push it to the Hub using [skops](https://skops.readthedocs.io/en/stable/) and build a Gradio demo on top of it. For such a submission, we expect a model repository with a model card and the model weights, as well as a simple Space with an interface that receives input and outputs results. You can use this tutorial to get started with [skops](https://www.kdnuggets.com/2023/02/skops-new-library-improve-scikitlearn-production.html).
You can find an example submission for a model repository below.
[scikit-learn/cancer-prediction-trees · Hugging Face](https://huggingface.co/scikit-learn/cancer-prediction-trees)
5. After the demos are done, we will open pull requests against the scikit-learn documentation in [scikit-learn’s repository](https://github.com/scikit-learn/scikit-learn) to contribute our application code directly to the documentation. We will help you out if this is your first open source contribution. 🤗
**If you need any help**, you can join our Discord server, take the collaborate role, and ask questions in the `sklearn-sprint` channel 🤗🫂
### Sprint Prizes
We will be giving out the following vouchers, which can be spent at the [Hugging Face Store](https://store.huggingface.co/) (shipping included):
- A $20 voucher for everyone who builds three demos,
- A $40 voucher for everyone who builds five demos.