
![Hugging Face x Scikit-learn](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hfxsklearn.png)

In this sprint, we will build interactive demos from the scikit-learn documentation and, afterwards, contribute the demos directly to the docs.

## Important Dates

🌅 Sprint Start Date: Apr 12, 2023
🌃 Sprint Finish Date: Apr 30, 2023

## To get started 🤩

1. Join our [Discord](https://huggingface.co/join/discord) and take the #sklearn-sprint-participant role by selecting "Sklearn Working Group" in the #role-assignment channel. Then meet us in the #sklearn-sprint channel.
2. Head to [this page](https://scikit-learn.org/stable/auto_examples/) and pick an example you’d like to build on. 
3. Leave a comment on [this spreadsheet](https://docs.google.com/spreadsheets/d/14EThtIyF4KfpU99Fm2EW3Rz9t6SSEqDyzV4jmw3fjyI/edit?usp=sharing) with your name under the Owner column to claim the example. The spreadsheet lists a limited number of examples; if yours isn't there, feel free to add it with a comment.
4. Start building!
    
    We will be hosting our applications in the [scikit-learn](https://huggingface.co/sklearn-docs) organization on Hugging Face.
    
    For complete beginners: the Hugging Face Hub hosts repositories for models, datasets, and [Spaces](https://huggingface.co/spaces). Spaces are a special type of repository that hosts ML applications, such as a demo showcasing a model. To write our apps, we will only be using Gradio. [Gradio](https://gradio.app/) is a library that lets you build a front-end application for your models entirely in Python, and it supports many libraries. In this sprint, we will mostly use its visualization support (`matplotlib`, `plotly`, `altair` and more) and its [skops](https://skops.readthedocs.io/en/stable/) integration (which lets you launch an interface for a given classification or regression model with one line of code).
    
    In Gradio, there are two ways to create a demo. One is to use `Interface`, which is a very simple abstraction. Let’s see an example.
    
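    ```python
    import gradio as gr
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    # train your classifier here (this dataset and model are placeholders; swap in your own)
    X, y = load_breast_cancer(return_X_y=True, as_frame=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    clf = DecisionTreeClassifier(random_state=0)
    clf.fit(X_train, y_train)

    def cancer_classifier(df):
        # simply infer and return predictions
        predictions = clf.predict(df)
        return predictions

    gr.Interface(fn=cancer_classifier, inputs="dataframe",
                 outputs="label").launch()

    # save this in a file called app.py
    # then run it with: python app.py
    ```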
    
    This will result in the following interface:
    
    ![Simple Interface](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/interface.png)
    
    This is very customizable. You can specify rows and columns, add a title and description, an example input, and more. There’s a more detailed guide [here](https://gradio.app/using-gradio-for-tabular-workflows/). 
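
    As a rough illustration of those options, here is a minimal sketch of an `Interface` with a title, a description, example inputs, and a `matplotlib` plot as output. The function names `plot_clusters` and `build_demo`, the toy dataset, and the slider ranges are assumptions made for this example, not part of any sprint template.

    ```python
    import matplotlib
    matplotlib.use("Agg")  # headless backend, suitable for a server without a display
    import matplotlib.pyplot as plt
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs


    def plot_clusters(n_clusters, n_samples):
        """Cluster synthetic blobs with KMeans and return a matplotlib figure."""
        n_clusters, n_samples = int(n_clusters), int(n_samples)
        X, _ = make_blobs(n_samples=n_samples, centers=4, random_state=0)
        labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(X)
        fig, ax = plt.subplots(figsize=(5, 4))
        ax.scatter(X[:, 0], X[:, 1], c=labels, s=15)
        ax.set_title(f"KMeans with {n_clusters} clusters")
        return fig


    def build_demo():
        """Wrap plot_clusters in a Gradio Interface (hypothetical helper)."""
        import gradio as gr  # imported here so plot_clusters can be used without Gradio

        return gr.Interface(
            fn=plot_clusters,
            inputs=[
                gr.Slider(2, 8, value=4, step=1, label="Number of clusters"),
                gr.Slider(100, 1000, value=300, step=50, label="Number of samples"),
            ],
            outputs=gr.Plot(label="Clustering result"),
            title="KMeans on toy blobs",
            description="Move the sliders to re-run the clustering.",
            examples=[[3, 300], [5, 600]],
        )

    # In app.py, finish with: build_demo().launch()
    ```

    Keeping the plotting function separate from the Gradio wiring makes it easy to check the figure locally before deploying.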
    
    Another way of creating an application is to use [Blocks](https://gradio.app/quickstart/#blocks-more-flexibility-and-control). You can see usage of Blocks in the example applications linked in this guide. 
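
    For orientation, a minimal Blocks sketch could look like the following. The layout, the function names `train_and_score` and `build_demo`, and the two-moons dataset are assumptions chosen for illustration; the example applications may be structured differently.

    ```python
    from sklearn.datasets import make_moons
    from sklearn.tree import DecisionTreeClassifier


    def train_and_score(max_depth, noise):
        """Fit a decision tree on noisy two-moons data and report training accuracy."""
        X, y = make_moons(n_samples=300, noise=float(noise), random_state=0)
        clf = DecisionTreeClassifier(max_depth=int(max_depth), random_state=0).fit(X, y)
        return f"Training accuracy: {clf.score(X, y):.3f}"


    def build_demo():
        """Arrange inputs, a button, and an output with Blocks (hypothetical helper)."""
        import gradio as gr  # imported here so train_and_score can be used without Gradio

        with gr.Blocks() as demo:
            gr.Markdown("## Decision tree depth on two moons")
            with gr.Row():
                depth = gr.Slider(1, 10, value=3, step=1, label="max_depth")
                noise = gr.Slider(0.0, 0.5, value=0.2, step=0.05, label="noise")
            result = gr.Textbox(label="Result")
            button = gr.Button("Train")
            # run train_and_score when the button is clicked
            button.click(fn=train_and_score, inputs=[depth, noise], outputs=result)
        return demo

    # In app.py, finish with: build_demo().launch()
    ```

    Unlike `Interface`, Blocks leaves the layout and the event wiring (here, the button click) up to you.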
    
    After we create our application, we will create a Space. Go to [hf.co](http://huggingface.co), click on your profile picture in the top right, and select “New Space”.
    
    ![New Space](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/new_space.png)
    
    We can name our Space, pick a license, and select “Gradio” as the Space SDK. The free hardware is enough for our app, so there is no need to change it.
    
    ![Space Configuration](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/space_config.png)
    
    After creating the Space, you have three options:
    * You can clone the repository locally, add your files, and then push them to the Hub.
    * You can do all your coding directly in the browser.
    * (shown below) You can do the coding locally and then drag and drop your application file to the Hub.
    
    ![Space Config](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/space_config.png)
    
    To upload your application file, pick “Add File” and drag and drop your file.
    
    ![New Space Landing](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/space_landing.png)
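
    If you would rather script the upload than drag and drop, a sketch using the `huggingface_hub` client is shown here. It assumes `huggingface_hub` is installed and that you are already logged in with a token that can write to the Space; the helper names and the `repo_id` value are hypothetical placeholders.

    ```python
    from pathlib import Path


    def collect_space_files(folder):
        """Return the files a minimal Gradio Space needs, if they exist in folder."""
        wanted = ["app.py", "requirements.txt"]
        return [Path(folder) / name for name in wanted if (Path(folder) / name).exists()]


    def upload_to_space(folder, repo_id):
        """Upload the collected files to an existing Space (needs network access and a login)."""
        from huggingface_hub import HfApi  # imported here so the helper above works offline

        api = HfApi()
        for path in collect_space_files(folder):
            api.upload_file(
                path_or_fileobj=str(path),
                path_in_repo=path.name,
                repo_id=repo_id,      # e.g. "your-username/your-space" (placeholder)
                repo_type="space",
            )

    # Example call, with a placeholder Space name:
    # upload_to_space(".", "your-username/your-space")
    ```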
    
    Lastly, if your application uses any library other than Gradio, create a file called `requirements.txt` and list the requirements as shown below:
    
    ```text
    matplotlib==3.6.3
    scikit-learn==1.2.1
    ```
    
    And your app should be up and running!
    
    **Example Submissions**
    
    We have left a couple of examples below (there are more at the end of this page).
    Here is the documentation page for comparing linkage methods for hierarchical clustering, along with an example Space built on it 👇🏼
    
    [Comparing different hierarchical linkage methods on toy datasets](https://scikit-learn.org/stable/auto_examples/cluster/plot_linkage_comparison.html#sphx-glr-auto-examples-cluster-plot-linkage-comparison-py)
    
    [Hierarchical Clustering Linkage - a Hugging Face Space by scikit-learn](https://huggingface.co/spaces/scikit-learn/hierarchical-clustering-linkage)
    
    Note: If your demo involves training a model from scratch (e.g. training an image classifier), you can push the model to the Hub using [skops](https://skops.readthedocs.io/en/stable/) and build a Gradio demo on top of it. For such a submission, we expect a model repository with a model card and the model weights, as well as a simple Space with an interface that receives input and outputs results. You can use [this tutorial](https://www.kdnuggets.com/2023/02/skops-new-library-improve-scikitlearn-production.html) to get started with skops.
    
    You can find an example submission for a model repository below.
    
    [scikit-learn/cancer-prediction-trees · Hugging Face](https://huggingface.co/scikit-learn/cancer-prediction-trees)
    
5. After the demos are done, we will open pull requests to [scikit-learn’s repository](https://github.com/scikit-learn/scikit-learn) to contribute our application code directly to the documentation. We will help you out if this is your first open source contribution. 🤗

**If you need any help**, join our Discord server, take the collaborate role, and ask your questions in the `sklearn-sprint` channel 🤗🫂

### Sprint Prizes
We will be giving out the following vouchers, which can be spent at the [Hugging Face Store](https://store.huggingface.co/), shipping included:
- a $20 voucher for everyone who builds three demos,
- a $40 voucher for everyone who builds five demos.