Update app.py
app.py
CHANGED
@@ -1,11 +1,5 @@
 import streamlit as st
-# %%
-Datasets installation
-! pip install datasets transformers
-To install from source instead of the last release, comment the command above and uncomment the following one.
-! pip install git+https://github.com/huggingface/datasets.git
 
-%% [markdown]
 # Quickstart
 
 %% [markdown]
@@ -36,44 +30,7 @@ Check out [Chapter 5](https://huggingface.co/course/chapter5/1?fw=pt) of the Hug
 
 </Tip>
 
-Start by installing 🤗 Datasets:
-
-```bash
-pip install datasets
-```
-
-🤗 Datasets also support audio and image data formats:
-
-* To work with audio datasets, install the [Audio](https://huggingface.co/docs/datasets/main/en/package_reference/main_classes#datasets.Audio) feature:
-
-```bash
-pip install datasets[audio]
-```
-
-* To work with image datasets, install the [Image](https://huggingface.co/docs/datasets/main/en/package_reference/main_classes#datasets.Image) feature:
-
-```bash
-pip install datasets[vision]
-```
 
-Besides 🤗 Datasets, make sure your preferred machine learning framework is installed:
-
-```bash
-pip install torch
-```
-```bash
-pip install tensorflow
-```
-
-%% [markdown]
-## Audio
-
-%% [markdown]
-Audio datasets are loaded just like text datasets. However, an audio dataset is preprocessed a bit differently. Instead of a tokenizer, you'll need a [feature extractor](https://huggingface.co/docs/transformers/main_classes/feature_extractor#feature-extractor). An audio input may also require resampling its sampling rate to match the sampling rate of the pretrained model you're using. In this quickstart, you'll prepare the [MInDS-14](https://huggingface.co/datasets/PolyAI/minds14) dataset for a model train on and classify the banking issue a customer is having.
-
-**1**. Load the MInDS-14 dataset by providing the [load_dataset()](https://huggingface.co/docs/datasets/main/en/package_reference/loading_methods#datasets.load_dataset) function with the dataset name, dataset configuration (not all datasets will have a configuration), and a dataset split:
-
-# %%
 from datasets import load_dataset, Audio
 
 dataset = load_dataset("PolyAI/minds14", "en-US", split="train", trust_remote_code=True)