Search the Hub

In this tutorial, you will learn how to search for models, datasets, and Spaces on the Hub using huggingface_hub.

How to list repositories?

The huggingface_hub library includes an HTTP client, HfApi, to interact with the Hub. Among other things, it can list the models, datasets, and Spaces stored on the Hub:

>>> from huggingface_hub import HfApi
>>> api = HfApi()
>>> models = api.list_models()

The output of list_models() is an iterator over the models stored on the Hub.

Similarly, you can use list_datasets() to list datasets and list_spaces() to list Spaces.
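Because the result is an iterator rather than a list, it can be consumed a few items at a time instead of being materialized in full. The sketch below illustrates one way to do that with the standard library; the helper names `take` and `print_first_model_ids` are hypothetical and not part of huggingface_hub, and the second function assumes network access to the Hub.

```python
from itertools import islice


def take(iterator, n):
    """Return at most the first n items of an iterator as a list."""
    return list(islice(iterator, n))


def print_first_model_ids(n=3):
    """Print the ids of the first n models listed by the Hub (needs network access)."""
    # Imported lazily so that take() stays usable without huggingface_hub installed.
    from huggingface_hub import HfApi

    api = HfApi()
    for model in take(api.list_models(), n):
        print(model.id)
```

Calling `print_first_model_ids()` stops reading from the iterator after three items. The same pattern should apply to list_datasets() and list_spaces().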

How to filter repositories?

Listing repositories is useful, but you will often want to narrow your search. The list helpers accept several parameters, such as:

  • filter
  • author
  • search

Let’s look at an example that gets all models on the Hub that do image classification, have been trained on the imagenet dataset, and run with PyTorch.

# Reuses the `api` client created in the listing example.
models = api.list_models(
    task="image-classification",
    library="pytorch",
    trained_dataset="imagenet",
)
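The author and search parameters mentioned in the list above can be combined in the same way as the example above. The following is a minimal sketch, not an official recipe: `search_model_ids` is a hypothetical helper, and it assumes that `api` is an HfApi instance (or any object exposing a compatible `list_models` method).

```python
def search_model_ids(api, author=None, search=None, limit=5):
    """Return the ids of models matching the optional author and search terms.

    `api` is assumed to be an HfApi instance; only parameters that were
    actually provided are forwarded to list_models().
    """
    query = {"limit": limit}
    if author is not None:
        query["author"] = author
    if search is not None:
        query["search"] = search
    return [model.id for model in api.list_models(**query)]


# Example call (requires huggingface_hub and network access):
#   from huggingface_hub import HfApi
#   search_model_ids(HfApi(), author="google", search="bert")
```

Passing the client in as an argument keeps the helper easy to try against a stand-in object before pointing it at the real Hub.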

While filtering, you can also sort the results and keep only the top ones. For example, the following call fetches the five most downloaded datasets on the Hub:

>>> from huggingface_hub import list_datasets
>>> list(list_datasets(sort="downloads", direction=-1, limit=5))
[DatasetInfo(
    id='argilla/databricks-dolly-15k-curated-en',
    author='argilla',
    sha='4dcd1dedbe148307a833c931b21ca456a1fc4281',
    last_modified=datetime.datetime(2023, 10, 2, 12, 32, 53, tzinfo=datetime.timezone.utc),
    private=False,
    downloads=8889377,
    (...)

To explore the filters available on the Hub, visit the models and datasets pages in your browser, search with some parameters, and look at the values that appear in the URL.
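Reading the values out of a copied URL can also be done programmatically. The sketch below uses only the standard library; the URL and its parameter names (`pipeline_tag`, `library`, `sort`) are illustrative assumptions about what the browser might show, and they do not necessarily map one-to-one onto the arguments of the list helpers.

```python
from urllib.parse import parse_qs, urlsplit


def filters_from_url(url):
    """Return the query parameters of a Hub listing URL as a dict of lists."""
    return parse_qs(urlsplit(url).query)


# Illustrative URL, as it might be copied from the browser's address bar.
example_url = (
    "https://huggingface.co/models"
    "?pipeline_tag=image-classification&library=pytorch&sort=downloads"
)
print(filters_from_url(example_url))
# {'pipeline_tag': ['image-classification'], 'library': ['pytorch'], 'sort': ['downloads']}
```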
