Spaces:
Running
Running
[](#downloading-files)Downloading files | |
======================================= | |
[](#download-a-single-file)Download a single file | |
------------------------------------------------- | |
### [](#huggingface_hub.hf_hub_download)hf\_hub\_download | |
#### huggingface\_hub.hf\_hub\_download | |
[](#huggingface_hub.hf_hub_download)[< source \>](https://github.com/huggingface/huggingface_hub/blob/v0.29.2/src/huggingface_hub/file_download.py#L663) | |
( repo\_id: strfilename: strsubfolder: typing.Optional\[str\] = Nonerepo\_type: typing.Optional\[str\] = Nonerevision: typing.Optional\[str\] = Nonelibrary\_name: typing.Optional\[str\] = Nonelibrary\_version: typing.Optional\[str\] = Nonecache\_dir: typing.Union\[str, pathlib.Path, NoneType\] = Nonelocal\_dir: typing.Union\[str, pathlib.Path, NoneType\] = Noneuser\_agent: typing.Union\[typing.Dict, str, NoneType\] = Noneforce\_download: bool = Falseproxies: typing.Optional\[typing.Dict\] = Noneetag\_timeout: float = 10token: typing.Union\[bool, str, NoneType\] = Nonelocal\_files\_only: bool = Falseheaders: typing.Optional\[typing.Dict\[str, str\]\] = Noneendpoint: typing.Optional\[str\] = Noneresume\_download: typing.Optional\[bool\] = Noneforce\_filename: typing.Optional\[str\] = Nonelocal\_dir\_use\_symlinks: typing.Union\[bool, typing.Literal\['auto'\]\] = 'auto' ) β export const metadata = 'undefined';`str` | |
Expand 16 parameters | |
Parameters | |
* [](#huggingface_hub.hf_hub_download.repo_id)**repo\_id** (`str`) β A user or an organization name and a repo name separated by a `/`. | |
* [](#huggingface_hub.hf_hub_download.filename)**filename** (`str`) β The name of the file in the repo. | |
* [](#huggingface_hub.hf_hub_download.subfolder)**subfolder** (`str`, _optional_) β An optional value corresponding to a folder inside the model repo. | |
* [](#huggingface_hub.hf_hub_download.repo_type)**repo\_type** (`str`, _optional_) β Set to `"dataset"` or `"space"` if downloading from a dataset or space, `None` or `"model"` if downloading from a model. Default is `None`. | |
* [](#huggingface_hub.hf_hub_download.revision)**revision** (`str`, _optional_) β An optional Git revision id which can be a branch name, a tag, or a commit hash. | |
* [](#huggingface_hub.hf_hub_download.library_name)**library\_name** (`str`, _optional_) β The name of the library to which the object corresponds. | |
* [](#huggingface_hub.hf_hub_download.library_version)**library\_version** (`str`, _optional_) β The version of the library. | |
* [](#huggingface_hub.hf_hub_download.cache_dir)**cache\_dir** (`str`, `Path`, _optional_) β Path to the folder where cached files are stored. | |
* [](#huggingface_hub.hf_hub_download.local_dir)**local\_dir** (`str` or `Path`, _optional_) β If provided, the downloaded file will be placed under this directory. | |
* [](#huggingface_hub.hf_hub_download.user_agent)**user\_agent** (`dict`, `str`, _optional_) β The user-agent info in the form of a dictionary or a string. | |
* [](#huggingface_hub.hf_hub_download.force_download)**force\_download** (`bool`, _optional_, defaults to `False`) β Whether the file should be downloaded even if it already exists in the local cache. | |
* [](#huggingface_hub.hf_hub_download.proxies)**proxies** (`dict`, _optional_) β Dictionary mapping protocol to the URL of the proxy passed to `requests.request`. | |
* [](#huggingface_hub.hf_hub_download.etag_timeout)**etag\_timeout** (`float`, _optional_, defaults to `10`) β When fetching ETag, how many seconds to wait for the server to send data before giving up which is passed to `requests.request`. | |
* [](#huggingface_hub.hf_hub_download.token)**token** (`str`, `bool`, _optional_) β A token to be used for the download. | |
* If `True`, the token is read from the HuggingFace config folder. | |
* If a string, itβs used as the authentication token. | |
* [](#huggingface_hub.hf_hub_download.local_files_only)**local\_files\_only** (`bool`, _optional_, defaults to `False`) β If `True`, avoid downloading the file and return the path to the local cached file if it exists. | |
* [](#huggingface_hub.hf_hub_download.headers)**headers** (`dict`, _optional_) β Additional headers to be sent with the request. | |
Returns | |
export const metadata = 'undefined'; | |
`str` | |
export const metadata = 'undefined'; | |
Local path of file or if networking is off, last version of file cached on disk. | |
Raises | |
export const metadata = 'undefined'; | |
[RepositoryNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.RepositoryNotFoundError) or [RevisionNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.RevisionNotFoundError) or [EntryNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.EntryNotFoundError) or [LocalEntryNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.LocalEntryNotFoundError) or `EnvironmentError` or `OSError` or `ValueError` | |
export const metadata = 'undefined'; | |
* [RepositoryNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.RepositoryNotFoundError) β If the repository to download from cannot be found. This may be because it doesnβt exist, or because it is set to `private` and you do not have access. | |
* [RevisionNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.RevisionNotFoundError) β If the revision to download from cannot be found. | |
* [EntryNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.EntryNotFoundError) β If the file to download cannot be found. | |
* [LocalEntryNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.LocalEntryNotFoundError) β If network is disabled or unavailable and file is not found in cache. | |
* [`EnvironmentError`](https://docs.python.org/3/library/exceptions.html#EnvironmentError) β If `token=True` but the token cannot be found. | |
* [`OSError`](https://docs.python.org/3/library/exceptions.html#OSError) β If ETag cannot be determined. | |
* [`ValueError`](https://docs.python.org/3/library/exceptions.html#ValueError) β If some parameter value is invalid. | |
Download a given file if itβs not already present in the local cache. | |
The new cache file layout looks like this: | |
* The cache directory contains one subfolder per repo\_id (namespaced by repo type) | |
* inside each repo folder: | |
* refs is a list of the latest known revision => commit\_hash pairs | |
* blobs contains the actual file blobs (identified by their git-sha or sha256, depending on whether theyβre LFS files or not) | |
* snapshots contains one subfolder per commit, each βcommitβ contains the subset of the files that have been resolved at that particular commit. Each filename is a symlink to the blob at that particular commit. | |
[](#huggingface_hub.hf_hub_download.example) | |
Copied | |
\[ 96\] . | |
βββ \[ 160\] models\--julien-c--EsperBERTo-small | |
βββ \[ 160\] blobs | |
β βββ \[321M\] 403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd | |
β βββ \[ 398\] 7cb18dc9bafbfcf74629a4b760af1b160957a83e | |
β βββ \[1.4K\] d7edf6bd2a681fb0175f7735299831ee1b22b812 | |
βββ \[ 96\] refs | |
β βββ \[ 40\] main | |
βββ \[ 128\] snapshots | |
βββ \[ 128\] 2439f60ef33a0d46d85da5001d52aeda5b00ce9f | |
β βββ \[ 52\] README.md -> ../../blobs/d7edf6bd2a681fb0175f7735299831ee1b22b812 | |
β βββ \[ 76\] pytorch\_model.bin -> ../../blobs/403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd | |
βββ \[ 128\] bbc77c8132af1cc5cf678da3f1ddf2de43606d48 | |
βββ \[ 52\] README.md -> ../../blobs/7cb18dc9bafbfcf74629a4b760af1b160957a83e | |
βββ \[ 76\] pytorch\_model.bin -> ../../blobs/403450e234d65943a7dcf7e05a771ce3c92faa84dd07db4ac20f592037a1e4bd | |
If `local_dir` is provided, the file structure from the repo will be replicated in this location. When using this option, the `cache_dir` will not be used and a `.cache/huggingface/` folder will be created at the root of `local_dir` to store some metadata related to the downloaded files. While this mechanism is not as robust as the main cache-system, itβs optimized for regularly pulling the latest version of a repository. | |
### [](#huggingface_hub.hf_hub_url)hf\_hub\_url | |
#### huggingface\_hub.hf\_hub\_url | |
[](#huggingface_hub.hf_hub_url)[< source \>](https://github.com/huggingface/huggingface_hub/blob/v0.29.2/src/huggingface_hub/file_download.py#L171) | |
( repo\_id: strfilename: strsubfolder: typing.Optional\[str\] = Nonerepo\_type: typing.Optional\[str\] = Nonerevision: typing.Optional\[str\] = Noneendpoint: typing.Optional\[str\] = None ) | |
Parameters | |
* [](#huggingface_hub.hf_hub_url.repo_id)**repo\_id** (`str`) β A namespace (user or an organization) name and a repo name separated by a `/`. | |
* [](#huggingface_hub.hf_hub_url.filename)**filename** (`str`) β The name of the file in the repo. | |
* [](#huggingface_hub.hf_hub_url.subfolder)**subfolder** (`str`, _optional_) β An optional value corresponding to a folder inside the repo. | |
* [](#huggingface_hub.hf_hub_url.repo_type)**repo\_type** (`str`, _optional_) β Set to `"dataset"` or `"space"` if downloading from a dataset or space, `None` or `"model"` if downloading from a model. Default is `None`. | |
* [](#huggingface_hub.hf_hub_url.revision)**revision** (`str`, _optional_) β An optional Git revision id which can be a branch name, a tag, or a commit hash. | |
Construct the URL of a file from the given information. | |
The resolved address can either be a huggingface.co-hosted url, or a link to Cloudfront (a Content Delivery Network, or CDN) for large files which are more than a few MBs. | |
[](#huggingface_hub.hf_hub_url.example) | |
Example: | |
Copied | |
\>>> from huggingface\_hub import hf\_hub\_url | |
\>>> hf\_hub\_url( | |
... repo\_id="julien-c/EsperBERTo-small", filename="pytorch\_model.bin" | |
... ) | |
'https://huggingface.co/julien-c/EsperBERTo-small/resolve/main/pytorch\_model.bin' | |
Notes: | |
Cloudfront is replicated over the globe so downloads are way faster for the end user (and it also lowers our bandwidth costs). | |
Cloudfront aggressively caches files by default (default TTL is 24 hours), however this is not an issue here because we implement a git-based versioning system on huggingface.co, which means that we store the files on S3/Cloudfront in a content-addressable way (i.e., the file name is its hash). Using content-addressable filenames means cache canβt ever be stale. | |
In terms of client-side caching from this library, we base our caching on the objectsβ entity tag (`ETag`), which is an identifier of a specific version of a resource \[1\]\_. An objectβs ETag is: its git-sha1 if stored in git, or its sha256 if stored in git-lfs. | |
References: | |
* \[1\] [https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/ETag) | |
[](#huggingface_hub.snapshot_download)Download a snapshot of the repo | |
--------------------------------------------------------------------- | |
#### huggingface\_hub.snapshot\_download | |
[](#huggingface_hub.snapshot_download)[< source \>](https://github.com/huggingface/huggingface_hub/blob/v0.29.2/src/huggingface_hub/_snapshot_download.py#L20) | |
( repo\_id: strrepo\_type: typing.Optional\[str\] = Nonerevision: typing.Optional\[str\] = Nonecache\_dir: typing.Union\[str, pathlib.Path, NoneType\] = Nonelocal\_dir: typing.Union\[str, pathlib.Path, NoneType\] = Nonelibrary\_name: typing.Optional\[str\] = Nonelibrary\_version: typing.Optional\[str\] = Noneuser\_agent: typing.Union\[typing.Dict, str, NoneType\] = Noneproxies: typing.Optional\[typing.Dict\] = Noneetag\_timeout: float = 10force\_download: bool = Falsetoken: typing.Union\[bool, str, NoneType\] = Nonelocal\_files\_only: bool = Falseallow\_patterns: typing.Union\[typing.List\[str\], str, NoneType\] = Noneignore\_patterns: typing.Union\[typing.List\[str\], str, NoneType\] = Nonemax\_workers: int = 8tqdm\_class: typing.Optional\[tqdm.asyncio.tqdm\_asyncio\] = Noneheaders: typing.Optional\[typing.Dict\[str, str\]\] = Noneendpoint: typing.Optional\[str\] = Nonelocal\_dir\_use\_symlinks: typing.Union\[bool, typing.Literal\['auto'\]\] = 'auto'resume\_download: typing.Optional\[bool\] = None ) β export const metadata = 'undefined';`str` | |
Expand 18 parameters | |
Parameters | |
* [](#huggingface_hub.snapshot_download.repo_id)**repo\_id** (`str`) β A user or an organization name and a repo name separated by a `/`. | |
* [](#huggingface_hub.snapshot_download.repo_type)**repo\_type** (`str`, _optional_) β Set to `"dataset"` or `"space"` if downloading from a dataset or space, `None` or `"model"` if downloading from a model. Default is `None`. | |
* [](#huggingface_hub.snapshot_download.revision)**revision** (`str`, _optional_) β An optional Git revision id which can be a branch name, a tag, or a commit hash. | |
* [](#huggingface_hub.snapshot_download.cache_dir)**cache\_dir** (`str`, `Path`, _optional_) β Path to the folder where cached files are stored. | |
* [](#huggingface_hub.snapshot_download.local_dir)**local\_dir** (`str` or `Path`, _optional_) β If provided, the downloaded files will be placed under this directory. | |
* [](#huggingface_hub.snapshot_download.library_name)**library\_name** (`str`, _optional_) β The name of the library to which the object corresponds. | |
* [](#huggingface_hub.snapshot_download.library_version)**library\_version** (`str`, _optional_) β The version of the library. | |
* [](#huggingface_hub.snapshot_download.user_agent)**user\_agent** (`str`, `dict`, _optional_) β The user-agent info in the form of a dictionary or a string. | |
* [](#huggingface_hub.snapshot_download.proxies)**proxies** (`dict`, _optional_) β Dictionary mapping protocol to the URL of the proxy passed to `requests.request`. | |
* [](#huggingface_hub.snapshot_download.etag_timeout)**etag\_timeout** (`float`, _optional_, defaults to `10`) β When fetching ETag, how many seconds to wait for the server to send data before giving up which is passed to `requests.request`. | |
* [](#huggingface_hub.snapshot_download.force_download)**force\_download** (`bool`, _optional_, defaults to `False`) β Whether the file should be downloaded even if it already exists in the local cache. | |
* [](#huggingface_hub.snapshot_download.token)**token** (`str`, `bool`, _optional_) β A token to be used for the download. | |
* If `True`, the token is read from the HuggingFace config folder. | |
* If a string, itβs used as the authentication token. | |
* [](#huggingface_hub.snapshot_download.headers)**headers** (`dict`, _optional_) β Additional headers to include in the request. Those headers take precedence over the others. | |
* [](#huggingface_hub.snapshot_download.local_files_only)**local\_files\_only** (`bool`, _optional_, defaults to `False`) β If `True`, avoid downloading the file and return the path to the local cached file if it exists. | |
* [](#huggingface_hub.snapshot_download.allow_patterns)**allow\_patterns** (`List[str]` or `str`, _optional_) β If provided, only files matching at least one pattern are downloaded. | |
* [](#huggingface_hub.snapshot_download.ignore_patterns)**ignore\_patterns** (`List[str]` or `str`, _optional_) β If provided, files matching any of the patterns are not downloaded. | |
* [](#huggingface_hub.snapshot_download.max_workers)**max\_workers** (`int`, _optional_) β Number of concurrent threads to download files (1 thread = 1 file download). Defaults to 8. | |
* [](#huggingface_hub.snapshot_download.tqdm_class)**tqdm\_class** (`tqdm`, _optional_) β If provided, overwrites the default behavior for the progress bar. Passed argument must inherit from `tqdm.auto.tqdm` or at least mimic its behavior. Note that the `tqdm_class` is not passed to each individual download. Defaults to the custom HF progress bar that can be disabled by setting `HF_HUB_DISABLE_PROGRESS_BARS` environment variable. | |
Returns | |
export const metadata = 'undefined'; | |
`str` | |
export const metadata = 'undefined'; | |
folder path of the repo snapshot. | |
Raises | |
export const metadata = 'undefined'; | |
[RepositoryNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.RepositoryNotFoundError) or [RevisionNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.RevisionNotFoundError) or `EnvironmentError` or `OSError` or `ValueError` | |
export const metadata = 'undefined'; | |
* [RepositoryNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.RepositoryNotFoundError) β If the repository to download from cannot be found. This may be because it doesnβt exist, or because it is set to `private` and you do not have access. | |
* [RevisionNotFoundError](/docs/huggingface_hub/v0.29.2/en/package_reference/utilities#huggingface_hub.errors.RevisionNotFoundError) β If the revision to download from cannot be found. | |
* [`EnvironmentError`](https://docs.python.org/3/library/exceptions.html#EnvironmentError) β If `token=True` and the token cannot be found. | |
* [`OSError`](https://docs.python.org/3/library/exceptions.html#OSError) β if ETag cannot be determined. | |
* [`ValueError`](https://docs.python.org/3/library/exceptions.html#ValueError) β if some parameter value is invalid. | |
Download repo files. | |
Download a whole snapshot of a repoβs files at the specified revision. This is useful when you want all files from a repo, because you donβt know which ones you will need a priori. All files are nested inside a folder in order to keep their actual filename relative to that folder. You can also filter which files to download using `allow_patterns` and `ignore_patterns`. | |
If `local_dir` is provided, the file structure from the repo will be replicated in this location. When using this option, the `cache_dir` will not be used and a `.cache/huggingface/` folder will be created at the root of `local_dir` to store some metadata related to the downloaded files. While this mechanism is not as robust as the main cache-system, itβs optimized for regularly pulling the latest version of a repository. | |
An alternative would be to clone the repo but this requires git and git-lfs to be installed and properly configured. It is also not possible to filter which files to download when cloning a repository using git. | |
[](#get-metadata-about-a-file)Get metadata about a file | |
------------------------------------------------------- | |
### [](#huggingface_hub.get_hf_file_metadata)get\_hf\_file\_metadata | |
#### huggingface\_hub.get\_hf\_file\_metadata | |
[](#huggingface_hub.get_hf_file_metadata)[< source \>](https://github.com/huggingface/huggingface_hub/blob/v0.29.2/src/huggingface_hub/file_download.py#L1246) | |
( url: strtoken: typing.Union\[bool, str, NoneType\] = Noneproxies: typing.Optional\[typing.Dict\] = Nonetimeout: typing.Optional\[float\] = 10library\_name: typing.Optional\[str\] = Nonelibrary\_version: typing.Optional\[str\] = Noneuser\_agent: typing.Union\[typing.Dict, str, NoneType\] = Noneheaders: typing.Optional\[typing.Dict\[str, str\]\] = None ) | |
Parameters | |
* [](#huggingface_hub.get_hf_file_metadata.url)**url** (`str`) β File url, for example returned by [hf\_hub\_url()](/docs/huggingface_hub/v0.29.2/en/package_reference/file_download#huggingface_hub.hf_hub_url). | |
* [](#huggingface_hub.get_hf_file_metadata.token)**token** (`str` or `bool`, _optional_) β A token to be used for the download. | |
* If `True`, the token is read from the HuggingFace config folder. | |
* If `False` or `None`, no token is provided. | |
* If a string, itβs used as the authentication token. | |
* [](#huggingface_hub.get_hf_file_metadata.proxies)**proxies** (`dict`, _optional_) β Dictionary mapping protocol to the URL of the proxy passed to `requests.request`. | |
* [](#huggingface_hub.get_hf_file_metadata.timeout)**timeout** (`float`, _optional_, defaults to 10) β How many seconds to wait for the server to send metadata before giving up. | |
* [](#huggingface_hub.get_hf_file_metadata.library_name)**library\_name** (`str`, _optional_) β The name of the library to which the object corresponds. | |
* [](#huggingface_hub.get_hf_file_metadata.library_version)**library\_version** (`str`, _optional_) β The version of the library. | |
* [](#huggingface_hub.get_hf_file_metadata.user_agent)**user\_agent** (`dict`, `str`, _optional_) β The user-agent info in the form of a dictionary or a string. | |
* [](#huggingface_hub.get_hf_file_metadata.headers)**headers** (`dict`, _optional_) β Additional headers to be sent with the request. | |
Fetch metadata of a file versioned on the Hub for a given url. | |
### [](#huggingface_hub.HfFileMetadata)HfFileMetadata | |
### class huggingface\_hub.HfFileMetadata | |
[](#huggingface_hub.HfFileMetadata)[< source \>](https://github.com/huggingface/huggingface_hub/blob/v0.29.2/src/huggingface_hub/file_download.py#L147) | |
( commit\_hash: typing.Optional\[str\]etag: typing.Optional\[str\]location: strsize: typing.Optional\[int\] ) | |
Parameters | |
* [](#huggingface_hub.HfFileMetadata.commit_hash)**commit\_hash** (`str`, _optional_) β The commit\_hash related to the file. | |
* [](#huggingface_hub.HfFileMetadata.etag)**etag** (`str`, _optional_) β Etag of the file on the server. | |
* [](#huggingface_hub.HfFileMetadata.location)**location** (`str`) β Location where to download the file. Can be a Hub url or not (CDN). | |
* [](#huggingface_hub.HfFileMetadata.size)**size** (`size`) β Size of the file. In case of an LFS file, contains the size of the actual LFS file, not the pointer. | |
Data structure containing information about a file versioned on the Hub. | |
Returned by [get\_hf\_file\_metadata()](/docs/huggingface_hub/v0.29.2/en/package_reference/file_download#huggingface_hub.get_hf_file_metadata) based on a URL. | |
[](#caching)Caching | |
------------------- | |
The methods displayed above are designed to work with a caching system that prevents re-downloading files. The caching system was updated in v0.8.0 to become the central cache-system shared across libraries that depend on the Hub. | |
Read the [cache-system guide](../guides/manage-cache) for a detailed presentation of caching at at HF. | |
[< \> Update on GitHub](https://github.com/huggingface/huggingface_hub/blob/main/docs/source/en/package_reference/file_download.md) | |
HfApi Client | |
[βHugging Face Hub API](/docs/huggingface_hub/en/package_reference/hf_api) [Mixins & serialization methodsβ](/docs/huggingface_hub/en/package_reference/mixins) |