updating issue
Hi John, with some profiles containing over 500 LoRAs/models, I find it challenging to download or update them because the one-hour limit is triggered too quickly. Typically it downloads around 150–200 files before the limit hits; when I try again an hour later, it appears to download and overwrite everything from the start, so the limit is triggered again at around 250–300 files. In effect, updating a profile proceeds at only about 50 new files per hour.
Is there a way to check the already downloaded files and skip them, so that only the missing files are downloaded instead of everything being redownloaded? I have a profile with over 1,000 LoRAs that I've been updating via the monthly tab since it had only 100, and keeping profiles with this much content up to date is difficult.
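Even a simple filename-level check like this sketch would help (the file list and directory layout here are just placeholders), though it would of course miss files that changed upstream:
import os

def missing_files(filenames, download_dir):
    # Keep only the files not yet present locally; comparing hashes instead
    # of names would additionally catch files that were updated upstream.
    return [f for f in filenames if not os.path.exists(os.path.join(download_dir, f))]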
I'll give it a try. Since the Hugging Face API is well developed, it's easy to check the hash in advance; the hard part was obtaining the SHA256 hash from Civitai. I managed to get that working today and plan to work on the rest tomorrow.
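For reference, this is roughly what I mean by checking the hash in advance on the Hugging Face side; a minimal sketch using huggingface_hub (the repo id and filename are placeholders, and recent versions of the library expose the digest as .lfs.sha256):
from huggingface_hub import HfApi

def get_hf_sha256(repo_id: str, filename: str, repo_type: str = "model"):
    # files_metadata=True fills in the LFS metadata (including the SHA256
    # digest) for every file in the repo listing.
    info = HfApi().repo_info(repo_id, repo_type=repo_type, files_metadata=True)
    for sibling in info.siblings:
        if sibling.rfilename == filename and sibling.lfs is not None:
            return sibling.lfs.sha256
    return None

print(get_hf_sha256("some-user/some-repo", "model.safetensors"))  # placeholder repo
And here is what I have so far for the Civitai part: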
import re
import urllib.parse

import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

def get_user_agent():
    return 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:127.0) Gecko/20100101 Firefox/127.0'

def get_civitai_sha256(dl_url: str, api_key=""):
    """Resolve a Civitai download URL to the SHA256 of the file it points at."""

    def is_invalid_file(qs: dict, file: dict, k: str):
        # The download URL's query params (type/format/size/fp) pick one of
        # the version's files; "type" sits at the top level of the file dict,
        # the others under "metadata".
        if k not in qs:
            return False
        value = file.get(k, (file.get("metadata") or {}).get(k))
        return value is not None and qs[k][0] != value

    if "https://civitai.com/api/download/models/" not in dl_url:
        return None
    headers = {'User-Agent': get_user_agent(), 'content-type': 'application/json'}
    if api_key:
        headers['Authorization'] = f'Bearer {api_key}'
    session = requests.Session()
    retries = Retry(total=5, backoff_factor=1, status_forcelist=[500, 502, 503, 504])
    session.mount("https://", HTTPAdapter(max_retries=retries))
    m = re.match(r'https://civitai\.com/api/download/models/(\d+)\??(.+)?', dl_url)
    if m is None:
        return None
    url = 'https://civitai.com/api/v1/model-versions/' + m.group(1)
    qs = urllib.parse.parse_qs(m.group(2) or "")  # group(2) is None when there is no query string
    if "type" not in qs:
        qs["type"] = ["Model"]  # the download endpoint defaults to the model file itself
    try:
        r = session.get(url, headers=headers, timeout=(5.0, 15))
        if not r.ok:
            return None
        data = r.json()
        if not isinstance(data.get("files"), list):
            return None
        for file in data["files"]:
            if any(is_invalid_file(qs, file, k) for k in ("type", "format", "size", "fp")):
                continue
            sha256 = file.get("hashes", {}).get("SHA256")
            if sha256:
                return sha256.lower()
        return None
    except Exception as e:
        print(e)
        return None

print(get_civitai_sha256("https://civitai.com/api/download/models/1335639"))
print(get_civitai_sha256("https://civitai.com/api/download/models/1335639?type=Model&format=SafeTensor"))
print(get_civitai_sha256("https://civitai.com/api/download/models/1335639?type=Training%20Data"))
I found a troublesome bug related to Transformers and Spaces, so I'll investigate that first.