Rajat Arya


AI & ML interests

None yet

Recent Activity



Organizations: Hugging Face, Xet Team

rajatarya's activity

replied to jsulz's post 7 days ago

Let's go! Get on the waitlist - can't wait to get you onboarded with Xet. From a customer onboarding last week, "Yeah, it's pretty seamless.... oh wait, that was fast."

reacted to jsulz's post with ❤️🚀 7 days ago
It's finally here ❤️

Build faster than ever with lightning fast upload and download speeds starting today on the Hub ⚡

Xet storage is rolling out access across the Hub - join the waitlist here https://huggingface.co/join/xet

You can apply for yourself, or your entire organization. Head over to your account settings for more information or join anywhere you see the Xet logo on a repository you know.

Have questions? Join the conversation below 👇 or open a discussion on the Xet team page xet-team/README
upvoted an article about 1 month ago

From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub

reacted to jsulz's post with 👀👍🔥 3 months ago
Doing a lot of benchmarking and visualization work, which means I'm always searching for interesting repos in terms of file types, size, branches, and overall structure.

To help, I built a Space jsulz/repo-info that lets you search for any repo and get back:

- Treemap of the repository, color coded by file/directory size
- Repo branches and their size
- Cumulative size of different file types (e.g., the total size of all the safetensors in the repo)

And because I'm interested in how this will fit in our work to leverage content-defined chunking for versioning repos on the Hub (https://huggingface.co/blog/from-files-to-chunks), everything has the number of chunks (1 chunk = 64KB) as well as the total size in bytes.
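As a rough illustration of the numbers described above, here is a minimal Python sketch that totals sizes per file type and converts each total into a chunk count. It is not the code behind jsulz/repo-info: the helper names, the sample file list, the reading of "64KB" as 65,536 bytes, and the use of simple ceiling division are all assumptions (content-defined chunking produces variable-size chunks, so a real count would likely differ).

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Assumption: "1 chunk = 64KB" is read as 64 * 1024 bytes.
CHUNK_SIZE_BYTES = 64 * 1024


def estimate_chunks(size_bytes: int) -> int:
    """Estimate a chunk count with integer ceiling division (hypothetical helper)."""
    return (size_bytes + CHUNK_SIZE_BYTES - 1) // CHUNK_SIZE_BYTES


def size_by_file_type(files):
    """Sum sizes in bytes per file extension (hypothetical helper).

    `files` is an iterable of (path, size_in_bytes) pairs.
    """
    totals = defaultdict(int)
    for path, size in files:
        ext = PurePosixPath(path).suffix or "(no extension)"
        totals[ext] += size
    return dict(totals)


# Made-up file listing, for illustration only.
files = [
    ("model-00001.safetensors", 5_000_000),
    ("model-00002.safetensors", 3_000_000),
    ("README.md", 4_096),
]

totals = size_by_file_type(files)
# Map each extension to (total bytes, estimated chunks).
summary = {ext: (total, estimate_chunks(total)) for ext, total in totals.items()}

for ext, (total, chunks) in sorted(summary.items()):
    print(f"{ext}: {total} bytes, ~{chunks} chunks")
```

In practice the file listing would come from the repository itself rather than a hard-coded list; the sketch only shows the arithmetic.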

Some of the treemaps are pretty cool. Attached are black-forest-labs/FLUX.1-dev and for fun laion/laion-audio-preview (which has nearly 10k .tar files 🤯)

replied to cfahlgren1's post 4 months ago

Also - easily reference your datasets in your research, giving conference committees greater confidence in the reproducibility of your results (e.g., share the dataset, the model, the paper, and the analysis - all on the Hub).

Research needs reproducibility, use HF Hub for collaboration & dissemination:

upvoted an article 5 months ago

Improving Parquet Dedupe on Hugging Face Hub

upvoted 2 articles 7 months ago

The 5 Most Under-Rated Tools on Hugging Face


XetHub is joining Hugging Face!