Bhargav Solanki

solankibhargav

AI & ML interests

None yet

Recent Activity

reacted to m-ric's post with 👍 6 days ago

Hugging Face releases Picotron, a microscopic lib that solves LLM training 4D parallelization 🥳

🕰️ Llama-3.1-405B took 39 million GPU-hours to train, i.e. about 4.5 thousand years on a single GPU. 👴🏻 If they had needed all this time, we would have GPU stories from the time of Pharaoh 𓂀: "Alas, Lord of Two Lands, the shipment of counting-stones arriving from Cathay was lost to pirates, this shall delay the building of your computing temple by many moons"

🛠️ But instead, they just parallelized the training on 24k H100s, which made it take just a few months. This required parallelizing across 4 dimensions: data, tensor, context, pipeline. And it is infamously hard to do, making for bloated code repos that hold together only by magic.

🤏 But now we don't need huge repos anymore! Instead of building mega-training codebases, Hugging Face colleagues cooked in the other direction, towards tiny 4D parallelism libs. A team has built Nanotron, already widely used in industry. And now a team releases Picotron, a radical approach that codes 4D parallelism in just a few hundred lines of code, a real engineering feat that makes it much easier to understand what's actually happening!

⚡ It's tiny, yet powerful: measured in MFU (Model FLOPs Utilization, the fraction of the hardware's peak compute that the model actually uses), this lib reaches ~50% on the SmolLM-1.7B model with 8 H100 GPUs, which is really close to what huge libs would reach. (Caution: the team is running further benchmarks to verify this.)

Go take a look 👉 https://github.com/huggingface/picotron/tree/main/picotron
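The arithmetic behind the post's figures can be checked with a short sketch. It is a back-of-the-envelope illustration, not anything taken from Picotron. It assumes that "24k" means exactly 24,000 GPUs, that scaling is perfectly linear, and that MFU can be estimated with the common approximation of about 6 FLOPs per parameter per training token. The function names and the `peak_flops_per_gpu` parameter are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # 8760 hours, ignoring leap years


def serial_years(gpu_hours: float) -> float:
    """Years needed if a single GPU did all the work."""
    return gpu_hours / HOURS_PER_YEAR


def wall_clock_days(gpu_hours: float, num_gpus: int) -> float:
    """Days needed on num_gpus GPUs, assuming perfect linear scaling."""
    return gpu_hours / num_gpus / 24


def mfu(tokens_per_s: float, n_params: float, num_gpus: int,
        peak_flops_per_gpu: float) -> float:
    """Rough Model FLOPs Utilization estimate.

    Assumption: training costs about 6 FLOPs per parameter per token
    (forward plus backward pass), a common rule of thumb.
    """
    achieved_flops_per_s = 6 * n_params * tokens_per_s
    return achieved_flops_per_s / (num_gpus * peak_flops_per_gpu)


gpu_hours = 39e6   # figure quoted in the post
num_gpus = 24_000  # assumption: "24k" read as exactly 24,000

print(f"{serial_years(gpu_hours):.0f} years on one GPU")
print(f"{wall_clock_days(gpu_hours, num_gpus):.0f} days on {num_gpus} GPUs")
```

Under these idealized assumptions the single-GPU figure lands near the post's 4.5 thousand years, and the parallel run comes out at a bit over two months. Real runs take longer because utilization is well below 100%, which is exactly what MFU measures.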

new activity 23 days ago

llava-hf/vip-llava-13b-hf:Support for vllm/lmdeploy?

new activity 23 days ago

THUDM/glm-edge-v-5b:No support on vllm/lmdeploy

solankibhargav's activity

New activity in llava-hf/vip-llava-13b-hf 23 days ago

Support for vllm/lmdeploy?

#1 opened 23 days ago by

solankibhargav

New activity in THUDM/glm-edge-v-5b 23 days ago

No support on vllm/lmdeploy

#1 opened 23 days ago by

solankibhargav

New activity in google/gemma-2-27b-it 28 days ago

Can a MacBook Pro M3 Max with 48GB load gemma2-27b?

#23 opened 6 months ago by

omenlyd

New activity in stepfun-ai/GOT-OCR2_0 3 months ago

Bounding boxes in results

#9 opened 3 months ago by

solankibhargav

New activity in ICTNLP/Llama-3.1-8B-Omni 3 months ago

Support for other languages and 70b model?

#6 opened 3 months ago by

solankibhargav

New activity in OpenGVLab/InternVL2-Llama3-76B 4 months ago

Request for support on faster inference engine

#10 opened 4 months ago by

solankibhargav

device error when using 76B.

#9 opened 4 months ago by

puar-playground

New activity in AbacusResearch/Jallabi-34B 4 months ago

Adding Evaluation Results

#2 opened 5 months ago by

leaderboard-pr-bot

New activity in cognitivecomputations/dolphin-vision-72b 6 months ago

Steps to fine tune?

#5 opened 6 months ago by

solankibhargav

New activity in meta-llama/Meta-Llama-3-70B-Instruct 6 months ago

Please add <|eot_id|> as a stop token to the HF config

#12 opened 8 months ago by

omarkilani

New activity in mistralai/Codestral-22B-v0.1 7 months ago

How to load on a multi-GPU instance?

#19 opened 7 months ago by

aastha6

New activity in qresearch/llama-3-vision-alpha 8 months ago

Can you share the steps of how mm_projecter was trained?

#5 opened 8 months ago by

solankibhargav

New activity in AbacusResearch/haLLawa4-7b 10 months ago

Adding Evaluation Results

#1 opened 10 months ago by

leaderboard-pr-bot

New activity in AbacusResearch/jaLLAbi 10 months ago

Adding Evaluation Results

#1 opened 10 months ago by

leaderboard-pr-bot

New activity in AbacusResearch/jaLLAbi2-7b 10 months ago

Adding Evaluation Results

#1 opened 10 months ago by

leaderboard-pr-bot

New activity in AbacusResearch/haLLAwa3 10 months ago

Adding Evaluation Results

#1 opened 10 months ago by

leaderboard-pr-bot

New activity in AbacusResearch/Jallabi-34B 10 months ago

Adding Evaluation Results

#1 opened 10 months ago by

leaderboard-pr-bot

New activity in liuhaotian/llava-v1.6-34b-tokenizer 10 months ago

When I use this for the tokenizer and AutoProcessor, I run into a CUDA error in the transformers library.

#1 opened 10 months ago by

solankibhargav

New activity in AbacusResearch/haLLAwa2 11 months ago

Adding Evaluation Results

#1 opened 11 months ago by

leaderboard-pr-bot