AK

Ol9

AI & ML interests

None yet

Recent Activity

updated a model 15 days ago
Ol9/lora_model
updated a model 15 days ago
Ol9/absa_stf_v4
updated a model 15 days ago
Ol9/ABSA_16bit

Organizations

None yet

Ol9's activity

reacted to artnitolog's post with 👍 about 1 month ago
Today we are introducing YaFSDP, Yandex’s tool for efficient distributed LLM training. YaFSDP can be used together with Hugging Face workflows and is up to 25% faster than FSDP.

Learn more here: https://github.com/yandex/YaFSDP
reacted to artnitolog's post with 🤗🤝🚀🔥❤️👍 about 1 month ago
Recently, we open-sourced YaFSDP, Yandex’s tool for efficient distributed training of LLMs.

Here are some of the key ideas YaFSDP uses to deliver speedups and memory savings over FSDP:
• Allocate and reuse just two buffers throughout the transformer for all gathered weights, bypassing the PyTorch memory allocator;
• Gather small normalization layers at the beginning of the iteration and average the gradients only at the end;
• Move gradient division to the very end of the backward pass.
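
The last idea rests on simple arithmetic: averaging gradients over N workers gives the same result whether each contribution is divided by N before it is accumulated or the accumulated sum is divided once at the end. Below is a minimal sketch of that equivalence in plain Python. It is not YaFSDP's code; the function names and the toy gradients are hypothetical, and `Fraction` is used only so the comparison is exact (in floating point, the two orderings can differ by rounding).

```python
from fractions import Fraction


def average_divide_each_step(per_worker_grads):
    """Divide every worker's contribution by N before accumulating it."""
    n = len(per_worker_grads)
    total = [Fraction(0)] * len(per_worker_grads[0])
    for grads in per_worker_grads:
        # One division per element per worker.
        total = [t + Fraction(g) / n for t, g in zip(total, grads)]
    return total


def average_divide_at_end(per_worker_grads):
    """Accumulate raw sums and divide by N once, after all contributions."""
    n = len(per_worker_grads)
    total = [Fraction(0)] * len(per_worker_grads[0])
    for grads in per_worker_grads:
        # Accumulation only; no division inside the loop.
        total = [t + Fraction(g) for t, g in zip(total, grads)]
    # A single division per element at the very end.
    return [t / n for t in total]


# Hypothetical gradients from four workers for a three-element parameter.
toy_grads = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [0, 0, 0]]
eager = average_divide_each_step(toy_grads)
deferred = average_divide_at_end(toy_grads)
```

Both functions return the same averages, but the deferred version performs one division per element instead of one per element per worker, which is the kind of saving the bullet above points at.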

To learn more about how YaFSDP works, check out our latest blog post: https://medium.com/yandex/yafsdp-a-tool-for-faster-llm-training-and-optimized-gpu-utilization-is-no-632b7539f5b3