lvwerra, eliebak committed
Commit 6fa4a17 · verified · 1 parent: 5ae5113

add ressources 1 (#32)


- add ressources 1 (31d2d7654c86abf91f87a59a26e5ade4a44a1825)


Co-authored-by: Elie Bakouch <[email protected]>

Files changed (1): src/index.html (+51 −1)
src/index.html CHANGED

@@ -2464,6 +2464,11 @@
   <a href="https://arxiv.org/abs/2312.11805"><strong>Gemini</strong></a>
   <p>Presents Google's multimodal model architecture capable of processing text, images, audio, and video inputs.</p>
   </div>
+
+ <div>
+ <a href="https://arxiv.org/abs/2407.21783"><strong>Llama 3</strong></a>
+ <p>The Llama 3 Herd of Models</p>
+ </div>
 
   <div>
   <a href="https://arxiv.org/abs/2412.19437v1"><strong>DeepSeek-V3</strong></a>
@@ -2472,7 +2477,6 @@
 
 
   <h3>Training Frameworks</h3>
-
   <div>
   <a href="https://github.com/facebookresearch/fairscale/tree/main"><strong>FairScale</strong></a>
   <p>PyTorch extension library for large-scale training, offering various parallelism and optimization techniques.</p>
@@ -2525,6 +2529,11 @@
   <p>Comprehensive guide to understanding and optimizing GPU memory usage in PyTorch.</p>
   </div>
 
+ <div>
+ <a href="https://huggingface.co/blog/train_memory"><strong>Memory profiling walkthrough on a simple example</strong></a>
+ <p>Visualize and understand GPU memory in PyTorch.</p>
+ </div>
+
   <div>
   <a href="https://pytorch.org/tutorials/intermediate/tensorboard_profiler_tutorial.html"><strong>TensorBoard Profiler Tutorial</strong></a>
   <p>Guide to using TensorBoard's profiling tools for PyTorch models.</p>
@@ -2586,6 +2595,11 @@
   <a href="https://arxiv.org/abs/1710.03740"><strong>Mixed precision training</strong></a>
   <p>Introduces mixed precision training techniques for deep learning models.</p>
   </div>
+
+ <div>
+ <a href="https://main-horse.github.io/posts/visualizing-6d/"><strong>@main_horse blog</strong></a>
+ <p>Visualizing 6D Mesh Parallelism</p>
+ </div>
 
   <h3>Hardware</h3>
 
@@ -2603,6 +2617,11 @@
   <a href="https://www.semianalysis.com/p/100000-h100-clusters-power-network"><strong>Semianalysis - 100k H100 cluster</strong></a>
   <p>Analysis of large-scale H100 GPU clusters and their implications for AI infrastructure.</p>
   </div>
+
+ <div>
+ <a href="https://modal.com/gpu-glossary/readme"><strong>Modal GPU Glossary</strong></a>
+ <p>CUDA docs for humans</p>
+ </div>
 
   <h3>Others</h3>
 
@@ -2630,6 +2649,37 @@
   <a href="https://www.harmdevries.com/post/context-length/"><strong>Harm's blog for long context</strong></a>
   <p>Investigation into long context training in terms of data and training cost.</p>
   </div>
+
+ <div>
+ <a href="https://www.youtube.com/@GPUMODE/videos"><strong>GPU Mode</strong></a>
+ <p>A GPU reading group and community.</p>
+ </div>
+
+ <div>
+ <a href="https://youtube.com/playlist?list=PLvtrkEledFjqOLuDB_9FWL3dgivYqc6-3&si=fKWPotx8BflLAUkf"><strong>EleutherAI YouTube channel</strong></a>
+ <p>ML Scalability & Performance Reading Group</p>
+ </div>
+
+ <div>
+ <a href="https://jax-ml.github.io/scaling-book/"><strong>Google JAX scaling book</strong></a>
+ <p>How to Scale Your Model</p>
+ </div>
+
+ <div>
+ <a href="https://github.com/facebookresearch/capi/blob/main/fsdp.py"><strong>@fvsmassa & @TimDarcet FSDP</strong></a>
+ <p>Standalone ~500 LoC FSDP implementation</p>
+ </div>
+
+ <div>
+ <a href="https://www.thonking.ai/"><strong>thonking.ai</strong></a>
+ <p>Some of Horace He's blog posts</p>
+ </div>
+
+ <div>
+ <a href="https://gordicaleksa.medium.com/eli5-flash-attention-5c44017022ad"><strong>Aleksa's ELI5 Flash Attention</strong></a>
+ <p>Easy explanation of Flash Attention</p>
+ </div>
+
 
   <h2>Appendix</h2>
 