Added Updates section
README.md
[Figure: KernelBench-Triton Level 1 performance comparison]
*On KernelBench-Triton Level 1, our 8B parameter model exceeds models such as GPT-4o and DeepSeek V3 in single-shot performance. With multiple inferences, KernelLLM's performance outperforms DeepSeek R1. This is all from a model with two orders of magnitude fewer parameters than its competitors.*
## _Updates_:
* 2025/06/25: We added an [end-to-end example walkthrough](https://huggingface.co/facebook/KernelLLM/discussions/5#685b0903b3d048882566b17b) that shows how to format a community-provided prompt so that KernelLLM performs well. We have received many questions about how to format prompts for best results, and we hope this helps! A rough sketch follows this list.
* 2025/06/15: We would like to thank the community for creating [multiple](https://huggingface.co/bartowski/facebook_KernelLLM-GGUF) [different](https://huggingface.co/unsloth/KernelLLM-GGUF) [quantizations](https://huggingface.co/unsloth/KernelLLM) and for a total of more than 20k downloads!
* 2025/06/03: The startup mako.dev has integrated KernelLLM into their [GPU performance engineering platform](https://generate.mako.dev/)!
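
The walkthrough linked in the 2025/06/25 note spells out the exact prompt format. As a rough, hypothetical illustration of the general shape only (an instruction preamble followed by the PyTorch source to translate; the wrapper text and helper names below are ours, not the official template):

```python
# Hypothetical example module to translate (illustration only).
PYTORCH_SOURCE = '''
import torch

class Model(torch.nn.Module):
    def forward(self, x, y):
        return x + y
'''


def format_prompt(pytorch_source: str) -> str:
    # Assumed instruction preamble -- substitute the template from the
    # linked walkthrough for best results.
    return (
        "Rewrite the following PyTorch module so that its forward pass is "
        "implemented with a custom Triton kernel. Reply with the complete "
        "Python code only.\n\n" + pytorch_source
    )


print(format_prompt(PYTORCH_SOURCE))
```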
## Making Kernel Development more accessible with KernelLLM
We introduce KernelLLM, a large language model based on Llama 3.1 Instruct, which has been trained specifically for the task of authoring GPU kernels using Triton. KernelLLM translates PyTorch modules into Triton kernels and was evaluated on KernelBench-Triton (see [here](https://github.com/ScalingIntelligence/KernelBench/pull/35)).
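
Since this card lists `transformers` as the library, a minimal generation sketch might look like the following (the checkpoint name `facebook/KernelLLM` comes from this card; the dtype, device placement, and sampling settings are illustrative assumptions, and `format_prompt`/`PYTORCH_SOURCE` are the hypothetical helpers sketched above):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Checkpoint name from this model card; the settings below are
# illustrative assumptions, not tuned recommendations.
model_id = "facebook/KernelLLM"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

prompt = format_prompt(PYTORCH_SOURCE)  # hypothetical helper from the sketch above
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=1024,
    do_sample=True,
    temperature=0.6,
    top_p=0.95,
)
# Decode only the newly generated tokens (the candidate Triton kernel).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```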
[...]

```
@software{kernelllm2025,
  title={KernelLLM: Making Kernel Development More Accessible},
  author={Fisches, Zacharias and Paliskara, Sahan and Guo, Simon and Zhang, Alex and Spisak, Joe and Cummins, Chris and Leather, Hugh and Isaacson, Joe and Markosyan, Aram and Saroufim, Mark},
  year={2025},
  month={5},