apple
/

DiffuCoder-7B-Instruct

text-diffusion-model

diffusion large language model


Create README.md

#1

by Sansa - opened 24 days ago

base: refs/heads/main ← from: refs/pr/1

README.md +26 -0

README.md ADDED

	@@ -0,0 +1,26 @@

+---
+base_model:
+- apple/DiffuCoder-7B-Base
+tags:
+- code
+- text-diffusion-model
+- diffusion large language model
+license: unknown
+---
+### DiffuCoder-7B-Instruct
+The DiffuCoder-7B-Instruct model builds on the DiffuCoder-7B-Base checkpoint with instruction-tuning to better follow code-related prompts.
+- Training recipe: using a newly introduced pad token, we train this model with fixed-length, conditional training on [OpenCoder-SFT](https://huggingface.co/datasets/OpenCoder-LLM/opc-sft-stage2) data for 5 epochs.
+- Benchmarks: shows stronger instruction-following capability than the Base model.
+#### More details and usage examples:
+- Paper: [DiffuCoder: Understanding and Improving Masked Diffusion Models for Code Generation](https://arxiv.org/abs/2506.20639)
+- GitHub: https://github.com/apple/ml-diffucoder
+#### Acknowledgement
+To power this Hugging Face model release, we reuse the modeling architecture and generation utilities from [Dream](https://huggingface.co/Dream-org/Dream-v0-Base-7B).
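
The training recipe in the README mentions a pad token and fixed-length conditional training but gives no detail. The following is a minimal sketch of what such data preparation might look like, not the authors' implementation. The function name `build_fixed_length_example`, the pad id `0`, and the choice to include padded positions in the loss are all assumptions made for illustration; the paper and the GitHub repository are the authoritative sources.

```python
from typing import List, Tuple

PAD_ID = 0  # hypothetical pad token id, chosen only for this sketch


def build_fixed_length_example(
    prompt_ids: List[int],
    response_ids: List[int],
    max_len: int,
    pad_id: int = PAD_ID,
) -> Tuple[List[int], List[int]]:
    """Pad prompt + response to exactly max_len tokens.

    Returns (input_ids, loss_mask). loss_mask is 0 on prompt positions
    (the model is conditioned on them) and 1 on response and pad
    positions. Including pad positions in the loss is an assumption.
    """
    if len(prompt_ids) >= max_len:
        raise ValueError("prompt leaves no room for a response")

    # Truncate the response so the sequence never exceeds max_len.
    response = response_ids[: max_len - len(prompt_ids)]
    n_pad = max_len - len(prompt_ids) - len(response)

    input_ids = prompt_ids + response + [pad_id] * n_pad
    loss_mask = [0] * len(prompt_ids) + [1] * (max_len - len(prompt_ids))
    return input_ids, loss_mask


ids, mask = build_fixed_length_example([5, 6], [7, 8, 9], max_len=8)
print(ids)   # [5, 6, 7, 8, 9, 0, 0, 0]
print(mask)  # [0, 0, 1, 1, 1, 1, 1, 1]
```

In this sketch every example has the same length, so batches need no dynamic padding, and the prompt tokens are never scored, which is one common reading of "conditional" training.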