apple/DiffuCoder-7B-Base


metadata

license: unknown
base_model:
  - Qwen/Qwen2.5-Coder-7B
tags:
  - code
  - text-diffusion-model
  - diffusion large language model

DiffuCoder-7B-Base

The DiffuCoder-7B-Base model is our foundational masked diffusion LLM for code generation.

Training recipe: Adapted using DiffuLLaMA's adaptation approach and trained on a large corpus of code in two stages: 65B tokens in Stage 1 and 65B tokens in Stage 2.
Benchmarks: Strong baseline performance on HumanEval, MBPP and BigCodeBench.
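
To illustrate what "masked diffusion" decoding means in general, here is a minimal toy sketch. It is not DiffuCoder's implementation: `toy_denoiser` is a hypothetical stand-in for the model that simply reads the answer from a known target, and the confidence rule and per-step unmasking schedule are assumptions chosen for illustration. The point is only the overall loop: start from a fully masked completion, then reveal the most confident positions over a fixed number of steps.

```python
MASK = "<mask>"


def toy_denoiser(tokens, target):
    """Hypothetical stand-in for a model: returns {position: (token, confidence)}
    for every masked position. A real model would predict these from context."""
    preds = {}
    for i, tok in enumerate(tokens):
        if tok != MASK:
            continue
        # Illustrative rule: confidence rises with the number of revealed neighbors.
        neighbors = [tokens[j] for j in (i - 1, i + 1) if 0 <= j < len(tokens)]
        revealed = sum(t != MASK for t in neighbors)
        preds[i] = (target[i], 0.5 + 0.25 * revealed)
    return preds


def diffusion_decode(prompt, target, steps):
    """Iteratively unmask a completion; returns the final tokens and all intermediate states."""
    tokens = list(prompt) + [MASK] * (len(target) - len(prompt))
    history = [list(tokens)]
    for step in range(steps):
        preds = toy_denoiser(tokens, target)
        if not preds:
            break
        # Spread the remaining masked positions evenly over the remaining steps.
        remaining_steps = steps - step
        k = -(-len(preds) // remaining_steps)  # ceiling division
        # Reveal the k most confident positions (ties broken by position).
        best = sorted(preds, key=lambda i: (-preds[i][1], i))[:k]
        for i in best:
            tokens[i] = preds[i][0]
        history.append(list(tokens))
    return tokens, history


target = "def add ( a , b ) : return a + b".split()
final, history = diffusion_decode(target[:2], target, steps=4)
for state in history:
    print(" ".join(state))
```

Each printed line shows one denoising step, with fewer `<mask>` tokens than the line before it.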

More details and usage examples:

Acknowledgement

To power this HuggingFace model release, we reuse Dream's modeling architecture and generation utils.
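
Since the release reuses Dream's generation utilities, loading and sampling might look like the sketch below. This is an assumption-laden illustration, not a documented recipe from this card: it assumes the checkpoint exposes Dream's `diffusion_generate` method through `trust_remote_code=True`, that a CUDA GPU is available, and that the sampling values shown (`temperature`, `top_p`, `alg`, `alg_temp`, `steps`) are reasonable placeholders. The helper name `generate_completion` is hypothetical.

```python
def generate_completion(prompt, model_id="apple/DiffuCoder-7B-Base", max_new_tokens=256):
    """Sketch: complete a code prefix with a Dream-style diffusion LLM checkpoint."""
    # Local imports keep this module importable without the heavy dependencies.
    import torch
    from transformers import AutoModel, AutoTokenizer

    # The modeling code ships with the checkpoint, hence trust_remote_code=True.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    model = AutoModel.from_pretrained(
        model_id, torch_dtype=torch.bfloat16, trust_remote_code=True
    )
    model = model.to("cuda").eval()  # ASSUMPTION: a CUDA device is available

    inputs = tokenizer(prompt, return_tensors="pt")
    input_ids = inputs.input_ids.to("cuda")
    attention_mask = inputs.attention_mask.to("cuda")

    # ASSUMPTION: `diffusion_generate` and these arguments follow Dream's
    # generation utils; the values are placeholders, not tuned settings.
    output = model.diffusion_generate(
        input_ids,
        attention_mask=attention_mask,
        max_new_tokens=max_new_tokens,
        steps=max_new_tokens,  # one denoising step per generated token
        temperature=0.3,
        top_p=0.95,
        alg="entropy",
        alg_temp=0.0,
        return_dict_in_generate=True,
    )
    # Drop the prompt tokens and decode only the generated continuation.
    completion_ids = output.sequences[0][input_ids.shape[1]:]
    return tokenizer.decode(completion_ids, skip_special_tokens=True)
```

Because this is a base model rather than an instruction-tuned one, the prompt would typically be a plain code prefix, for example `generate_completion("def fibonacci(n):\n")`.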