Update README.md
README.md CHANGED
@@ -33,7 +33,7 @@ tags:
 StarCoder2-15B model is a 15B parameter model trained on 600+ programming languages from [The Stack v2](https://huggingface.co/datasets/bigcode/the-stack-v2-train), with opt-out requests excluded. The model uses [Grouped Query Attention](https://arxiv.org/abs/2305.13245), [a context window of 16,384 tokens](https://arxiv.org/abs/2205.14135) with [a sliding window attention of 4,096 tokens](https://arxiv.org/abs/2004.05150v2), and was trained using the [Fill-in-the-Middle objective](https://arxiv.org/abs/2207.14255) on 4+ trillion tokens.

 - **Project Website:** [bigcode-project.org](https://www.bigcode-project.org)
-- **Paper:**
+- **Paper:** [Link](https://huggingface.co/datasets/bigcode/the-stack-v2/)
 - **Point of Contact:** [[email protected]](mailto:[email protected])
 - **Languages:** 600+ Programming languages

@@ -148,4 +148,4 @@ The model is licensed under the BigCode OpenRAIL-M v1 license agreement. You can

 # Citation

-
+_Coming soon_
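
Since the description above centers on the Fill-in-the-Middle training objective, a minimal, illustrative prompting sketch may help (it is not part of this commit). It assumes the `bigcode/starcoder2-15b` Hub checkpoint id and the StarCoder-family FiM sentinel tokens (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`); confirm both against the model card before relying on them.

```python
# Minimal FiM prompting sketch for StarCoder2-15B.
# Assumptions (not stated in this diff): the Hub id "bigcode/starcoder2-15b"
# and the StarCoder-family FiM sentinel tokens.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-15b"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")

# FiM supplies the code before and after a hole; the model generates the
# missing middle after the <fim_middle> sentinel.
prompt = (
    "<fim_prefix>def print_hello_world():\n    <fim_suffix>\n\n"
    "print_hello_world()<fim_middle>"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```

The decoded output repeats the prompt plus the generated middle, so strip the sentinel tokens before splicing the completion back into the source file.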