Update README.md
README.md CHANGED
@@ -33,7 +33,7 @@ tags:
 StarCoder2-15B model is a 15B parameter model trained on 600+ programming languages from [The Stack v2](https://huggingface.co/datasets/bigcode/the-stack-v2-train), with opt-out requests excluded. The model uses [Grouped Query Attention](https://arxiv.org/abs/2305.13245), [a context window of 16,384 tokens](https://arxiv.org/abs/2205.14135) with [a sliding window attention of 4,096 tokens](https://arxiv.org/abs/2004.05150v2), and was trained using the [Fill-in-the-Middle objective](https://arxiv.org/abs/2207.14255) on 4+ trillion tokens.

 - **Project Website:** [bigcode-project.org](https://www.bigcode-project.org)
-- **Paper:**
+- **Paper:** [Link](https://huggingface.co/datasets/bigcode/the-stack-v2/)
 - **Point of Contact:** [[email protected]](mailto:[email protected])
 - **Languages:** 600+ Programming languages

@@ -148,4 +148,4 @@ The model is licensed under the BigCode OpenRAIL-M v1 license agreement. You can

 # Citation

-
+_Coming soon_
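
Since the description above centers on the Fill-in-the-Middle training objective, a minimal, illustrative prompting sketch may help (it is not part of this commit). It assumes the `bigcode/starcoder2-15b` Hub checkpoint id and the StarCoder-family FiM sentinel tokens (`<fim_prefix>`, `<fim_suffix>`, `<fim_middle>`); confirm both against the model card before relying on them.

```python
# Minimal FiM prompting sketch for StarCoder2-15B.
# Assumptions (not stated in this diff): the Hub id "bigcode/starcoder2-15b"
# and the StarCoder-family FiM sentinel tokens.
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigcode/starcoder2-15b"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, device_map="auto")

# FiM supplies the code before and after a hole; the model generates the
# missing middle after the <fim_middle> sentinel.
prompt = (
    "<fim_prefix>def print_hello_world():\n    <fim_suffix>\n\n"
    "print_hello_world()<fim_middle>"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0]))
```

The decoded output repeats the prompt plus the generated middle, so strip the sentinel tokens before splicing the completion back into the source file.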