Update README.md
README.md CHANGED
@@ -1,6 +1,6 @@
 # 🚀 Falcon2-11B
 
-**Falcon2-11B is an 11B-parameter causal decoder-only model built by [TII](https://www.tii.ae) and trained on over 5,000B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. The model is made available under the
+**Falcon2-11B is an 11B-parameter causal decoder-only model built by [TII](https://www.tii.ae) and trained on over 5,000B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. The model is made available under the [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html), a permissive Apache 2.0-based software license which includes an [acceptable use policy](https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html) that promotes the responsible use of AI.**
 
 *Paper coming soon 😊.*
 
@@ -49,7 +49,7 @@ For fast inference with Falcon, check out [Text Generation Inference](https://gi
 - **Developed by:** [https://www.tii.ae](https://www.tii.ae)
 - **Model type:** Causal decoder-only
 - **Language(s) (NLP):** English, German, Spanish, French, Italian, Portuguese, Polish, Dutch, Romanian, Czech, Swedish
-- **License:**
+- **License:** [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html)
 
 ### Model Source
 
@@ -190,7 +190,7 @@ Falcon2-11B was trained on AWS SageMaker, using on average 1024 A100 40GB GPUs i
 
 #### Software
 
-Falcon2-11B was trained on a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO, high-performance Triton kernels, and FlashAttention-2.
+Falcon2-11B was trained on a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO, high-performance Triton kernels, and FlashAttention-2. More details about the distributed training strategy can be found in [Almazrouei et al.](https://arxiv.org/abs/2311.16867).
 
 ## Citation
 
@@ -198,7 +198,7 @@ Falcon2-11B was trained on a custom distributed training codebase, Gigatron. It use
 
 ## License
 
-Falcon2-11B is licensed under
+Falcon2-11B is licensed under the [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html), a permissive Apache 2.0-based software license which includes an [acceptable use policy](https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html) that promotes the responsible use of AI.
 
 ## Contact
 
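For readers who want to try the model this card describes, a minimal inference sketch with Hugging Face `transformers` follows. The checkpoint id `tiiuae/falcon-11B`, the dtype, and the generation settings are assumptions for illustration, not something this commit specifies.

```python
# Minimal inference sketch. Assumption: the card's checkpoint is published as
# "tiiuae/falcon-11B"; substitute the actual repository id if it differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-11B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 roughly halves memory vs fp32 for 11B params
    device_map="auto",           # let accelerate place layers across visible GPUs
    # attn_implementation="flash_attention_2",  # optional; requires the flash-attn package
)

inputs = tokenizer("The Falcon series of language models", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

`device_map="auto"` is usually necessary at this size, since an 11B-parameter model in bf16 needs roughly 22 GB for the weights alone.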
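The Software section's mention of 3D parallelism combined with ZeRO can be made concrete with a small sketch: the GPU pool is factored into tensor-parallel (TP), pipeline-parallel (PP), and data-parallel (DP) degrees whose product equals the world size, with ZeRO sharding optimizer state across the DP replicas. The TP and PP degrees below are hypothetical; this commit does not state Gigatron's actual layout.

```python
# Illustrative 3D-parallel layout over the 1024 GPUs mentioned in the card.
# TP and PP degrees are hypothetical, chosen only to make the arithmetic concrete.
WORLD_SIZE = 1024
TP = 8                        # assumed tensor-parallel degree (typically intra-node)
PP = 4                        # assumed pipeline-parallel degree
DP = WORLD_SIZE // (TP * PP)  # remaining factor is the ZeRO data-parallel degree -> 32
assert DP * TP * PP == WORLD_SIZE

def coords(rank: int) -> tuple[int, int, int]:
    """Map a flat GPU rank to (dp, pp, tp) coordinates, TP fastest-varying."""
    tp = rank % TP
    pp = (rank // TP) % PP
    dp = rank // (TP * PP)
    return dp, pp, tp

print(coords(0))   # (0, 0, 0)
print(coords(37))  # (1, 0, 5): second data-parallel replica, first pipeline stage
```

Keeping TP fastest-varying places tensor-parallel peers on adjacent ranks, which typically maps them onto the same node, where NVLink bandwidth is highest.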