Update README.md
README.md CHANGED
@@ -1,6 +1,6 @@
 # 🚀 Falcon2-11B
 
-**Falcon2-11B is an 11B-parameter causal decoder-only model built by [TII](https://www.tii.ae) and trained on over 5,000B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. The model is made available under the
+**Falcon2-11B is an 11B-parameter causal decoder-only model built by [TII](https://www.tii.ae) and trained on over 5,000B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. The model is made available under the [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html), a permissive Apache 2.0-based software license which includes an [acceptable use policy](https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html) that promotes the responsible use of AI.**
 
 *Paper coming soon 😊.*
 
@@ -49,7 +49,7 @@ For fast inference with Falcon, check out [Text Generation Inference](https://gi
 - **Developed by:** [https://www.tii.ae](https://www.tii.ae)
 - **Model type:** Causal decoder-only
 - **Language(s) (NLP):** English, German, Spanish, French, Italian, Portuguese, Polish, Dutch, Romanian, Czech, Swedish
-- **License:**
+- **License:** [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html)
 
 ### Model Source
 
@@ -190,7 +190,7 @@ Falcon2-11B was trained on AWS SageMaker, using on average 1024 A100 40GB GPUs i
 
 #### Software
 
-Falcon2-11B was trained on a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO, high-performance Triton kernels, and FlashAttention-2.
+Falcon2-11B was trained on a custom distributed training codebase, Gigatron. It uses a 3D parallelism approach combined with ZeRO, high-performance Triton kernels, and FlashAttention-2. More details about the distributed training strategy can be found in [Almazrouei et al.](https://arxiv.org/abs/2311.16867).
 
 ## Citation
 
@@ -198,7 +198,7 @@ Falcon2-11B was trained on a custom distributed training codebase, Gigatron. It use
 
 ## License
 
-Falcon2-11B is licensed under
+Falcon2-11B is licensed under the [TII Falcon License 2.0](https://falconllm-staging.tii.ae/falcon-2-terms-and-conditions.html), a permissive Apache 2.0-based software license which includes an [acceptable use policy](https://falconllm-staging.tii.ae/falcon-2-acceptable-use-policy.html) that promotes the responsible use of AI.
 
 ## Contact
 
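For readers who want to try the model this card describes, a minimal inference sketch with Hugging Face `transformers` follows. The checkpoint id `tiiuae/falcon-11B`, the dtype, and the generation settings are assumptions for illustration, not something this commit specifies.

```python
# Minimal inference sketch. Assumption: the card's checkpoint is published as
# "tiiuae/falcon-11B"; substitute the actual repository id if it differs.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-11B"  # assumed repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16 roughly halves memory vs fp32 for 11B params
    device_map="auto",           # let accelerate place layers across visible GPUs
    # attn_implementation="flash_attention_2",  # optional; requires the flash-attn package
)

inputs = tokenizer("The Falcon series of language models", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=30, do_sample=False)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```

`device_map="auto"` is usually necessary at this size, since an 11B-parameter model in bf16 needs roughly 22 GB for the weights alone.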
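The Software section's mention of 3D parallelism combined with ZeRO can be made concrete with a small sketch: the GPU pool is factored into tensor-parallel (TP), pipeline-parallel (PP), and data-parallel (DP) degrees whose product equals the world size, with ZeRO sharding optimizer state across the DP replicas. The TP and PP degrees below are hypothetical; this commit does not state Gigatron's actual layout.

```python
# Illustrative 3D-parallel layout over the 1024 GPUs mentioned in the card.
# TP and PP degrees are hypothetical, chosen only to make the arithmetic concrete.
WORLD_SIZE = 1024
TP = 8                        # assumed tensor-parallel degree (typically intra-node)
PP = 4                        # assumed pipeline-parallel degree
DP = WORLD_SIZE // (TP * PP)  # remaining factor is the ZeRO data-parallel degree -> 32
assert DP * TP * PP == WORLD_SIZE

def coords(rank: int) -> tuple[int, int, int]:
    """Map a flat GPU rank to (dp, pp, tp) coordinates, TP fastest-varying."""
    tp = rank % TP
    pp = (rank // TP) % PP
    dp = rank // (TP * PP)
    return dp, pp, tp

print(coords(0))   # (0, 0, 0)
print(coords(37))  # (1, 0, 5): second data-parallel replica, first pipeline stage
```

Keeping TP fastest-varying places tensor-parallel peers on adjacent ranks, which typically maps them onto the same node, where NVLink bandwidth is highest.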