Divyasreepat committed on
Update README.md with new model card content
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 - keras
 pipeline_tag: text-generation
 ---
-
+### Model Overview
 # Model Summary
 
 Falcon-RW-1B is a 1B-parameter causal decoder-only model built by [TII](https://www.tii.ae/) and trained on 350B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb). The architecture of the model is adapted from the GPT-3 paper ([Brown et al., 2020](https://arxiv.org/abs/2005.14165)), but it uses ALiBi.
@@ -80,4 +80,5 @@ The architecture is adapted from the GPT-3 paper ([Brown et al., 2020](https://a
   url={https://arxiv.org/abs/2306.01116},
   year={2023}
 }
 ```
+
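The summary in this card notes that Falcon-RW-1B follows the GPT-3 architecture but replaces learned positional embeddings with ALiBi (Press et al., 2022), which biases attention logits linearly by query-key distance. Below is a minimal sketch of those per-head biases; it is a generic illustration of the technique, not Falcon's actual implementation, and the helper `alibi_bias` is hypothetical.

```python
import numpy as np

def alibi_bias(num_heads: int, seq_len: int) -> np.ndarray:
    """Illustrative ALiBi biases: slope * (key pos - query pos) per head."""
    # Geometric head slopes 2^(-8h/n) for h = 1..n (exact for power-of-2 n).
    slopes = 2.0 ** (-8.0 * np.arange(1, num_heads + 1) / num_heads)
    # relative[i, j] = j - i: zero on the diagonal, negative for past keys.
    pos = np.arange(seq_len)
    relative = pos[None, :] - pos[:, None]
    # Distant past keys get a larger negative bias; the causal mask
    # handles future positions (j > i) separately.
    return slopes[:, None, None] * relative[None, :, :]

# Biases for 4 heads over an 8-token context: shape (4, 8, 8).
print(alibi_bias(num_heads=4, seq_len=8).shape)
```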
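The front matter tags the checkpoint with `keras` and `pipeline_tag: text-generation`, so the natural entry point is KerasHub's causal-LM API. A minimal usage sketch follows, assuming the checkpoint is published as a KerasHub preset; the preset name `falcon_refinedweb_1b_en` is an assumption, so check the model page for the actual identifier.

```python
import keras_hub

# Load tokenizer + model through the generic causal-LM task loader.
# NOTE: the preset name is an assumption for illustration only.
falcon_lm = keras_hub.models.CausalLM.from_preset("falcon_refinedweb_1b_en")

# Generate a continuation; prompt plus completion is capped at 64 tokens.
print(falcon_lm.generate("The falcon is a bird of prey that", max_length=64))
```

Loading through the task class attaches the matching preprocessor, so raw strings can be passed straight to `generate()` without a separate tokenization step.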