---
language:
- en
- es
- pt
tags:
- falcon3
license: other
license_name: falcon-llm-license
license_link: https://falconllm.tii.ae/falcon-terms-and-conditions.html
---

# Table of Contents

0. [TL;DR](#TL;DR)
1. [Model Details](#model-details)
2. [Usage](#usage)
3. [Training Details](#training-details)
4. [Evaluation](#evaluation)

# TL;DR

# Model Details

⚠️ **This is a raw, pretrained model, which should be further finetuned for most use cases.**

## Model Description

- **Developed by:** [https://www.tii.ae](https://www.tii.ae)
- **Model type:** Causal decoder-only
- **Architecture:** Transformer-based
- **Language(s) (NLP):** Mainly English
- **License:** TII Falcon-LLM License 2.0

# Usage

The example scripts below show how to use the model with `transformers`. Make sure you have the latest release of `transformers`, or a build from source.

## Using the PyTorch model with 🤗 transformers

### Running the model on a CPU

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and the model weights (the model stays on the CPU)
tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Base")
model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon3-7B-Base")

# Tokenize the prompt into a tensor of input IDs
input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# Generate a continuation and decode it back to text
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```

### Running the model on a GPU

```python
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

# device_map="auto" lets accelerate place the weights on the available device(s)
tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Base")
model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon3-7B-Base", device_map="auto")

# Tokenize the prompt and move the input IDs to the GPU
input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

# Generate a continuation and decode it back to text
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```

### Running the model on a GPU using `torch.compile`

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the weights in bfloat16 and move the model to GPU 0
tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Base")
model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon3-7B-Base", torch_dtype=torch.bfloat16).to(0)

# Compile the model; compilation happens lazily on the first forward pass
model = torch.compile(model)

# Tokenize the prompt and move the input IDs to the GPU
input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

# Generate a continuation and decode it back to text
outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```

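
Only the `torch.compile` example sets `torch_dtype=torch.bfloat16`; the CPU and GPU examples leave the precision to the library defaults. As a rough guide for choosing between them, the sketch below estimates the memory taken by the weights alone at 4 bytes per parameter (float32) and 2 bytes per parameter (bfloat16). The parameter count is an assumption: a round 7 billion, inferred only from the "7B" in the checkpoint name, and the helper `weight_memory_gib` is hypothetical, not part of `transformers`. Activations, the KV cache, and framework overhead are not counted.

```python
# Rough weight-memory estimate for loading a checkpoint in different dtypes.
# ASSUMPTION: ~7e9 parameters, inferred only from the "7B" in the model name.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}

ASSUMED_NUM_PARAMS = 7_000_000_000  # hypothetical round figure, not an official count


def weight_memory_gib(num_params: int, dtype: str) -> float:
    """Return the memory needed for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


for dtype in BYTES_PER_PARAM:
    gib = weight_memory_gib(ASSUMED_NUM_PARAMS, dtype)
    print(f"{dtype}: ~{gib:.1f} GiB for weights only")
```

Under these assumptions, loading in bfloat16 halves the weight footprint relative to float32, which is often what decides whether the model fits on a single GPU.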
# Training Details

## Training Data

Falcon3-7B is trained on 15 Gigatokens of datasets comprising web, code, STEM, high-quality, and multilingual data.

## Training Procedure

Falcon3-7B is trained on 256 H100 nodes (world size 2048).

### Training Hyperparameters

| **Hyperparameter** | **Value**  | **Comment**                                                   |
|--------------------|------------|---------------------------------------------------------------|
| Precision          | `bfloat16` |                                                               |
| Optimizer          | AdamW      |                                                               |
| Max learning rate  | 6e-4       | Following a WSD (warmup-stable-decay) learning rate scheduler |
| Weight decay       | 1e-1       |                                                               |
| z-loss             | 1e-4       |                                                               |
| Batch size         | Variable   | Batch size was gradually increased during training            |

# Evaluation
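
| **Category**             | **Benchmark**           | **Llama-3.2-1B** | **Qwen2.5-1.5B** | **SmolLM2-1.7B** | **gemma-2-2b** | **Falcon3-1B-Base** |
|--------------------------|-------------------------|------------------|------------------|------------------|----------------|---------------------|
| General                  | MMLU (5-shot)           | 31.1             | 61.0             | 50.1             | 53.0           | 42.5                |
|                          | MMLU-PRO (5-shot)       | 11.7             | 28.4             | 21.3             | 22.1           | 16.1                |
|                          | IFEval                  | 14.8             | 26.0             | 24.2             | 20.3           | 25.2                |
| Math                     | GSM8K (5-shot)          | 6.6              | 62.2             | 31.0             | 25.5           | 34.3                |
|                          | MATH Lvl-5 (4-shot)     | 0.2              | 6.7              | 1.4              | 2.6            | 2.2                 |
| Reasoning                | Arc Challenge (25-shot) | 40.2             | 54.8             | 54.1             | 53.7           | 48.1                |
|                          | GPQA (0-shot)           | 24.2             | 28.1             | 28.9             | 25.5           | 28.1                |
|                          | MUSR (0-shot)           | 34.5             | 35.5             | 34.7             | 42.7           | 41.9                |
|                          | BBH (3-shot)            | 31.2             | 41.1             | 34.2             | 36.8           | 36.0                |
| CommonSense Understanding | PIQA (0-shot)          | 74.5             | 76.0             | 77.5             | 79.2           | 74.5                |
|                          | SciQ (0-shot)           | 88.5             | 93.1             | 90.8             | 95.7           | 91.1                |
|                          | Winogrande (0-shot)     | 60.4             | 63.0             | 66.1             | 68.6           | 61.2                |
|                          | OpenbookQA (0-shot)     | 37.4             | 40.4             | 44.0             | 41.8           | 41.0                |
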
# Citation