melaseddik committed
Commit b6bed06 · 1 Parent(s): 922a098
Update README.md
README.md CHANGED
@@ -8,19 +8,19 @@ tags:
   - falcon3
 ---
 
-# Falcon3-
+# Falcon3-10B-Base
 
 The **Falcon3** family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters.
 
-This repository contains the **Falcon3-
-Falcon3-
+This repository contains the **Falcon3-10B-Base**. It achieves state-of-the-art results (at the time of release) on reasoning, language understanding, instruction following, code and mathematics tasks.
+Falcon3-10B-Base supports four languages (English, French, Spanish, Portuguese) and a context length of up to 32K.
 
 ⚠️ **This is a raw, pretrained model, which should be further fine-tuned for most use cases.**
 
 ## Model Details
 - Architecture
   - transformer-based causal decoder-only architecture
-
+  - 40 decoder blocks
   - grouped-query attention (GQA) for faster inference: 12 query heads and 4 KV heads
   - wider head dimension: 256
   - high RoPE value to support long-context understanding: 1000042
@@ -44,7 +44,7 @@ from transformers import pipeline
 
 pipe = pipeline(
     "text-generation",
-    model="tiiuae/Falcon3-
+    model="tiiuae/Falcon3-10B-Base",
     torch_dtype=torch.bfloat16,
     device_map="auto"
 )
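For readers cross-checking the architecture bullets in the updated Model Details section (40 decoder blocks, 12 query heads, 4 KV heads, head dimension 256, RoPE theta 1000042, 32K context), here is a minimal sketch of how those numbers could be read back from the published checkpoint's config. It assumes Llama-style config field names (`num_hidden_layers`, `num_key_value_heads`, `rope_theta`, ...), which the commit itself does not state:

```python
# Sketch: read the architecture values listed in the README back from the model config.
# Field names are assumed to follow the usual Llama-style config; verify against the checkpoint.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("tiiuae/Falcon3-10B-Base")

print("decoder blocks:", config.num_hidden_layers)                    # expected 40
print("query heads:   ", config.num_attention_heads)                  # expected 12
print("KV heads:      ", getattr(config, "num_key_value_heads", None))  # expected 4 (GQA)
print("head dim:      ", getattr(config, "head_dim", None))             # expected 256
print("RoPE theta:    ", getattr(config, "rope_theta", None))           # expected 1000042
print("max positions: ", config.max_position_embeddings)              # context length up to 32K
```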
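The second hunk only changes the `model=` line of the quick-start snippet. A self-contained version of that snippet, as it might be run end to end, is sketched below; the prompt and the `max_new_tokens` value are illustrative assumptions, not part of the commit:

```python
# Sketch of the full quick-start usage around the updated pipeline call.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="tiiuae/Falcon3-10B-Base",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Base (pretrained) model: plain text completion, no chat template.
output = pipe(
    "The three most important inventions of the 20th century are",  # illustrative prompt
    max_new_tokens=64,
)
print(output[0]["generated_text"])
```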