Triangle104 committed on
Commit 850bfd6 · verified · 1 Parent(s): 6e72af6

Update README.md

Files changed (1)
  1. README.md +24 -0
README.md CHANGED
@@ -18,6 +18,30 @@ library_name: transformers
  This model was converted to GGUF format from [`tiiuae/Falcon3-Mamba-7B-Instruct`](https://huggingface.co/tiiuae/Falcon3-Mamba-7B-Instruct) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/tiiuae/Falcon3-Mamba-7B-Instruct) for more details on the model.
 
+ ---
+ Model details:
+ -
+ The Falcon3 family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters.
+
+ This repository contains Falcon3-Mamba-7B-Instruct. Compared to similar SSM-based models of the same size, it achieves state-of-the-art results (at the time of release) on reasoning, language understanding, instruction following, code, and mathematics tasks. Falcon3-Mamba-7B-Instruct supports a context length of up to 32K and was trained mainly on an English corpus.
+
+ Model Details
+
+ Architecture (same as Falcon-Mamba-7b)
+
+ Mamba1-based causal decoder-only architecture trained on a causal language modeling task (i.e., predict the next token).
+ 64 decoder blocks
+ width: 4096
+ state_size: 16
+ 32k context length
+ 65k vocab size
+ Continually pretrained from Falcon-Mamba-7b on another 1500 gigatokens of data consisting of web, code, STEM, and high-quality data.
+ Post-trained on 1.2 million samples of STEM, conversations, code, and safety.
+ Developed by Technology Innovation Institute
+ License: TII Falcon-LLM License 2.0
+ Model Release Date: December 2024
+
+ ---
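The architecture figures listed in the model details can be turned into a quick back-of-the-envelope calculation. The sketch below is illustrative only and is not part of the committed README: the class name `MambaSpec`, the expansion factor `expand = 2` (the usual Mamba default, not stated in the card), and the rounded `vocab_size = 65_000` (the card only says "65k") are assumptions. It estimates the number of recurrent SSM state elements, which in a Mamba-style model does not grow with sequence length, and the size of the token-embedding table. The convolution state is left out for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MambaSpec:
    """Hyperparameters taken from the model card, plus two labeled assumptions."""

    n_blocks: int = 64         # "64 decoder blocks"
    width: int = 4096          # "width: 4096"
    state_size: int = 16       # "state_size: 16"
    vocab_size: int = 65_000   # ASSUMPTION: the card says "65k"; exact value not given
    expand: int = 2            # ASSUMPTION: common Mamba default, not stated in the card

    @property
    def d_inner(self) -> int:
        # Inner (expanded) channel dimension of each Mamba block.
        return self.expand * self.width

    def ssm_state_elements(self) -> int:
        # One (d_inner x state_size) recurrent state per block,
        # independent of how many tokens have been processed.
        return self.n_blocks * self.d_inner * self.state_size

    def embedding_params(self) -> int:
        # Token-embedding table: one width-sized vector per vocabulary entry.
        return self.vocab_size * self.width


spec = MambaSpec()
print(f"SSM state elements: {spec.ssm_state_elements():,}")
print(f"Embedding parameters: {spec.embedding_params():,}")
```

Under these assumptions the recurrent state holds about 8.4 million elements in total, and the embedding table about 266 million parameters. Changing `expand` or `vocab_size` to the values in the actual model configuration would give exact figures.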
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)