Update README.md
README.md
CHANGED
@@ -17,10 +17,7 @@ Gale comprises three decoder-only transformer models derived from [Mistral](http
 | [Gale-Medium](https://hf.co/crumbly/Gale-medium) | 3B | 13/32 |
 | [Gale-Small](https://hf.co/crumbly/Gale-small) | 1B | 4/32 |
 
-
-## Horizon Dataset
-
-The dataset used to train the Gale models consists of updated English text and code to fine-tune models like Dante which need to "set" their architectural changes in place. It's an efficient approach to leverage prior model knowledge instead of starting from scratch.
+The Crumbly 'Horizon' dataset used to train the Gale models consists of updated English text and code to fine-tune models like Gale which need to "set" their architectural changes in place. It's an efficient approach to leverage prior model knowledge instead of starting from scratch.
 
 | Subset | Token % |
 | --- | --- |