<div style="text-align: center;">
<img src="https://raw.githubusercontent.com/Pleias/logos/d6152d7943905da32a1e04fdfd7708ed9c7eed5e/PleIAs%201_0%20Full%20Logo%20(Black).png" style="width: 80%; margin: 0 auto; display: inline-block;"/>
</div>

**Pleias-nano-1b-Preview** is an early preview of a 1.21 billion parameter base model trained by [Pleias](https://huggingface.co/PleIAs) with [Tracto AI](https://tracto.ai/) on [Common Corpus](https://huggingface.co/datasets/PleIAs/common_corpus).

Like all base and specialized models from Pleias, Pleias-nano-1b-Preview has only been trained on open data that is out of copyright (public domain) or released under a permissive license.

## Description
Pleias-nano-1b-Preview is a transformer base model, pretrained entirely from scratch, using an architecture similar to Llama/GPT-NeoX for easier deployment and inference.

It includes the following features, which apply to any responsibly trained variant:
* Only trained on open data under a permissive license and in compliance with the European AI Act. By design, all Pleias models are unable to output copyrighted content.
* A new tokenizer designed for enhanced document processing tasks and better multilingual support.
* Extremely low level of toxicity and problematic content.

Pleias-nano-1b-Preview has demonstrated unusually strong multilingual generation for its size range. Fully supported languages include English, French, Spanish, German, Italian, Dutch, Latin and Portuguese.

Given its size, Pleias-nano-1b-Preview can run on CPU without any compression loss. We provide a first GGUF variant as part of our release.
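
As an illustration, the GGUF build can be run entirely on CPU with llama-cpp-python; this is a minimal sketch, and the GGUF file name and prompt are placeholder assumptions rather than the released artifact:

```python
# CPU-only inference sketch using llama-cpp-python (pip install llama-cpp-python).
# The GGUF file name is a placeholder assumption; point it at the released GGUF variant.
from llama_cpp import Llama

llm = Llama(
    model_path="pleias-nano-1b-preview.gguf",  # hypothetical file name
    n_ctx=2048,    # context window
    n_threads=4,   # CPU threads
)

out = llm(
    "The printing press transformed European society because",
    max_tokens=128,
    temperature=0.0,     # greedy decoding for more consistent output
    repeat_penalty=1.2,  # slight repetition penalty
)
print(out["choices"][0]["text"])
```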

## Recommended use
As a base model, Pleias-nano-1b-Preview can only run continuation prompts.

Text generation currently supports a range of creative writing tasks in multiple European languages. For more consistent results, we recommend using a low or zero temperature with a slight repetition penalty (1.2).
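
For instance, a minimal continuation-prompt sketch with the transformers library, where the repository id and prompt are illustrative assumptions:

```python
# Continuation-prompt sketch with Hugging Face transformers.
# The repository id is an assumption; replace it with the released checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PleIAs/Pleias-nano-1b-Preview"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "La Révolution française est"
inputs = tokenizer(prompt, return_tensors="pt")

# Greedy decoding (zero temperature) with a slight repetition penalty, as recommended above.
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=False,
    repetition_penalty=1.2,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```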

Pleias-nano-1b-Preview has been successfully adapted for continued pretraining and full fine-tuning on document processing tasks such as RAG, translation, or OCR correction. Given the small size of the model, we do not recommend fine-tuning methods based on LoRA.
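
A minimal full fine-tuning sketch with the transformers Trainer is shown below; the repository id, data file, and hyperparameters are assumptions for illustration, and all weights are updated (no LoRA adapters):

```python
# Full fine-tuning sketch (no LoRA): all parameters are updated.
# Repository id, data file, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_id = "PleIAs/Pleias-nano-1b-Preview"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # base models often ship without a pad token

# Any plain-text corpus formatted for the target task (e.g. OCR correction pairs) would do here.
dataset = load_dataset("text", data_files={"train": "train.txt"})["train"]
dataset = dataset.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=2048),
    batched=True,
    remove_columns=["text"],
)

args = TrainingArguments(
    output_dir="pleias-nano-1b-finetuned",
    per_device_train_batch_size=4,
    num_train_epochs=1,
    learning_rate=2e-5,
    bf16=True,
)
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```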

## Example

## Training
Pleias-nano-1b-Preview was fully pretrained with TractoAI on the ISEG GPU cluster by Nebius AI, using 192 H100s for 5 days. Pretraining code relied on [the fork of Nanotron developed by TractoAI](https://github.com/tractoai/nanotron). We provide the complete settings as a YAML file as part of our release.

The training schedule includes 518,000 steps (batch size 1,024) over three epochs (nearly 5 trillion tokens):
* A lightly filtered version of Common Corpus (1.6 trillion tokens)
* A repeat of the previous set.

## Update
Pleias-nano-1b-Preview is currently released as an early preview.

The model will undergo several more rounds of post-training to enhance its reasoning capacities and fine-tunability, in anticipation of a generalist instruct version.