anastasiastasenko committed (verified)
Commit 95df05e · 1 Parent(s): a6dc275

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -17,7 +17,7 @@ language:
 Similarly to its base model, Pleias-1b, Pleias-1b-RAG 0.1 aims to be a fully open model (weights, code, data), only trained on content with a permissible license and fully compliant with the upcoming European AI Act.
 
 ## Description
- PleIAs-1b-RAG is continuous pretraining of Pleias-3b on a new dataset of 45,088,768,000 tokens modeling common retrieval tasks. All the content of the dataset is ultimately coming from Common Corpus.
+ PleIAs-1b-RAG is a continued pretraining of Pleias-1b on a new dataset of 45,088,768,000 tokens modeling common retrieval tasks. All of the content of the dataset ultimately comes from Common Corpus.
 
 Pleias-1b-RAG includes the main features of the original base model:
 * Only trained on open data under a permissible license and in compliance with the European AI Act. By design, all Pleias models are unable to output copyrighted content.
@@ -30,12 +30,12 @@ Pleias-1b-RAG supports retrieval-augmented generation with enhanced verifiability
 * Source analysis/criticism which also acts as an integrated reranker step.
 * Generation of grounded answers with references and excerpts linked to the original sources.
 
- While the base model Pleias-1b-RAG has been made available as an experimental preview, we release Pleias-3b-RAG 0.1 as an early version. Pleias-3b-RAG 0.1 has been already tested and integrated into multiple applied RAG projects, including Pleias flagship application Scholastikai.
+ While the base model Pleias-1b-RAG has been made available as an experimental preview, we release Pleias-1b-RAG 0.1 as an early version. Pleias-1b-RAG 0.1 has already been tested and integrated into multiple applied RAG projects, including Pleias' flagship application Scholastikai.
 
 ## Training
- PleIAs-1b-RAG was trained with Tracto AI on Nanotron, the pretraining library from HuggingFace. We provide the complete settings as a yaml file as part of our release.
+ PleIAs-1b-RAG was pretrained with TractoAI on the ISEG GPU cluster from Nebius AI, using the fork of Nanotron developed by TractoAI. We provide the complete settings as a yaml file as part of our release.
 
- PleIAs-1b-RAG derives from the last checkpoint of PleIAs-3b (369,000). The training schedule reused the last learning rate value (5e-6) without decay for 43,000 steps. Each step is about 10 time smaller than the original steps from the base model training (roughly 1M tokens per step vs. 12M tokens)
+ PleIAs-1b-RAG derives from the last checkpoint of PleIAs-1b (369,000). The training schedule reused the last learning rate value (5e-6) without decay for 43,000 steps. Each step is about 10 times smaller than the original steps from the base model training (roughly 1M tokens per step vs. 12M tokens).
 
 Training covers the entire RAG dataset we have been designing out of Common Corpus for 3 epochs.
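For readers who want to relate this description to the released settings file, here is a minimal illustrative sketch of the schedule described above. It is an assumption, not the released configuration: the field names follow upstream Nanotron's config schema and may differ in the TractoAI fork, and every value other than the learning rate and the step count is a placeholder.

```yaml
# Illustrative sketch only -- not the released settings.
# Field names approximate upstream Nanotron; the TractoAI fork may differ.
optimizer:
  learning_rate_scheduler:
    learning_rate: 5.0e-6      # last LR of the Pleias-1b schedule, reused as-is
    lr_warmup_steps: 0         # resumes from checkpoint 369,000, no new warmup
    lr_decay_style: linear
    min_decay_lr: 5.0e-6       # equal to learning_rate, i.e. effectively no decay
tokens:
  train_steps: 43000           # ~1M tokens per step (vs. ~12M for the base model)
  sequence_length: 2048        # placeholder value
```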
 
 
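As a usage illustration of the retrieval-augmented workflow described in the model card above, here is a minimal sketch with the transformers library. The repository id and the plain-text prompt layout are assumptions for illustration; the card does not document the expected query/source format, so the released prompt templates should be consulted before relying on this.

```python
# Minimal RAG-style generation sketch for Pleias-1b-RAG.
# ASSUMPTIONS: the repo id and the prompt layout below are illustrative only;
# they are not documented in this model card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PleIAs/Pleias-1b-RAG"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Retrieved excerpts the model should analyse, rerank, and cite.
sources = [
    "Source 1: An excerpt retrieved from a Common Corpus document...",
    "Source 2: Another retrieved excerpt relevant to the query...",
]
query = "What do the sources say about open licensing?"

# Hypothetical layout: query, then numbered sources, then a cue for a
# grounded answer that references the sources.
prompt = query + "\n\n" + "\n\n".join(sources) + "\n\nGrounded answer with references:"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=300, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```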