Commit eda8816
Parent(s): 5cb8c21
Update README.md
README.md CHANGED
@@ -30,17 +30,15 @@ pip install xformers
 ### Model Description
 
 
-Silo-PD is a 1.3B parameter, decoder-only language model trained on public domain
+Silo-PD is a 1.3B parameter, decoder-only language model trained on data in the public domain from [the Open License Corpus (OLC)](https://huggingface.co/datasets/kernelmachine/open-license-corpus).
 
-
+The model is based on the LLaMA architecture as implemented in [OpenLM]().
 
 The model is trained with 128 A100 GPUs across 16 nodes.
 
 
 ### Model and Training Hyperparameters
 
-The following reports the hyperparameters for the parametric component of Silo-PD.
-
 We follow the model architecture of LLaMa, and we use the GPT-NeoX-20B tokenizer, with 50432 BPE types.
 
 During training, we use 2,048 token sequences that are packed across document boundaries, and we pre-pend a beginning-of-text token to every document.
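The tokenizer note in the updated README (GPT-NeoX-20B BPE, 50432 types) can be inspected with a short sketch like the one below. This is not part of the commit: it assumes the publicly released `EleutherAI/gpt-neox-20b` tokenizer is the one being referred to, and the 50432 figure is typically a padded embedding-table size, so the count reported by the tokenizer object itself may be somewhat smaller.

```python
# Sketch (assumption, not from the commit): inspect the GPT-NeoX-20B BPE
# tokenizer that the README says Silo-PD uses.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

# The README reports 50432 BPE types; embedding tables are often padded to a
# round multiple, so the raw tokenizer vocabulary may be slightly smaller.
print(len(tok))
print(tok.special_tokens_map)
```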
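The last paragraph of the diff describes packing 2,048-token sequences across document boundaries, with a beginning-of-text token prepended to every document. A rough illustration of that scheme, under stated assumptions (the token id and helper name are hypothetical, and leftover tokens shorter than one sequence are simply dropped), could look like this:

```python
# Illustrative sketch of the packing described in the README: prepend a
# beginning-of-text token to every document, concatenate all documents into
# one token stream, then slice it into fixed 2,048-token chunks that may
# cross document boundaries. Not the actual training code.
from typing import Iterable, List

SEQ_LEN = 2048
BOS_ID = 0  # hypothetical beginning-of-text token id


def pack_documents(docs_token_ids: Iterable[List[int]]) -> List[List[int]]:
    stream: List[int] = []
    for doc in docs_token_ids:
        stream.append(BOS_ID)   # beginning-of-text before every document
        stream.extend(doc)

    # Slice the concatenated stream into full-length training sequences;
    # a leftover tail shorter than SEQ_LEN is dropped in this sketch.
    return [
        stream[i : i + SEQ_LEN]
        for i in range(0, len(stream) - SEQ_LEN + 1, SEQ_LEN)
    ]


# Toy usage: three short "documents" already converted to token ids.
chunks = pack_documents([[5, 6, 7], [8, 9], [10] * 5000])
print(len(chunks), len(chunks[0]))  # -> 2 2048
```

Packing across boundaries keeps every training sequence at full length; the prepended beginning-of-text token is what marks where one document ends and the next begins.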