kernelmachine committed
Commit eda8816 · 1 Parent(s): 5cb8c21

Update README.md

Files changed (1)
  1. README.md +2 -4
README.md CHANGED
@@ -30,17 +30,15 @@ pip install xformers
 ### Model Description
 
 
-Silo-PD is a 1.3B parameter, decoder-only language model trained on public domain data from [the Open License Corpus (OLC)](https://huggingface.co/datasets/kernelmachine/open-license-corpus).
+Silo-PD is a 1.3B parameter, decoder-only language model trained on data in the public domain from [the Open License Corpus (OLC)](https://huggingface.co/datasets/kernelmachine/open-license-corpus).
 
-We use 1.3B-parameter transformer LMs based on the LLaMA architecture as implemented in OpenLM.
+The model is based on the LLaMA architecture as implemented in [OpenLM](https://github.com/mlfoundations/open_lm).
 
 The model is trained with 128 A100 GPUs across 16 nodes.
 
 
 ### Model and Training Hyperparameters
 
-The following reports the hyperparameters for the parametric component of Silo-PD.
-
 We follow the model architecture of LLaMA, and we use the GPT-NeoX-20B tokenizer, with 50,432 BPE types.
 
 During training, we use 2,048-token sequences that are packed across document boundaries, and we prepend a beginning-of-text token to every document.
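
The updated card names the GPT-NeoX-20B tokenizer with 50,432 BPE types. A minimal sketch of loading that tokenizer, assuming the `transformers` library and the public `EleutherAI/gpt-neox-20b` checkpoint (neither is named in this commit):

```python
# Sketch, not from the commit: load the tokenizer the card names.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

# Raw BPE vocabulary size; the card's 50,432 likely also counts
# padding of the model's embedding table beyond the tokenizer vocab.
print(len(tok))
print(tok("Silo-PD is a decoder-only LM.")["input_ids"][:8])
```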
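The last line of the card describes sequence packing: documents are concatenated across boundaries into fixed 2,048-token windows, with a beginning-of-text token prepended to every document. A minimal sketch of that scheme (the function name and the drop-remainder behavior are assumptions, not the authors' training pipeline):

```python
from typing import Iterable, Iterator


def pack_documents(docs: Iterable[list[int]],
                   bos_id: int,
                   seq_len: int = 2048) -> Iterator[list[int]]:
    """Yield fixed-length training sequences packed across document
    boundaries, as the card describes (sketch, not the actual code)."""
    buf: list[int] = []
    for doc in docs:
        buf.extend([bos_id] + doc)   # prepend BOS to every document
        while len(buf) >= seq_len:
            yield buf[:seq_len]      # a window may span several documents
            buf = buf[seq_len:]
    # Leftover tokens shorter than seq_len are dropped in this sketch.
```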