nvidia/GPT-2B-001

MaximumEntropy committed on Apr 17, 2023

Commit a5f6bad • 1 parent: e428a5b

Update README.md

Files changed (1)

README.md

@@ -85,7 +85,7 @@ This model was trained on 1.1T tokens with [NeMo](https://docs.nvidia.com/deeple
 - Maximum sequence length of 4,096 compared to 2,048 in https://huggingface.co/nvidia/nemo-megatron-gpt-20B.
 - No dropout.
 - No bias terms in all linear layers.
-- United embedding and output layers.
+- Untied embedding and output layers.
 ## Getting started