MaximumEntropy committed · Commit 06cf5d0 · Parent(s): 6e830ee

Update README.md

README.md CHANGED
@@ -85,6 +85,7 @@ This model was trained with [NeMo Megatron](https://docs.nvidia.com/deeplearning
 - Maximum sequence length of 4,096 compared to 2,048 in https://huggingface.co/nvidia/nemo-megatron-gpt-20B.
 - No dropout.
 - No bias terms in all linear layers.
+- Untied embedding and output layers.
 
 ## Getting started
 
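The added bullet, together with the existing ones, describes concrete architectural choices: bias-free linear layers, no dropout, untied embedding and output weights, and a 4,096-token context. As a rough illustration only, the following minimal PyTorch sketch shows what those choices look like in code; the class name, hidden size, and vocabulary size are placeholders, attention is omitted, and this is not the NeMo Megatron implementation.

```python
# Hypothetical sketch of the README bullets; sizes are placeholders,
# not the real model configuration.
import torch
import torch.nn as nn

MAX_SEQ_LEN = 4096  # vs. 2,048 in nemo-megatron-gpt-20B


class UntiedLMHeadSketch(nn.Module):
    def __init__(self, vocab_size: int = 32000, hidden_size: int = 2048):
        super().__init__()
        # Untied embedding and output layers: the token embedding and the
        # output projection are separate parameter tensors rather than
        # sharing one weight matrix.
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)
        # "No bias terms in all linear layers": every nn.Linear uses bias=False.
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, 4 * hidden_size, bias=False),
            nn.GELU(),
            nn.Linear(4 * hidden_size, hidden_size, bias=False),
        )
        # "No dropout": no nn.Dropout modules are instantiated anywhere.

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        assert tokens.size(-1) <= MAX_SEQ_LEN, "context is capped at 4,096 tokens"
        hidden = self.embed(tokens)
        hidden = hidden + self.mlp(hidden)  # attention omitted for brevity
        return self.lm_head(hidden)


# Shape check only; weights are random, so the logits are meaningless.
logits = UntiedLMHeadSketch()(torch.randint(0, 32000, (1, 16)))
print(logits.shape)  # torch.Size([1, 16, 32000])
```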