Tags: NeMo · PyTorch · text generation · causal-lm
Commit 06cf5d0 by MaximumEntropy · Parent: 6e830ee

Update README.md

Files changed (1): README.md (+1 −0)
README.md CHANGED
@@ -85,6 +85,7 @@ This model was trained with [NeMo Megatron](https://docs.nvidia.com/deeplearning
  - Maximum sequence length of 4,096 compared to 2,048 in https://huggingface.co/nvidia/nemo-megatron-gpt-20B.
  - No dropout.
  - No bias terms in all linear layers.
+ - Untied embedding and output layers.

  ## Getting started
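The bullet points in the diff correspond to concrete layer choices. A minimal PyTorch sketch of what they mean (illustrative class and parameter names only, not NeMo Megatron's actual implementation):

```python
import torch
import torch.nn as nn

class TinyGPTSketch(nn.Module):
    """Toy module illustrating the listed architecture choices (hypothetical)."""

    def __init__(self, vocab_size=100, hidden=32, max_seq_len=4096, tie_weights=False):
        super().__init__()
        self.max_seq_len = max_seq_len  # 4,096 here vs 2,048 in nemo-megatron-gpt-20B
        self.embed = nn.Embedding(vocab_size, hidden)
        # "No bias terms in all linear layers": bias=False on every nn.Linear
        self.proj = nn.Linear(hidden, hidden, bias=False)
        # "No dropout": simply no nn.Dropout modules anywhere
        self.lm_head = nn.Linear(hidden, vocab_size, bias=False)
        if tie_weights:
            # Tied variant: output projection shares the embedding matrix
            self.lm_head.weight = self.embed.weight

    def forward(self, tokens):
        h = self.proj(self.embed(tokens))
        return self.lm_head(h)

untied = TinyGPTSketch(tie_weights=False)
tied = TinyGPTSketch(tie_weights=True)
# Untied (as in this commit): embedding and output weights are separate tensors
print(untied.lm_head.weight.data_ptr() != untied.embed.weight.data_ptr())
# Tied: both names point at the same storage
print(tied.lm_head.weight.data_ptr() == tied.embed.weight.data_ptr())
```

Untying roughly doubles the parameter count of the embedding/output pair but lets the input representation and the output softmax projection specialize independently.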