MaximumEntropy committed · Commit 06cf5d0 · Parent(s): 6e830ee

Update README.md

README.md CHANGED
@@ -85,6 +85,7 @@ This model was trained with [NeMo Megatron](https://docs.nvidia.com/deeplearning
 - Maximum sequence length of 4,096 compared to 2,048 in https://huggingface.co/nvidia/nemo-megatron-gpt-20B.
 - No dropout.
 - No bias terms in all linear layers.
+- Untied embedding and output layers.
 
 ## Getting started
 
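The added bullet, together with the existing ones, describes concrete architectural choices: bias-free linear layers, no dropout, untied embedding and output weights, and a 4,096-token context. As a rough illustration only, the following minimal PyTorch sketch shows what those choices look like in code; the class name, hidden size, and vocabulary size are placeholders, attention is omitted, and this is not the NeMo Megatron implementation.

```python
# Hypothetical sketch of the README bullets; sizes are placeholders,
# not the real model configuration.
import torch
import torch.nn as nn

MAX_SEQ_LEN = 4096  # vs. 2,048 in nemo-megatron-gpt-20B


class UntiedLMHeadSketch(nn.Module):
    def __init__(self, vocab_size: int = 32000, hidden_size: int = 2048):
        super().__init__()
        # Untied embedding and output layers: the token embedding and the
        # output projection are separate parameter tensors rather than
        # sharing one weight matrix.
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.lm_head = nn.Linear(hidden_size, vocab_size, bias=False)
        # "No bias terms in all linear layers": every nn.Linear uses bias=False.
        self.mlp = nn.Sequential(
            nn.Linear(hidden_size, 4 * hidden_size, bias=False),
            nn.GELU(),
            nn.Linear(4 * hidden_size, hidden_size, bias=False),
        )
        # "No dropout": no nn.Dropout modules are instantiated anywhere.

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        assert tokens.size(-1) <= MAX_SEQ_LEN, "context is capped at 4,096 tokens"
        hidden = self.embed(tokens)
        hidden = hidden + self.mlp(hidden)  # attention omitted for brevity
        return self.lm_head(hidden)


# Shape check only; weights are random, so the logits are meaningless.
logits = UntiedLMHeadSketch()(torch.randint(0, 32000, (1, 16)))
print(logits.shape)  # torch.Size([1, 16, 32000])
```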