Update README.md
README.md
CHANGED
@@ -78,6 +78,7 @@ The pre-training and fine-tuning were conducted on 512 NVIDIA Ampere (64GB) GPUs
 | Num. of parameters | ≈1B |
 | Training tokens    | ≈1T |
 | Loss function      | MLM + In-Context loss |
+| Multi-layer loss   | yes |
 
 ## Licence
 The model is licensed under the BigCode OpenRAIL-M v1 license agreement. You can find the full agreement [here](https://huggingface.co/spaces/bigcode/bigcode-model-license-agreement).
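Since the table only names the objectives, below is a minimal sketch of how an "MLM + In-Context loss" objective with multi-layer supervision could be combined, written in PyTorch. Everything in it is an assumption for illustration: the function names, the InfoNCE reading of "In-Context loss", the choice of supervised layers, and the 0.5 weighting are not taken from the model's actual training code.

```python
# Hedged sketch: one plausible combination of an MLM term with a contrastive
# "in-context" term applied to several layers. Names and weights are
# illustrative assumptions, not the model's documented training recipe.
import torch
import torch.nn.functional as F

def mlm_loss(logits, labels):
    # Standard masked-language-modeling cross-entropy; positions labeled
    # -100 are ignored (the usual HuggingFace convention).
    return F.cross_entropy(
        logits.view(-1, logits.size(-1)), labels.view(-1), ignore_index=-100
    )

def in_context_loss(anchor, positive, temperature=0.05):
    # InfoNCE-style contrastive loss between pooled representations of an
    # anchor segment and its in-context positive. Assumption: "In-Context
    # loss" is contrastive; the model card may define it differently.
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature                 # (batch, batch) similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)

def multi_layer_objective(layer_states, pos_states, mlm_logits, labels,
                          layers=(-4, -3, -2, -1), weight=0.5):
    # "Multi-layer loss: yes" read here as applying the contrastive term to
    # mean-pooled hidden states from several top layers, not just the last.
    contrastive = sum(
        in_context_loss(layer_states[l].mean(dim=1), pos_states[l].mean(dim=1))
        for l in layers
    ) / len(layers)
    return mlm_loss(mlm_logits, labels) + weight * contrastive
```

In practice, `layer_states` and `pos_states` could come from any Transformers encoder called with `output_hidden_states=True`, which returns the per-layer hidden states this sketch pools over.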