---
license: llama2
datasets:
- pkupie/mc2_corpus
- togethercomputer/RedPajama-Data-1T
language:
- en
- bo
base_model:
- meta-llama/Llama-2-7b-hf
---
A continually pre-trained model based on Llama-2-7b-hf.

We train on the **Tibetan texts** in MC^2 and the **English texts** in RedPajama, mixed at a **4:1** proportion.
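Below is a minimal sketch of how such a 4:1 mixture could be assembled with the Hugging Face `datasets` library. The split and streaming arguments are illustrative assumptions (they may need adjusting for the actual dataset layouts), and sampling by example probability only approximates a token-level proportion.

```python
from datasets import load_dataset, interleave_datasets

# Illustrative sketch: stream both corpora from the Hub.
# Split/config arguments are assumptions, not taken from the model card.
tibetan = load_dataset("pkupie/mc2_corpus", split="train", streaming=True)
english = load_dataset("togethercomputer/RedPajama-Data-1T",
                       split="train", streaming=True)

# A 4:1 Tibetan:English proportion corresponds to sampling
# probabilities of 0.8 and 0.2.
mixed = interleave_datasets([tibetan, english],
                            probabilities=[0.8, 0.2], seed=42)
```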
#### Hyper-parameters:
* lr: 3e-5
* batch size: 1M tokens (2K * 512)
* lr scheduler: cosine
* min lr: 1e-6
* lr decay iters: 10240
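
For reference, a minimal sketch of the cosine schedule implied by the values above, assuming a standard cosine anneal from the peak lr to the minimum lr over the decay window; any warmup phase is not specified in the card and is omitted here.

```python
import math

# Values taken from the hyper-parameter list above.
PEAK_LR = 3e-5
MIN_LR = 1e-6
LR_DECAY_ITERS = 10240

def lr_at(step: int) -> float:
    """Learning rate at a given step under cosine decay (no warmup assumed)."""
    if step >= LR_DECAY_ITERS:
        return MIN_LR
    progress = step / LR_DECAY_ITERS                      # 0.0 -> 1.0 over the window
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))   # 1.0 -> 0.0
    return MIN_LR + (PEAK_LR - MIN_LR) * cosine

# e.g. lr_at(0) == 3e-5 and lr_at(10240) == 1e-6
```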