nicholasKluge
committed on
Update README.md
README.md
CHANGED
@@ -1,7 +1,7 @@
 ---
 license: apache-2.0
 datasets:
-- nicholasKluge/
+- nicholasKluge/Pt-Corpus-Instruct
 language:
 - pt
 metrics:
@@ -40,12 +40,13 @@ co2_eq_emissions:
 Given the lack of available monolingual foundational models in non-English languages and the fact that some of the most used and downloaded models by the community are those small enough to allow individual researchers and hobbyists to use them in low-resource environments, we developed the TeenyTinyLlama: _a pair of small foundational models trained in Brazilian Portuguese._
 
 TeenyTinyLlama is a compact language model based on the Llama 2 architecture ([TinyLlama implementation](https://huggingface.co/TinyLlama)). This model is designed to deliver efficient natural language processing capabilities while being resource-conscious These models were trained by leveraging [scaling laws](https://arxiv.org/abs/2203.15556) to determine the optimal number of tokens per parameter while incorporating [preference pre-training](https://arxiv.org/abs/2112.00861).
+
 ## Details
 
 - **Architecture:** a Transformer-based model pre-trained via causal language modeling
 - **Size:** 468,239,360 parameters
 - **Context length:** 2048 tokens
-- **Dataset:** [
+- **Dataset:** [Pt-Corpus Instruct](https://huggingface.co/datasets/nicholasKluge/Pt-Corpus-Instruct) (6.2B tokens)
 - **Language:** Portuguese
 - **Number of steps:** 1,200,000
 - **GPU:** 1 NVIDIA A100-SXM4-40GB
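
The README text in this diff says the number of tokens per parameter was chosen using scaling laws, and the Details list gives both the parameter count and the dataset size. As a rough illustration only, the sketch below computes the tokens-per-parameter ratio those two figures imply for a single pass over the dataset. The 20 tokens-per-parameter value is an assumption here: it is a commonly quoted approximation of the linked scaling-laws paper, not a number stated in this README, and the function names are hypothetical.

```python
# Figures taken from the Details list in the README diff.
PARAMS = 468_239_360        # "Size: 468,239,360 parameters"
DATASET_TOKENS = 6.2e9      # "Pt-Corpus Instruct (6.2B tokens)"

# Assumption: a rough compute-optimal heuristic of about 20 tokens per
# parameter, often quoted from the scaling-laws paper. The README itself
# does not state this value.
ASSUMED_TOKENS_PER_PARAM = 20


def tokens_per_parameter(tokens: float, params: int) -> float:
    """Ratio of training tokens to model parameters."""
    return tokens / params


def heuristic_token_budget(params: int, ratio: int = ASSUMED_TOKENS_PER_PARAM) -> int:
    """Token budget suggested by the assumed tokens-per-parameter heuristic."""
    return params * ratio


if __name__ == "__main__":
    ratio = tokens_per_parameter(DATASET_TOKENS, PARAMS)
    budget = heuristic_token_budget(PARAMS)
    print(f"One pass over the dataset: {ratio:.2f} tokens per parameter")
    print(f"Heuristic budget at {ASSUMED_TOKENS_PER_PARAM} tokens/param: {budget:,} tokens")
```

The README does not say how many passes over the dataset were made or how many tokens each of the 1,200,000 steps consumed, so this ratio describes the dataset size only, not the total number of tokens seen during training.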