nicholasKluge committed on
Update README.md

README.md CHANGED
@@ -143,26 +143,26 @@ trainer.train()
 
 ## Fine-Tuning Comparisons
 
-
-
-
-
-
-
-
+To further evaluate the downstream capabilities of our models, we employed a basic fine-tuning procedure on our TTL pair for a subset of tasks from the Poeta benchmark. For comparison, we applied the same procedure to both [BERTimbau](https://huggingface.co/neuralmind/bert-base-portuguese-cased) models, given that they are also LLMs trained from scratch on Brazilian Portuguese and are in a similar size range to our models. We used these comparisons to assess whether our pre-training runs produced LLMs capable of good results ("good" here meaning "close to BERTimbau") when used for downstream applications.
+
+| Models          | IMDB      | FaQuAD-NLI | HateBr    | Assin2    | AgNews    | Average |
+|-----------------|-----------|------------|-----------|-----------|-----------|---------|
+| BERTimbau-large | **93.58** | 92.26      | 91.57     | **88.97** | 94.11     | 92.10   |
+| BERTimbau-small | 92.22     | **93.07**  | 91.28     | 87.45     | 94.19     | 91.64   |
+| **TTL-460m**    | 91.64     | 91.18      | **92.28** | 86.43     | **94.42** | 91.19   |
+| **TTL-160m**    | 91.14     | 90.00      | 90.71     | 85.78     | 94.05     | 90.34   |
+
+All reported results are the highest accuracy scores achieved on the respective task test sets after fine-tuning the models on the training sets. All fine-tuning runs used the same hyperparameters, and the code implementation can be found in the [model cards](https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m-HateBR) of our fine-tuned models.
 
 ## Cite as 🤗
 
 ```latex
 
-@misc{
-
-
-
-
-  year = {2023},
-  publisher = {HuggingFace},
-  journal = {HuggingFace repository},
+@misc{correa24ttllama,
+  title = {TeenyTinyLlama: a pair of open-source tiny language models trained in Brazilian Portuguese},
+  author = {Corr{\^e}a, Nicholas Kluge and Falk, Sophia and Fatimah, Shiza and Sen, Aniket and De Oliveira, Nythamar},
+  journal = {arXiv},
+  year = {2024},
 }
 
 ```
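
As a companion to the comparison added above, here is a minimal sketch of what such a fine-tuning run looks like with 🤗 Transformers. It is not the authors' exact script (that lives in the linked model cards): the dataset id (`ruanchaves/hatebr`), the column names, and all hyperparameters below are illustrative assumptions.

```python
# Minimal fine-tuning sketch for one Poeta-style classification task.
# Illustrative only; see the linked model cards for the exact implementation.
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_id = "nicholasKluge/TeenyTinyLlama-460m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

# Llama-style checkpoints usually ship without a padding token; reuse EOS.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    model.config.pad_token_id = tokenizer.pad_token_id

# Hypothetical dataset id; this sketch assumes "text"/"label" columns and a
# train/test split — rename to match the real dataset schema.
dataset = load_dataset("ruanchaves/hatebr")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

dataset = dataset.map(tokenize, batched=True)

def compute_metrics(eval_pred):
    # Accuracy, matching the metric reported in the table above.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": (predictions == labels).mean()}

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="ttl-460m-hatebr",
        num_train_epochs=3,            # illustrative values, not the
        learning_rate=4e-5,            # authors' exact hyperparameters
        per_device_train_batch_size=16,
        evaluation_strategy="epoch",
    ),
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,  # enables default padding collation
    compute_metrics=compute_metrics,
)

trainer.train()
```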