nicolasdec committed: Create README.md
README.md
ADDED
@@ -0,0 +1,117 @@
---
language:
- pt
- en
license: cc
tags:
- text-generation-inference
- transformers
- mistral
- mixtral
- gguf
- brazil
- brasil
- portuguese
---
# BotBot Cabra Mixtral 8x7b

This model is a fine-tune of [Mixtral 8x7b](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) on the internal Cabra 10k dataset. It is optimized for Portuguese and improves on the base model across several Brazilian benchmarks.

**Check out our other models: [Cabra](https://huggingface.co/collections/botbot-ai/models-6604c2069ceef04f834ba99b).**

### Dataset: Cabra 30k

Internal dataset used for fine-tuning. We plan to release it soon.

### Quantization / GGUF

We have published several quantized (GGUF) versions in the "quantanization" branch.
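
As a rough illustration of fetching a file from that branch, the sketch below builds a direct download URL using the Hub's `resolve` URL pattern. The repository id and the GGUF filename are hypothetical placeholders, since this README does not list them; only the branch name comes from the text above.

```python
from urllib.parse import quote


def gguf_url(repo_id: str, filename: str, revision: str = "quantanization") -> str:
    """Build a Hugging Face Hub download URL for a file on a given branch.

    `repo_id` and `filename` must be supplied by the caller; the defaults used
    in the example below are placeholders, not real names from this repository.
    """
    return (
        "https://huggingface.co/"
        f"{quote(repo_id)}/resolve/{quote(revision, safe='')}/{quote(filename)}"
    )


# Hypothetical repository id and filename, for illustration only.
print(gguf_url("example-org/example-model", "model.Q4_K_M.gguf"))
```

Check the branch's file listing for the actual filenames before downloading.
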
### Example

```
<s> [INST] who is Elon Musk? [/INST]Elon Musk é um empreendedor, inventor e capitalista americano. Ele é o fundador, CEO e CTO da SpaceX, CEO da Neuralink e fundador do The Boring Company. Musk também é o proprietário do Twitter.</s>

```
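
The transcript above wraps the user turn in `[INST] ... [/INST]` markers. A minimal sketch of building such a prompt for a single turn follows; the helper name `build_prompt` and the `include_bos` flag are illustrative assumptions, not part of this repository. Many tokenizers add the `<s>` token themselves, in which case it should be left out of the text.

```python
def build_prompt(user_message: str, include_bos: bool = True) -> str:
    """Wrap one user message in the [INST] ... [/INST] format shown above.

    Set include_bos=False when the tokenizer already prepends the <s> token.
    """
    prefix = "<s> " if include_bos else ""
    return f"{prefix}[INST] {user_message.strip()} [/INST]"


print(build_prompt("who is Elon Musk?"))
```

The model's reply is generated after the closing `[/INST]` and ends with `</s>`.
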

## Usage
For now, the model is intended for research purposes. Possible research areas and tasks include:

- Research on generative models.
- Investigating and understanding the limitations and biases of generative models.

**Commercial use is prohibited. Research only.**

### Evals

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---|---|
| assin2_rte | 1.1 | all | 15 | f1_macro | 0.9095 | ± 0.0041 |
| | | all | 15 | acc | 0.9097 | ± 0.0041 |
| assin2_sts | 1.1 | all | 15 | pearson | 0.7763 | ± 0.0068 |
| | | all | 15 | mse | 0.4610 | ± N/A |
| bluex | 1.1 | all | 3 | acc | 0.6412 | ± 0.0103 |
| | | exam_id__UNICAMP_2021_2 | 3 | acc | 0.5882 | ± 0.0397 |
| | | exam_id__USP_2023 | 3 | acc | 0.7045 | ± 0.0397 |
| | | exam_id__UNICAMP_2020 | 3 | acc | 0.6545 | ± 0.0371 |
| | | exam_id__UNICAMP_2023 | 3 | acc | 0.7442 | ± 0.0384 |
| | | exam_id__UNICAMP_2018 | 3 | acc | 0.5926 | ± 0.0386 |
| | | exam_id__UNICAMP_2022 | 3 | acc | 0.6154 | ± 0.0451 |
| | | exam_id__USP_2019 | 3 | acc | 0.6250 | ± 0.0442 |
| | | exam_id__USP_2021 | 3 | acc | 0.6346 | ± 0.0387 |
| | | exam_id__UNICAMP_2021_1 | 3 | acc | 0.5000 | ± 0.0427 |
| | | exam_id__USP_2022 | 3 | acc | 0.6531 | ± 0.0393 |
| | | exam_id__UNICAMP_2024 | 3 | acc | 0.5556 | ± 0.0428 |
| | | exam_id__USP_2024 | 3 | acc | 0.8537 | ± 0.0319 |
| | | exam_id__USP_2018 | 3 | acc | 0.6296 | ± 0.0379 |
| | | exam_id__USP_2020 | 3 | acc | 0.6071 | ± 0.0375 |
| | | exam_id__UNICAMP_2019 | 3 | acc | 0.7000 | ± 0.0374 |
| enem | 1.1 | all | 3 | acc | 0.7810 | ± 0.0063 |
| | | exam_id__2013 | 3 | acc | 0.7685 | ± 0.0236 |
| | | exam_id__2010 | 3 | acc | 0.8205 | ± 0.0205 |
| | | exam_id__2012 | 3 | acc | 0.8276 | ± 0.0202 |
| | | exam_id__2016_2 | 3 | acc | 0.7886 | ± 0.0213 |
| | | exam_id__2016 | 3 | acc | 0.8017 | ± 0.0209 |
| | | exam_id__2022 | 3 | acc | 0.6541 | ± 0.0238 |
| | | exam_id__2009 | 3 | acc | 0.8087 | ± 0.0212 |
| | | exam_id__2015 | 3 | acc | 0.8067 | ± 0.0208 |
| | | exam_id__2017 | 3 | acc | 0.7759 | ± 0.0223 |
| | | exam_id__2014 | 3 | acc | 0.7798 | ± 0.0230 |
| | | exam_id__2023 | 3 | acc | 0.7037 | ± 0.0228 |
| | | exam_id__2011 | 3 | acc | 0.8632 | ± 0.0183 |
| faquad_nli | 1.1 | all | 15 | f1_macro | 0.7893 | ± 0.0137 |
| | | all | 15 | acc | 0.8554 | ± 0.0097 |
| hatebr_offensive_binary | 1.0 | all | 25 | f1_macro | 0.7800 | ± 0.0080 |
| | | all | 25 | acc | 0.7879 | ± 0.0077 |
| oab_exams | 1.5 | all | 3 | acc | 0.5549 | ± 0.0061 |
| | | exam_id__2014-14 | 3 | acc | 0.6375 | ± 0.0311 |
| | | exam_id__2010-01 | 3 | acc | 0.4471 | ± 0.0312 |
| | | exam_id__2016-20a | 3 | acc | 0.5000 | ± 0.0323 |
| | | exam_id__2013-10 | 3 | acc | 0.5750 | ± 0.0318 |
| | | exam_id__2010-02 | 3 | acc | 0.5500 | ± 0.0288 |
| | | exam_id__2011-04 | 3 | acc | 0.5750 | ± 0.0319 |
| | | exam_id__2013-12 | 3 | acc | 0.6000 | ± 0.0316 |
| | | exam_id__2017-22 | 3 | acc | 0.6000 | ± 0.0316 |
| | | exam_id__2015-16 | 3 | acc | 0.5625 | ± 0.0320 |
| | | exam_id__2012-09 | 3 | acc | 0.4416 | ± 0.0326 |
| | | exam_id__2016-19 | 3 | acc | 0.5641 | ± 0.0326 |
| | | exam_id__2017-23 | 3 | acc | 0.5500 | ± 0.0321 |
| | | exam_id__2013-11 | 3 | acc | 0.5500 | ± 0.0321 |
| | | exam_id__2012-08 | 3 | acc | 0.5500 | ± 0.0323 |
| | | exam_id__2011-03 | 3 | acc | 0.5152 | ± 0.0290 |
| | | exam_id__2015-17 | 3 | acc | 0.6538 | ± 0.0310 |
| | | exam_id__2015-18 | 3 | acc | 0.6125 | ± 0.0314 |
| | | exam_id__2014-15 | 3 | acc | 0.6538 | ± 0.0310 |
| | | exam_id__2018-25 | 3 | acc | 0.5750 | ± 0.0320 |
| | | exam_id__2012-06a | 3 | acc | 0.5750 | ± 0.0318 |
| | | exam_id__2017-24 | 3 | acc | 0.4875 | ± 0.0323 |
| | | exam_id__2011-05 | 3 | acc | 0.5250 | ± 0.0323 |
| | | exam_id__2012-06 | 3 | acc | 0.5750 | ± 0.0319 |
| | | exam_id__2016-21 | 3 | acc | 0.4875 | ± 0.0322 |
| | | exam_id__2012-07 | 3 | acc | 0.5000 | ± 0.0323 |
| | | exam_id__2016-20 | 3 | acc | 0.5500 | ± 0.0322 |
| | | exam_id__2014-13 | 3 | acc | 0.5875 | ± 0.0316 |
| portuguese_hate_speech_binary | 1.0 | all | 25 | f1_macro | 0.6954 | ± 0.0114 |
| | | all | 25 | acc | 0.7086 | ± 0.0110 |
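
One way to read the Value and Stderr figures together is as an approximate 95% confidence interval, assuming the estimate is roughly normally distributed (an assumption made here, not a claim from the evaluation output). The sketch below applies that to the overall `enem` accuracy reported above; the function name is illustrative.

```python
def confidence_interval(value: float, stderr: float, z: float = 1.96) -> tuple[float, float]:
    """Return (low, high) for value ± z * stderr under a normal approximation."""
    half_width = z * stderr
    return value - half_width, value + half_width


# Overall enem accuracy from the evals above: 0.7810 ± 0.0063.
low, high = confidence_interval(0.7810, 0.0063)
print(f"enem acc, approx. 95% CI: {low:.4f} to {high:.4f}")
```

Per-exam rows have much larger standard errors than the aggregate rows, so their intervals are correspondingly wider.
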