Update README.md
README.md
CHANGED
@@ -7,23 +7,17 @@ license: apache-2.0
 tags:
 - spanish
 - catalan
--
+- aguila-7b
 datasets:
 - BSC-LT/open_data_26B_tokens_balanced_es_ca
 metrics:
 - ppl
 model-index:
-- name:
+- name: aguila_7b
   results:
   - task:
       name: Causal Language Modeling
       type: text-generation
-    dataset:
-      name: BSC-LT/open_data_26B_tokens_balanced_es_ca
-      type: Causal Language Modeling
-      config: default
-      split: validation
-      args: default
     metrics:
     - name: Perplexity
       type: ppl
@@ -71,7 +65,7 @@ license: apache-2.0
 pipeline_tag: text-generation
 ---
 
-# 
+# Ǎguila-7B
 
 ## Table of Contents
 <details>
@@ -97,12 +91,12 @@ pipeline_tag: text-generation
 
 ## Model description
 
-The 
+The **Ǎguila-7B** is a transformer-based causal language model for Catalan, Spanish, and English. It is based on the [Falcon-7B](https://huggingface.co/tiiuae/falcon-7b) model and has been trained on a 26B token trilingual corpus collected from publicly available corpora and crawlers.
 
 
 ## Intended uses and limitations
 
-The 
+The **Ǎguila-7B** model is ready-to-use only for causal language modeling to perform text-generation tasks. However, it is intended to be fine-tuned on a generative downstream task.
 
 ## How to use
 
@@ -114,7 +108,7 @@ import transformers
 from transformers import AutoTokenizer, AutoModelForCausalLM
 
 input_text = "Maria y Miguel no tienen ningún "
-model = "
+model = "projecte-aina/aguila-7b"
 tokenizer = AutoTokenizer.from_pretrained(model)
 
 pipeline = transformers.pipeline(
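The "How to use" hunk is truncated at the opening of the `transformers.pipeline(` call, so the trailing arguments are not part of this commit. A minimal sketch of how the snippet might be completed, assuming a standard `text-generation` pipeline; the `generate` wrapper, task string, and generation parameters are illustrative assumptions, not content shown in the diff:

```python
input_text = "Maria y Miguel no tienen ningún "
model = "projecte-aina/aguila-7b"

def generate(prompt: str, max_new_tokens: int = 20) -> str:
    """Complete the prompt with the model via a text-generation pipeline.

    Note: the first call downloads the full 7B-parameter checkpoint,
    which requires network access and substantial RAM/VRAM.
    """
    # Imports are deferred so that merely loading this sketch does not
    # pull in the model weights.
    import transformers
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model)
    pipeline = transformers.pipeline(
        "text-generation",
        model=model,
        tokenizer=tokenizer,
    )
    return pipeline(prompt, max_new_tokens=max_new_tokens)[0]["generated_text"]
```

Called as `generate(input_text)`, this would return the prompt extended with up to 20 generated tokens.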