Update README.md
README.md
CHANGED
@@ -48,16 +48,14 @@ language:
 >
 > The weights will be promptly updated as soon as the training process is complete.
 
-#
+# ALIA-40b Model Card
 
-
-sizes — 2B, 7B and 40B parameters — with their respective base and instruction-tuned variants.
-This model card corresponds to the 40B base version.
+ALIA-40b is a highly multilingual model pre-trained from scratch that will come with its respective base and instruction-tuned variants.
 
-To visit the model cards of other
+To visit the model cards of other ALIA versions, please refer to the [Model Index](#model-index).
 
-
-Along with the open weights, all training scripts and configuration files are made publicly available in [this GitHub repository](https://github.com/langtech-bsc/
+This model is released under a permissive [Apache 2.0 license](https://www.apache.org/licenses/LICENSE-2.0).
+Along with the open weights, all training scripts and configuration files are made publicly available in [this GitHub repository](https://github.com/langtech-bsc/alia).
 
 ---
 
@@ -70,7 +68,7 @@ The pre-training corpus contains text in 35 European languages and code.
 
 ### Hyperparameters
 
-The full list of hyperparameters
+The full list of hyperparameters can be found [here](https://github.com/langtech-bsc/alia/blob/main/configs/bsc_40b.yaml).
 
 ### Architecture
 
@@ -145,7 +143,7 @@ This section offers examples of how to perform inference using various methods.
 You'll find different techniques for running inference, including Huggingface's Text Generation Pipeline, multi-GPU configurations, and vLLM for scalable and efficient generation.
 
 #### Inference with Huggingface's Text Generation Pipeline
-The Huggingface Text Generation Pipeline provides a straightforward way to run inference using the
+The Huggingface Text Generation Pipeline provides a straightforward way to run inference using the ALIA-40b model.
 
 ```bash
 pip install transformers torch accelerate sentencepiece protobuf
@@ -156,7 +154,7 @@ pip install transformers torch accelerate sentencepiece protobuf
 ```python
 from transformers import pipeline, set_seed
 
-model_id = "BSC-LT/
+model_id = "BSC-LT/ALIA-40b"
 
 # Sample prompts
 prompts = [
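The diff context cuts the pipeline snippet off after the prompt list. For orientation only, here is a minimal, self-contained sketch of how a complete call with the new model ID could look; the prompts, dtype, and generation settings are illustrative assumptions, not text from the README:

```python
import torch
from transformers import pipeline, set_seed

model_id = "BSC-LT/ALIA-40b"

# Sample prompts (illustrative)
prompts = [
    "El mercat del barri és",
    "The future of renewable energy in Europe",
]

# Build a text-generation pipeline; device_map="auto" shards the 40B weights
# across the available GPUs and bfloat16 keeps the memory footprint manageable.
generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

set_seed(1)  # reproducible sampling

for prompt in prompts:
    outputs = generator(prompt, max_new_tokens=50, do_sample=True, temperature=0.7)
    print(outputs[0]["generated_text"])
```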
@@ -202,7 +200,7 @@ pip install transformers torch accelerate sentencepiece protobuf
 from transformers import AutoTokenizer, AutoModelForCausalLM
 import torch
 
-model_id = "BSC-LT/
+model_id = "BSC-LT/ALIA-40b"
 
 # Input text
 text = "El mercat del barri és"
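Again, only the top of the multi-GPU snippet is visible in the diff. A hedged sketch of how the rest of such a load-and-generate flow typically looks with `AutoModelForCausalLM`; the device placement and generation arguments are assumptions, not taken from the card:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "BSC-LT/ALIA-40b"

# Input text
text = "El mercat del barri és"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision weights to fit the 40B model
    device_map="auto",           # spread layers across all visible GPUs
)

# Tokenize on the model's device and generate a short greedy completion
inputs = tokenizer(text, return_tensors="pt").to(model.device)
with torch.no_grad():
    generated = model.generate(**inputs, max_new_tokens=25, do_sample=False)

print(tokenizer.decode(generated[0], skip_special_tokens=True))
```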
@@ -246,7 +244,7 @@ pip install vllm
 ```python
 from vllm import LLM, SamplingParams
 
-model_id = "BSC-LT/
+model_id = "BSC-LT/ALIA-40b"
 
 # Sample prompts
 prompts = [
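The vLLM snippet is likewise truncated by the diff context. A sketch of how the remainder of a vLLM generation script commonly looks; `tensor_parallel_size` and the sampling values are placeholders to adapt to your hardware, not values from the README:

```python
from vllm import LLM, SamplingParams

model_id = "BSC-LT/ALIA-40b"

# Sample prompts (illustrative)
prompts = [
    "El mercat del barri és",
    "The capital of Galicia is",
]

# Sampling configuration; tune to taste
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=50)

# tensor_parallel_size splits the 40B model across GPUs; adjust to your setup
llm = LLM(model=model_id, tensor_parallel_size=4)

outputs = llm.generate(prompts, sampling_params)
for output in outputs:
    print(f"{output.prompt!r} -> {output.outputs[0].text!r}")
```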
@@ -444,7 +442,7 @@ We provide an extense Datasheet section following the best practices defined by
 
 **For what purpose was the dataset created? Was there a specific task in mind? Was there a specific gap that needed to be filled? Please provide a description.**
 
-The purpose of creating this dataset is to pre-train
+The purpose of creating this dataset is to pre-train a family of multilingual models with high performance in a large number of
 European languages (35) and code (including 92 different programming languages). In addition, we aim to represent especially the co-official
 languages of Spain: Spanish, Catalan, Galician, and Basque. This is the reason why we carry out an oversampling of these languages.
 
@@ -630,7 +628,7 @@ and the [Ungoliant](https://github.com/oscar-project/ungoliant) pipeline was use
 
 **Has the dataset been used for any tasks already? If so, please provide a description.**
 
-Pre-train the Salamandra model family.
+Pre-train the ALIA model and the Salamandra model family.
 
 **What (other) tasks could the dataset be used for?**
 
@@ -1066,4 +1064,4 @@ Technical report coming soon.
 |:---:|:---:|:---:|
 |2B| [Link](https://huggingface.co/BSC-LT/salamandra-2b) | [Link](https://huggingface.co/BSC-LT/salamandra-2b-instruct) |
 |7B| [Link](https://huggingface.co/BSC-LT/salamandra-7b) | [Link](https://huggingface.co/BSC-LT/salamandra-7b-instruct) |
-|40B| [Link](https://huggingface.co/BSC-LT/
+|40B| [Link](https://huggingface.co/BSC-LT/ALIA-40b) | WiP |