---
license: apache-2.0
language:
- es
library_name: transformers
tags:
- falcon
- alpaca
- Transformers
- gpt
- PyTorch
- llm
- llm spanish
pipeline_tag: text-generation
datasets:
- bertin-project/alpaca-spanish
---

# FALCON 7B Spanish Fine-tuned 8bit 🤗

**Dataset**

The dataset is a Spanish translation of alpaca_data_cleaned.json (a cleaned version of the Alpaca dataset created at Stanford), produced by bertin-project using OpenAI's gpt-3.5-turbo model. It was translated using a full-sample prompt instead of translating string by string, which resulted in more coherent (instruction, input, output) tuples.

Dataset link: [here](https://huggingface.co/datasets/bertin-project/alpaca-spanish)

**Finetuning details**

To fine-tune the FALCON-7B model we used the [following code](https://github.com/AdrianBZG/LLM-distributed-finetune) to run it on a distributed cluster on AWS. You are free to use that code as a template to fine-tune any model you like, as it is easily customizable.
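As a sketch of how this model could be loaded in 8-bit and prompted, assuming the standard `transformers` + `bitsandbytes` path. Note that the repository id `your-org/falcon-7b-spanish` is a placeholder, and the Spanish Alpaca prompt wording below is an illustrative assumption rather than the exact template used during fine-tuning:

```python
def build_prompt(instruction: str, inp: str = "") -> str:
    """Build an Alpaca-style prompt in Spanish.

    The exact wording used during fine-tuning is not stated in this card;
    this template is an illustrative assumption.
    """
    if inp:
        return (
            "A continuación hay una instrucción que describe una tarea, "
            "junto con una entrada que proporciona más contexto.\n\n"
            f"### Instrucción:\n{instruction}\n\n"
            f"### Entrada:\n{inp}\n\n"
            "### Respuesta:\n"
        )
    return (
        "A continuación hay una instrucción que describe una tarea.\n\n"
        f"### Instrucción:\n{instruction}\n\n"
        "### Respuesta:\n"
    )

if __name__ == "__main__":
    # Heavy imports deferred so the prompt helper is usable on its own.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Hypothetical repository id -- replace with the actual model id.
    repo_id = "your-org/falcon-7b-spanish"
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id,
        device_map="auto",       # spread layers across available devices
        load_in_8bit=True,       # 8-bit weights via bitsandbytes
        trust_remote_code=True,  # Falcon originally shipped custom modeling code
    )
    prompt = build_prompt("Explica qué es el aprendizaje automático.")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, top_p=0.9)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Loading in 8-bit roughly halves the memory needed for a 7B model compared to fp16, at the cost of some generation speed.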