---
library_name: peft
base_model: codellama/CodeLlama-7b-Instruct-hf
---
**Lloro 7B**

Lloro, developed by Semantix Research Labs, is a language model trained to perform data analysis in Portuguese. It is a fine-tuned version of codellama/CodeLlama-7b-Instruct-hf, trained on synthetic datasets. Fine-tuning was performed with the QLoRA methodology on a single NVIDIA V100 GPU with 16 GB of memory.

**Model description**

- Model type: a 7B-parameter model fine-tuned on synthetic datasets.
- Language(s) (NLP): primarily Portuguese, though the model understands English as well.
- Finetuned from model: codellama/CodeLlama-7b-Instruct-hf

**What is Lloro's intended use?**

Lloro is built for data analysis in Portuguese contexts.

- Input: text
- Output: text (code)
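
A minimal usage sketch follows. This card does not state where the adapter weights are published, so `adapter_path` below is a placeholder, and the CodeLlama-Instruct `[INST] ... [/INST]` prompt format is assumed:

```python
# Minimal usage sketch. adapter_path is hypothetical: replace it with the
# actual location of the Lloro LoRA adapter weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "codellama/CodeLlama-7b-Instruct-hf"
adapter_path = "path/to/lloro-adapter"  # placeholder, not published in this card

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_path)

# CodeLlama-Instruct chat format: [INST] ... [/INST]
prompt = "[INST] Carregue o arquivo vendas.csv com pandas e mostre as 5 primeiras linhas. [/INST]"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```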

**Params**

Training parameters:

| Params | Training Data                    | Examples | Tokens    | LR   |
|--------|----------------------------------|----------|-----------|------|
| 7B     | Synthetic instruction/code pairs | 28,907   | 3,031,188 | 1e-5 |

**Model Sources**

Repository: https://gitlab.com/semantix-labs/generative-ai/lloro
Dataset repository: https://gitlab.com/semantix-labs/generative-ai/lloro-datasets

**Model Dates**

Lloro was trained between November 2023 and January 2024.

**Performance**

| Model         | LLM as Judge | CodeBLEU | ROUGE-L | CodeBERT Precision | CodeBERT Recall | CodeBERT F1 | CodeBERT F3 |
|---------------|--------------|----------|---------|--------------------|-----------------|-------------|-------------|
| GPT-3.5       | 91.22%       | 0.2745   | 0.2189  | 0.7502             | 0.7146          | 0.7303      | 0.7175      |
| Instruct-Base | 97.40%       | 0.2487   | 0.1146  | 0.6997             | 0.6473          | 0.6713      | 0.6518     |
| Instruct-FT   | 97.76%       | 0.3264   | 0.3602  | 0.7942             | 0.8178          | 0.8042      | 0.8147      |
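
For reference, a ROUGE-L figure like the one above can be computed with the `evaluate` library. This is only an illustrative sketch; the card does not publish its evaluation script, and the prediction/reference strings below are made up:

```python
# Illustrative ROUGE-L computation; not the card's actual evaluation script.
import evaluate

rouge = evaluate.load("rouge")
predictions = ["df = pd.read_csv('vendas.csv')\ndf.head()"]        # example model output
references = ["df = pd.read_csv('vendas.csv')\nprint(df.head())"]  # example ground truth

scores = rouge.compute(predictions=predictions, references=references)
print(scores["rougeL"])
```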

**Training Info**

The following hyperparameters were used during training:

| Parameter | Value |
|---------------------------|----------------------|
| learning_rate | 1e-5 |
| weight_decay | 0.0001 |
| train_batch_size | 1 |
| eval_batch_size | 1 |
| seed | 42 |
| optimizer                 | paged_adamw_32bit (paged 32-bit AdamW) |
| lr_scheduler_type | cosine |
| lr_scheduler_warmup_ratio | 0.03 |
| num_epochs | 5.0 |
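
As a sketch, these hyperparameters map onto `transformers.TrainingArguments` roughly as follows. The `output_dir` is a placeholder, and the original training script is not shown in this card:

```python
# Sketch of the training arguments implied by the table above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="lloro-7b-qlora",  # placeholder output path
    learning_rate=1e-5,
    weight_decay=0.0001,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    optim="paged_adamw_32bit",    # paged 32-bit AdamW from bitsandbytes
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    num_train_epochs=5.0,
)
```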

**QLoRA hyperparameters**

The following parameters related to Quantized Low-Rank Adaptation (QLoRA) and quantization were used during training:

| Parameter     | Value     |
|---------------|-----------|
| lora_r        | 16        |
| lora_alpha    | 64        |
| lora_dropout  | 0.1       |
| storage_dtype | "nf4"     |
| compute_dtype | "float16" |
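
As a sketch, these values correspond to the following bitsandbytes and PEFT configuration, a standard QLoRA setup. The `target_modules` list is an assumption, since the card does not specify which modules the adapters were attached to:

```python
# Sketch of a QLoRA setup matching the values in the table above.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",             # storage_dtype from the table
    bnb_4bit_compute_dtype=torch.float16,  # compute_dtype from the table
)

model = AutoModelForCausalLM.from_pretrained(
    "codellama/CodeLlama-7b-Instruct-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=64,
    lora_dropout=0.1,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # assumption: not listed in this card
)
model = get_peft_model(model, lora_config)
```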

**Experiments**

| Model               | Epochs | Overfitting | Final Epochs | Training Hours | CO₂ Emission (kg) |
|---------------------|--------|-------------|--------------|----------------|-------------------|
| Code Llama Instruct | 1      | No          | 1            | 8.1            | 1.337             |
| Code Llama Instruct | 5      | Yes         | 3            | 45.6           | 9.12              |

**Framework versions**

| Library      | Version |
|--------------|---------|
| bitsandbytes | 0.40.2  |
| Datasets     | 2.14.3  |
| PyTorch      | 2.0.1   |
| Tokenizers   | 0.14.1  |
| Transformers | 4.34.0  |