---
base_model: mistralai/Mistral-7B-Instruct-v0.3
datasets:
- generator
library_name: peft
license: apache-2.0
tags:
- trl
- sft
- generated_from_trainer
model-index:
- name: Mistral-7B-text-to-sql-flash-attention-2-dataeval
  results: []
---

# Mistral-7B-text-to-sql-flash-attention-2-dataeval

This model is a fine-tuned version of [mistralai/Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3) on the generator dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4605

Reported perplexity: 10.40

Perplexity measures how uncertain, or "surprised", the model is about its predictions. It is derived from the probabilities the model assigns to tokens; lower values mean the model assigns higher probability to the observed text.

Perplexity references:
- https://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf
- https://medium.com/@AyushmanPranav/perplexity-calculation-in-nlp-0699fbda4594

The perplexity of 10.40 suggests that the fine-tuned Mistral-7B model has learned the general patterns of natural language and SQL syntax in the dataset. Perplexity alone does not show whether the generated queries are correct, so further evaluation with task-specific metrics is necessary to assess the model's effectiveness in real-world scenarios. Combining quantitative metrics such as perplexity with qualitative analysis of generated queries gives a fuller picture of the model's strengths and weaknesses and can guide improvements toward more reliable text-to-SQL translation.
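As a rough cross-check of the two figures above, here is a minimal sketch that assumes the common definition of perplexity as the exponential of the mean per-token cross-entropy (natural log). The helper names are illustrative and do not come from the training notebook. Under that assumption a loss of 0.4605 corresponds to a perplexity of about 1.58, while a perplexity of 10.40 corresponds to a loss of about 2.34, so the reported perplexity was presumably computed on different data or with a different procedure than the validation loss; the card does not say which.

```python
import math


def perplexity_from_loss(mean_nll: float) -> float:
    """Perplexity as exp of the mean per-token negative log-likelihood (in nats)."""
    return math.exp(mean_nll)


def loss_from_perplexity(ppl: float) -> float:
    """Inverse mapping: the mean per-token loss implied by a given perplexity."""
    return math.log(ppl)


eval_loss = 0.4605     # validation loss reported in this card
reported_ppl = 10.40   # perplexity reported in this card

# Perplexity implied by the validation loss, and loss implied by the reported perplexity.
ppl_from_eval_loss = perplexity_from_loss(eval_loss)
loss_implied_by_ppl = loss_from_perplexity(reported_ppl)

print(f"perplexity implied by loss {eval_loss}: {ppl_from_eval_loss:.3f}")
print(f"loss implied by perplexity {reported_ppl}: {loss_implied_by_ppl:.3f}")
```

If the perplexity is meant to describe the same evaluation set as the loss, recomputing it from the per-token loss in the evaluation notebook would settle which number applies.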
Dataset: [b-mc2/sql-create-context](https://huggingface.co/datasets/b-mc2/sql-create-context)

## Model description

Article: https://medium.com/@frankmorales_91352/fine-tuning-the-llm-mistral-7b-instruct-v0-3-249c1814ceaf

## Training and evaluation data

- Fine-tuning and evaluation notebook: https://github.com/frank-morales2020/MLxDL/blob/main/FineTuning_LLM_Mistral_7B_Instruct_v0_1_for_text_to_SQL_EVALDATA.ipynb
- Evaluation notebook: https://github.com/frank-morales2020/MLxDL/blob/main/Evaluator_Mistral_7B_text_to_sql.ipynb
- Evaluation article with ChromaDB: https://medium.com/@frankmorales_91352/a-comprehensive-evaluation-of-a-fine-tuned-text-to-sql-model-from-code-to-results-with-7ea59943b0a1
- Evaluation article with ChromaDB, PostgreSQL and the “gretelai/synthetic_text_to_sql” dataset: https://medium.com/@frankmorales_91352/evaluating-the-performance-of-a-fine-tuned-text-to-sql-model-6b7d61dcfef5

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 3
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 24
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: constant
- lr_scheduler_warmup_ratio: 0.03
- lr_scheduler_warmup_steps: 15
- num_epochs: 3

The corresponding `TrainingArguments`:

```python
from transformers import TrainingArguments

# access_token_write (a Hugging Face token with write access) must be defined before this call.
args = TrainingArguments(
    output_dir="Mistral-7B-text-to-sql-flash-attention-2-dataeval",
    num_train_epochs=3,              # number of training epochs
    per_device_train_batch_size=3,   # batch size per device during training
    gradient_accumulation_steps=8,   # micro-batches accumulated before each optimizer update
    gradient_checkpointing=True,     # use gradient checkpointing to save memory
    optim="adamw_torch_fused",       # use fused AdamW optimizer
    logging_steps=10,                # log every 10 steps
    learning_rate=2e-4,              # learning rate, based on QLoRA paper
    bf16=True,                       # use bfloat16 precision
    tf32=True,                       # use tf32 precision
    max_grad_norm=0.3,               # max gradient norm, based on QLoRA paper
    warmup_ratio=0.03,               # warmup ratio, based on QLoRA paper
    weight_decay=0.01,
    lr_scheduler_type="constant",    # constant learning rate (the plain constant schedule applies no warmup)
    push_to_hub=True,                # push model to hub
    report_to="tensorboard",         # report metrics to tensorboard
    hub_token=access_token_write,    # token used for pushing to the hub
    load_best_model_at_end=True,     # reload the best checkpoint when training ends
    logging_dir="/content/drive/MyDrive/model/Mistral-7B-text-to-sql-flash-attention-2-dataeval/logs",
    evaluation_strategy="steps",     # evaluate every eval_steps
    eval_steps=10,
    save_strategy="steps",           # save every save_steps (must align with eval_steps for load_best_model_at_end)
    save_steps=10,
    metric_for_best_model="loss",    # lower validation loss is better
    warmup_steps=15,                 # overrides warmup_ratio when > 0
)
```

### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.8612        | 0.4020 | 10   | 0.6092          |
| 0.5849        | 0.8040 | 20   | 0.5307          |
| 0.4937        | 1.2060 | 30   | 0.4887          |
| 0.4454        | 1.6080 | 40   | 0.4670          |
| 0.4250        | 2.0101 | 50   | 0.4544          |
| 0.3498        | 2.4121 | 60   | 0.4717          |
| 0.3439        | 2.8141 | 70   | 0.4605          |

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 2.3.0+cu121
- Datasets 2.20.0
- Tokenizers 0.19.1
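
The numbers in the hyperparameter list and the training results table above can be checked with a few lines of arithmetic. The sketch below copies the table rows by hand and assumes a single training device (an assumption, though it is consistent with the listed `total_train_batch_size` of 24). It shows that the lowest validation loss occurs at step 50 (0.4544), while the headline loss of 0.4605 matches the final row at step 70. Since the configuration sets `load_best_model_at_end=True` with `metric_for_best_model="loss"`, the step-50 checkpoint is the one the Trainer would be expected to reload at the end of training.

```python
# Rows copied from the training results table:
# (step, epoch, training_loss, validation_loss)
log = [
    (10, 0.4020, 1.8612, 0.6092),
    (20, 0.8040, 0.5849, 0.5307),
    (30, 1.2060, 0.4937, 0.4887),
    (40, 1.6080, 0.4454, 0.4670),
    (50, 2.0101, 0.4250, 0.4544),
    (60, 2.4121, 0.3498, 0.4717),
    (70, 2.8141, 0.3439, 0.4605),
]

per_device_train_batch_size = 3
gradient_accumulation_steps = 8
num_devices = 1  # assumption: single GPU, not stated in the card

# Examples consumed per optimizer update.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_devices

# Checkpoint with the lowest validation loss versus the last evaluated checkpoint.
best_step, best_epoch, _, best_val = min(log, key=lambda row: row[3])
last_step, _, _, last_val = log[-1]

print(f"effective batch size: {effective_batch}")
print(f"best validation loss: {best_val} at step {best_step} (epoch {best_epoch})")
print(f"last validation loss: {last_val} at step {last_step}")
```

Training loss keeps falling after step 50 while validation loss rises and then partly recovers, which is the usual sign that later steps mostly fit the training set more closely rather than improve generalization.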