Yuk050 committed · Commit 36d6420 · verified · Parent(s): ec89761

Update README.md

Files changed (1):
  1. README.md (+115 −3)
README.md CHANGED
```diff
@@ -1,12 +1,124 @@
 ---
-license: gemma
+pretty_name: Gemma 3 1B Text-to-SQL Finetuned
 datasets:
 - gretelai/synthetic_text_to_sql
 language:
 - en
 base_model:
 - google/gemma-3-1b-it
 pipeline_tag: text2text-generation
+tags:
+- text-to-sql
+- qlora
+- finetuned
+- gemma
+- llm
+- sql
+license: gemma
 ---
 
-Test
```

The remaining additions replace the `Test` placeholder with the full model card body:

# Model Card: Gemma 3 1B Text-to-SQL Finetuned

## Model Description

This model is a finetuned version of the `google/gemma-3-1b-it` large language model, specifically adapted for the text-to-SQL task. It leverages Quantized Low-Rank Adaptation (QLoRA) for efficient finetuning, making it suitable for deployment and inference on systems with limited computational resources.

The primary function of this model is to translate natural language questions and provided database schemas into executable SQL queries. This capability is crucial for applications requiring natural language interaction with databases, such as business intelligence tools, data analysis platforms, and conversational AI agents.

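The following is a minimal inference sketch using the standard `transformers` API. The repository id is a placeholder (the card does not state one), and the sketch assumes the LoRA adapter was merged into the base weights before upload:

```python
# Hedged sketch: generate SQL from the finetuned model with transformers.
# "Yuk050/gemma-3-1b-text-to-sql" is a hypothetical repo id; substitute the
# real one. If the adapter was published unmerged, load it with peft instead.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Yuk050/gemma-3-1b-text-to-sql"  # placeholder, not confirmed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# The prompt mirrors the training format described under "Training Data".
prompt = (
    "Given the following database schema:\n\n"
    "CREATE TABLE Employees (id INT, name VARCHAR(255), salary INT);\n\n"
    "Generate the SQL query for: Select all employees with salary greater than 50000"
)
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": prompt}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
output = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```
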
## Intended Use

This model is intended for research and development purposes related to text-to-SQL generation. It can be used to:

* Generate SQL queries from natural language prompts and database schemas.
* Serve as a component in larger systems that require natural language interaction with databases.
* Further research into efficient finetuning techniques for large language models.

### Out-of-Scope Use Cases

This model is not intended for:

* Generating SQL queries for highly sensitive or mission-critical systems without thorough validation and human oversight.
* Deployment in production environments without rigorous testing and adherence to security best practices.
* Generating SQL for databases with unknown or complex schemas without proper adaptation and training.

## Training Data

The model was finetuned on the `gretelai/synthetic_text_to_sql` dataset (see the Citation section below). This dataset is a high-quality, synthetically generated collection of text-to-SQL samples. Key characteristics of the dataset include:

* **Size**: 105,851 records (100,000 for training, 5,851 for testing).
* **Content**: Each record includes a natural language prompt (`sql_prompt`), the database schema (`sql_context`, as `CREATE TABLE` statements), the corresponding SQL query (`sql`), and an explanation of the query (`sql_explanation`).
* **Diversity**: Covers 100 distinct domains/verticals and a wide range of SQL complexity levels (e.g., aggregations, joins, subqueries, window functions).

The training data was transformed into a conversational format, where the user provides the database schema and natural language query, and the assistant responds with the SQL query. An example of the input format is:

```
Given the following database schema:

CREATE TABLE Employees (id INT, name VARCHAR(255), salary INT);

Generate the SQL query for: Select all employees with salary greater than 50000
```

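A preprocessing sketch consistent with this format, using the dataset's documented column names (`sql_prompt`, `sql_context`, `sql`); the exact template used for this training run is an assumption:

```python
# Hedged sketch: map gretelai/synthetic_text_to_sql records into the
# conversational format shown above.
from datasets import load_dataset

def to_conversation(example):
    user_turn = (
        "Given the following database schema:\n\n"
        f"{example['sql_context']}\n\n"
        f"Generate the SQL query for: {example['sql_prompt']}"
    )
    return {
        "messages": [
            {"role": "user", "content": user_turn},
            {"role": "assistant", "content": example["sql"]},
        ]
    }

train_data = load_dataset("gretelai/synthetic_text_to_sql", split="train")
train_data = train_data.map(to_conversation)
```
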
## Training Procedure

The model was finetuned using the QLoRA technique, implemented with the `unsloth` library, and trained with the `SFTTrainer` from the `trl` library. The key settings are listed below, followed by a configuration sketch.

**Base Model**: `google/gemma-3-1b-it`

**Finetuning Parameters (QLoRA)**:

* **LoRA Rank (`r`)**: 16
* **LoRA Alpha (`lora_alpha`)**: 16
* **LoRA Dropout (`lora_dropout`)**: 0.05
* **Bias**: `none`
* **Target Modules**: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`
* **Gradient Checkpointing**: Enabled (`unsloth` optimized)

**Training Arguments (`SFTTrainer`)**:

* **Per Device Train Batch Size**: 2
* **Gradient Accumulation Steps**: 4
* **Warmup Steps**: 5
* **Max Steps**: 100 (can be adjusted for full dataset training)
* **Learning Rate**: 2e-4
* **Optimizer**: `adamw_8bit`
* **Precision**: `bf16` (if supported by the GPU), otherwise `fp16`

## Model Architecture

The Gemma 3 1B model is a decoder-only transformer architecture. During QLoRA finetuning, low-rank adapters are injected into the specified layers, allowing for efficient training by only updating a small fraction of the model's parameters while keeping the majority of the pre-trained weights frozen in 4-bit quantized form.

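As general background (the standard LoRA formulation, not specific to this card): each targeted weight matrix $W_0$ stays frozen while a trainable low-rank pair $(B, A)$ is learned, so the adapted layer computes

$$
h = W_0 x + \frac{\alpha}{r} B A x, \qquad W_0 \in \mathbb{R}^{d \times k}\ \text{(frozen, 4-bit)}, \quad B \in \mathbb{R}^{d \times r}, \quad A \in \mathbb{R}^{r \times k}.
$$

With this card's settings ($r = 16$, $\alpha = 16$), the scaling factor $\alpha / r$ equals 1, and only $A$ and $B$ receive gradient updates.
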
## Performance and Limitations

Due to the synthetic nature of the training data, the model's performance on real-world, noisy, or highly complex database schemas may vary. It is recommended to perform further evaluation and potentially finetune on domain-specific data for production use cases.

**Limitations include:**

* **Schema Complexity**: May struggle with highly intricate database schemas or those with ambiguous column names.
* **Natural Language Ambiguity**: Performance can be affected by ambiguous or underspecified natural language queries.
* **SQL Dialect**: Primarily trained on standard SQL syntax; may require further adaptation for specific SQL dialects (e.g., PostgreSQL, MySQL, SQL Server).

## Environmental Impact

Finetuning with QLoRA significantly reduces the computational resources and energy consumption compared to full finetuning. The specific energy consumption for this finetuning run would depend on the hardware used and the duration of training.

## Citation

If you use this model or the finetuning approach, please consider citing the original Gemma model and the `gretelai/synthetic_text_to_sql` dataset:

```bibtex
@article{gemma2024,
  author = {Google},
  title  = {Gemma: A Family of Lightweight, State-of-the-Art Open Models},
  year   = {2024},
  url    = {https://ai.google.dev/gemma}
}

@software{gretel-synthetic-text-to-sql-2024,
  author = {Meyer, Yev and Emadi, Marjan and Nathawani, Dhruv and Ramaswamy, Lipika and Boyd, Kendrick and Van Segbroeck, Maarten and Grossman, Matthew and Mlocek, Piotr and Newberry, Drew},
  title  = {{Synthetic-Text-To-SQL}: A synthetic dataset for training language models to generate SQL queries from natural language prompts},
  month  = {April},
  year   = {2024},
  url    = {https://huggingface.co/datasets/gretelai/synthetic-text-to-sql}
}
```