---
|
library_name: transformers |
|
tags: |
|
- text-to-sql |
|
- falcon |
|
- lora |
|
- sql-generation |
|
- natural-language-to-sql |
|
- huggingface |
|
--- |
|
|
|
# Model Card for Falcon SQL Generator (LoRA Fine-Tuned) |
|
|
|
A lightweight Falcon-1B model fine-tuned with LoRA on Spider-style text-to-SQL examples. Given a natural language question and schema context, the model generates the corresponding SQL query. It supports few-shot prompting and can be integrated with retrieval-based systems.
|
|
|
## Model Details |
|
|
|
### Model Description |
|
|
|
This model is a fine-tuned version of `tiiuae/falcon-rw-1b`, adapted for the text-to-SQL task with LoRA, a parameter-efficient fine-tuning (PEFT) method. It is trained on custom Spider-style examples that map natural language questions to valid SQL queries over a provided schema context.
|
|
|
- **Developed by:** Revanth Kumar

- **Finetuned by:** Revanth Kumar
|
- **Model type:** Causal Language Model (`AutoModelForCausalLM`) |
|
- **Language(s):** English (natural language input) and SQL (structured query output) |
|
- **License:** Apache 2.0 (inherits from base Falcon model) |
|
- **Finetuned from:** `tiiuae/falcon-rw-1b` |
|
|
|
### Model Sources |
|
|
|
- **Repository:** [https://huggingface.co/revanthkumarg/falcon-sql-lora](https://huggingface.co/revanthkumarg/falcon-sql-lora) |
|
|
|
## Uses |
|
|
|
### Direct Use |
|
|
|
This model can be directly used for natural language to SQL generation. It supports queries like: |
|
|
|
> "List all employees earning more than 50000" |
|
→ `SELECT * FROM employees WHERE salary > 50000;` |
|
|
|
It is suitable for: |
|
- Low-code/no-code query interfaces |
|
- Data analyst assistant tools |
|
- SQL tutoring bots |
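
Since the model expects both a schema context and a question, prompts need a consistent layout. The exact template used during fine-tuning is not documented here, so the helper below is an illustrative assumption (the `### Schema / ### Question / ### SQL` headings are hypothetical), not the canonical format:

```python
def build_prompt(schema: str, question: str) -> str:
    """Combine schema context and a question into a single prompt.

    The section headings below are an illustrative layout, not the
    documented training format for this model.
    """
    return f"### Schema:\n{schema}\n\n### Question:\n{question}\n\n### SQL:\n"


prompt = build_prompt(
    "CREATE TABLE employees (id INT, name TEXT, salary INT);",
    "List all employees earning more than 50000",
)
print(prompt)
```

Ending the prompt with an SQL cue encourages the model to continue with a query rather than more prose; for few-shot prompting, several completed schema/question/SQL examples can be concatenated before the final question.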
|
|
|
### Downstream Use |
|
|
|
This model can be integrated into retrieval-augmented systems, or further fine-tuned on enterprise-specific schema and query examples. |
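
Further fine-tuning on enterprise schemas can reuse the same LoRA setup. The following is a minimal configuration sketch, assuming the Hugging Face `peft` library; the hyperparameters shown are illustrative defaults, not the values used to train this model:

```python
# Illustrative LoRA configuration for continued fine-tuning with peft.
# Hyperparameters here are common defaults, not this model's settings;
# "query_key_value" is the name of Falcon's fused attention projection.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("tiiuae/falcon-rw-1b")
lora_cfg = LoraConfig(
    r=8,                 # low-rank adapter dimension
    lora_alpha=16,       # adapter scaling factor
    lora_dropout=0.05,
    target_modules=["query_key_value"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()
# ...then train with transformers.Trainer on (schema, question, SQL) examples.
```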
|
|
|
### Out-of-Scope Use |
|
|
|
- The model may not generalize well to highly complex, nested, or ambiguous queries.

- It should not be used in production environments involving sensitive financial or health data without further validation.
|
|
|
## Bias, Risks, and Limitations |
|
|
|
- The model may generate incorrect or suboptimal SQL if the prompt is ambiguous or the schema context is incomplete. |
|
- It inherits any limitations or biases from the Falcon base model. |
|
|
|
### Recommendations |
|
|
|
Users should validate generated queries before executing them on production databases. Schema consistency and prompt clarity are key to good performance. |
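
One cheap first-pass validation, sketched below, is to dry-run each generated query against an in-memory SQLite copy of the schema using `EXPLAIN`, which compiles the statement without executing it. This is an illustrative check, not a full safeguard: SQLite's dialect differs from engines such as PostgreSQL, and semantic correctness still requires review.

```python
import sqlite3


def is_valid_sql(schema_ddl: str, query: str) -> bool:
    """Dry-run a generated query against an in-memory SQLite copy of
    the schema. EXPLAIN compiles the statement without executing it,
    so syntax errors and unknown tables/columns are caught safely.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        conn.execute(f"EXPLAIN {query}")
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()


schema = "CREATE TABLE employees (id INT, name TEXT, salary INT);"
print(is_valid_sql(schema, "SELECT * FROM employees WHERE salary > 50000"))  # True
print(is_valid_sql(schema, "SELECT * FROM emplyees"))  # False: unknown table
```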
|
|
|
## How to Get Started with the Model |
|
|
|
You can load and run the model using the Hugging Face `transformers` library: |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# If the repository stores merged weights, this loads directly; if it
# stores only a LoRA adapter, load tiiuae/falcon-rw-1b first and attach
# the adapter with peft's PeftModel.from_pretrained instead.
model = AutoModelForCausalLM.from_pretrained("revanthkumarg/falcon-sql-lora")
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-rw-1b")

prompt = "List all employees earning more than 50000:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```