---
library_name: transformers
tags:
- text-to-sql
- falcon
- lora
- sql-generation
- natural-language-to-sql
- huggingface
---
# Model Card for Falcon SQL Generator (LoRA Fine-Tuned)
A lightweight Falcon-1B model fine-tuned with LoRA on Spider-style SQL generation examples. The model takes a user query together with schema context and generates the corresponding SQL query. It supports few-shot prompting and can be integrated into retrieval-based systems.
## Model Details
### Model Description
This model is a fine-tuned version of `tiiuae/falcon-rw-1b` using Parameter-Efficient Fine-Tuning (LoRA) for the text-to-SQL task. It is trained on custom Spider-style examples that map natural language questions to valid SQL queries over a provided schema context.
- **Developed by:** revanth kumar
- **Finetuned by:** revanth kumar
- **Model type:** Causal Language Model (`AutoModelForCausalLM`)
- **Language(s):** English (natural language input) and SQL (structured query output)
- **License:** Apache 2.0 (inherits from base Falcon model)
- **Finetuned from:** `tiiuae/falcon-rw-1b`
### Model Sources
- **Repository:** [https://huggingface.co/revanthkumarg/falcon-sql-lora](https://huggingface.co/revanthkumarg/falcon-sql-lora)
## Uses
### Direct Use
This model can be directly used for natural language to SQL generation. It supports queries like:
> "List all employees earning more than 50000"
`SELECT * FROM employees WHERE salary > 50000;`
It is suitable for:
- Low-code/no-code query interfaces
- Data analyst assistant tools
- SQL tutoring bots
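Since the model expects both the question and the schema context in its prompt, a small helper can assemble the two. The template below is an illustrative assumption, not the documented training format; adjust it to match whatever format your examples used.

```python
def build_prompt(schema: str, question: str) -> str:
    """Combine schema context and a natural-language question into one prompt.

    The section markers here are hypothetical; match them to the format
    the model was actually fine-tuned on.
    """
    return (
        f"### Schema:\n{schema}\n"
        f"### Question:\n{question}\n"
        f"### SQL:\n"
    )

schema = "employees(id INT, name TEXT, salary INT)"
prompt = build_prompt(schema, "List all employees earning more than 50000")
```

Ending the prompt at the `### SQL:` marker cues the model to continue with the query itself.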
### Downstream Use
This model can be integrated into retrieval-augmented systems or further fine-tuned on enterprise-specific schemas and query examples.
### Out-of-Scope Use
- It may not generalize well to highly complex, nested, or ambiguous queries.
- It should not be used in production environments involving sensitive financial or health data without further validation.
## Bias, Risks, and Limitations
- The model may generate incorrect or suboptimal SQL if the prompt is ambiguous or the schema context is incomplete.
- It inherits any limitations or biases from the Falcon base model.
### Recommendations
Users should validate generated queries before executing them on production databases. Schema consistency and prompt clarity are key to good performance.
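One lightweight way to validate generated queries before they touch a real database is to dry-run them against an in-memory SQLite copy of the schema. This sketch assumes SQLite-compatible SQL and a DDL string you supply; it only checks that the query prepares successfully, not that it is semantically correct.

```python
import sqlite3

def is_valid_sql(query: str, schema_ddl: str) -> bool:
    """Check that a generated query prepares against the schema.

    Uses EXPLAIN on an in-memory database so no real data is touched.
    Returns False for queries referencing missing tables/columns or
    containing syntax errors.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        conn.execute(f"EXPLAIN {query}")
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

schema_ddl = "CREATE TABLE employees (id INTEGER, name TEXT, salary INTEGER);"
is_valid_sql("SELECT * FROM employees WHERE salary > 50000;", schema_ddl)  # True
is_valid_sql("SELECT * FROM missing_table;", schema_ddl)                   # False
```

A passing check still does not guarantee the query answers the user's question, so human review remains advisable for consequential workloads.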
## How to Get Started with the Model
You can load and run the model using the Hugging Face `transformers` library:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("revanthkumarg/falcon-sql-lora")
tokenizer = AutoTokenizer.from_pretrained("tiiuae/falcon-rw-1b")

# For best results, include the schema context in the prompt.
prompt = "List all employees earning more than 50000:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```