TinyLlama-7B (Fine-tuned for Text-to-SQL)
TinyLlama-7B is a fine-tuned version of the Llama-2 7B model, trained specifically for Text-to-SQL tasks. It translates natural-language questions into structured SQL queries, which suits applications that need to interact with a database from natural-language instructions. At 7 billion parameters, it is smaller than the larger Llama-2 variants (13B and 70B), offering a balance between performance and efficiency for environments with limited computational resources.
Model Details
- Model Name: TinyLlama-7B (Fine-tuned for Text-to-SQL)
- Base Model: Llama-2 7B
- Model Type: Fine-tuned Transformer-based language model
- Parameter Size: 7 billion parameters
- Fine-tuning Tasks: Text-to-SQL, Natural Language to Structured Query Translation
- License: Custom commercial license (refer to the license of the original Llama-2 model)
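To make the parameter count concrete, the arithmetic below estimates how much memory the weights alone would occupy at a few common numeric precisions. This is a rough sketch: the bytes-per-parameter table is a general convention rather than something stated for this model, and the estimate ignores activations, the KV cache, and framework overhead.

```python
# Rough weight-memory estimate; ignores activations, KV cache, and runtime overhead.
# Bytes per parameter for common precisions (a general convention, assumed here).
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 7e9  # "7 billion parameters" from the model details

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gb(NUM_PARAMS, dtype):.1f} GB")
```

Under these assumptions, half-precision weights need roughly 14 GB, which is why quantized variants are often considered when hardware is limited.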
Intended Use
Use Cases:
- Text-to-SQL: Translating natural language questions into executable SQL queries for database retrieval tasks.
- Database Management: Assisting with creating, modifying, and querying databases using natural language.
- General NLP: Handling basic language tasks such as question answering, summarization, and text classification, but only when they are combined with SQL-related tasks.
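For the text-to-SQL use case, a common pattern is to give the model both the table schema and the user's question in a single prompt. The sketch below shows one way to assemble such a prompt. The section markers (`### Schema:`, `### Question:`, `### SQL:`) and the `build_prompt` helper are hypothetical; this card does not specify the prompt format used during fine-tuning.

```python
def build_prompt(schema: str, question: str) -> str:
    """Combine a table schema and a question into one text-to-SQL prompt.

    The layout is a hypothetical template, not the format used in fine-tuning.
    """
    return (
        "### Schema:\n"
        f"{schema.strip()}\n\n"
        "### Question:\n"
        f"{question.strip()}\n\n"
        "### SQL:\n"
    )


# Example schema and question (illustrative only).
schema = """
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER, salary REAL);
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
"""
prompt = build_prompt(schema, "What is the average salary in each department?")
print(prompt)
```

The resulting string would be passed to whatever inference stack hosts the model. Ending the prompt at the `### SQL:` marker invites the model to continue with the query itself.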
Out-of-scope Uses:
- Harmful Content Generation: Generating biased or harmful content, or any use that violates applicable laws.
- Languages Other Than English: The model is focused on English-language input for database queries and SQL generation.
- Non-SQL Tasks: Not intended for tasks outside text-to-SQL and closely related language generation.
Model Performance
TinyLlama-7B was fine-tuned to convert natural-language questions into SQL queries. The fine-tuning data covered a diverse set of SQL generation tasks, so the model can handle complex queries involving joins, aggregations, and filtering operations.
Evaluation on Text-to-SQL Tasks:
- Accuracy: TinyLlama-7B generates accurate SQL queries from user inputs, including complex requests that involve multiple tables and conditions.
- Efficiency: At 7 billion parameters, the model has lower inference latency than larger models, making it better suited to environments with limited resources.
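One way to check claims like these on your own data is execution matching: run the generated query and a reference query against the same database and compare the results. The sketch below uses Python's built-in `sqlite3` module. The tables, the queries, and the `execution_match` helper are illustrative assumptions, not the evaluation procedure used for this model.

```python
import sqlite3
from collections import Counter


def run_query(conn, sql):
    """Execute sql and return its rows as a multiset, or None if it fails."""
    try:
        return Counter(conn.execute(sql).fetchall())
    except sqlite3.Error:
        return None


def execution_match(conn, predicted_sql, gold_sql):
    """True if both queries run and return the same rows, ignoring row order."""
    gold = run_query(conn, gold_sql)
    pred = run_query(conn, predicted_sql)
    return gold is not None and pred == gold


# Tiny in-memory database (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE departments (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept_id INTEGER, salary REAL);
INSERT INTO departments VALUES (1, 'Eng'), (2, 'Ops');
INSERT INTO employees VALUES (1, 'Ana', 1, 100.0), (2, 'Bo', 1, 80.0), (3, 'Cy', 2, 60.0);
""")

gold = ("SELECT d.name, AVG(e.salary) FROM employees e "
        "JOIN departments d ON e.dept_id = d.id GROUP BY d.name")
# Stand-ins for model outputs: one equivalent, one wrong, one syntactically invalid.
pred_ok = ("SELECT departments.name, AVG(salary) FROM departments "
           "JOIN employees ON departments.id = employees.dept_id "
           "GROUP BY departments.name")
pred_bad = "SELECT name, salary FROM employees"
pred_invalid = "SELECT FROM employees WHERE"

results = [execution_match(conn, p, gold) for p in (pred_ok, pred_bad, pred_invalid)]
accuracy = sum(results) / len(results)
print(f"execution accuracy: {accuracy:.2f}")
```

Comparing rows as a multiset ignores ordering, which is usually what you want unless the question implies an `ORDER BY`; in that case an ordered list comparison would be the stricter choice.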
Training Data
TinyLlama-7B was fine-tuned on a variety of publicly available datasets focused on SQL generation, including text-to-SQL datasets and query-generation tasks. The data covers diverse table schemas, SQL queries, and natural-language questions.
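A training example of the kind described above pairs a table schema and a natural-language question with the target SQL query. The sketch below serializes one such example as a JSON Lines record. The field names (`schema`, `question`, `sql`) and the `to_record` helper are hypothetical; the actual dataset formats are not specified in this card.

```python
import json


def to_record(schema: str, question: str, sql: str) -> str:
    """Serialize one example as a single JSON line (hypothetical field names)."""
    return json.dumps({"schema": schema, "question": question, "sql": sql})


# Illustrative example, not taken from the actual training data.
line = to_record(
    "CREATE TABLE orders (id INTEGER, customer TEXT, total REAL);",
    "How many orders has each customer placed?",
    "SELECT customer, COUNT(*) FROM orders GROUP BY customer;",
)
record = json.loads(line)
print(record["question"], "->", record["sql"])
```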