# Fine-tuning LLaMA 3.2 1B for SQL Generation

This project fine-tunes a small LLaMA model (1B parameters) to generate SQL queries from natural language, using a dataset of examples that pair people's questions with their SQL translations.

## What I'm Doing

* I'm starting with a pre-trained LLaMA 3.2 1B model.
* I use a dataset called `synthetic_text_to_sql-ShareGPT`, which contains example prompts and the corresponding SQL queries.
  Dataset URL: [https://huggingface.co/datasets/mlabonne/synthetic_text_to_sql-ShareGPT](https://huggingface.co/datasets/mlabonne/synthetic_text_to_sql-ShareGPT)
* I fine-tune the model using the Unsloth library with LoRA adapters. This trains only small parts of the model, which makes fine-tuning much faster and more memory-efficient.

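To get a feel for what the training data looks like, a ShareGPT-style record can be flattened into a single training string with a chat template. This is a minimal sketch: the `conversations` field with `from`/`value` keys follows the usual ShareGPT convention, and the template shown is a Llama-3-style approximation, not necessarily the exact one Unsloth applies.

```python
def sharegpt_to_text(record):
    """Flatten a ShareGPT-style record into one Llama-3-style training string.

    The record layout ({"conversations": [{"from": ..., "value": ...}]}) is
    the common ShareGPT convention; field names are an assumption here.
    """
    role_map = {"human": "user", "gpt": "assistant", "system": "system"}
    parts = ["<|begin_of_text|>"]
    for turn in record["conversations"]:
        role = role_map.get(turn["from"], turn["from"])
        parts.append(
            f"<|start_header_id|>{role}<|end_header_id|>\n\n{turn['value']}<|eot_id|>"
        )
    return "".join(parts)

# Hypothetical record in the dataset's ShareGPT shape:
example = {
    "conversations": [
        {"from": "human", "value": "List all customers from Paris."},
        {"from": "gpt", "value": "SELECT * FROM customers WHERE city = 'Paris';"},
    ]
}
print(sharegpt_to_text(example))
```

In practice Unsloth/TRL can apply the model's own chat template for you; this sketch only illustrates the shape of the flattened text.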
## Evaluation Process

The evaluation pipeline is implemented in `Evaluate_LLM.ipynb`:

1. **SQL Question Generation**: Groq's `llama3-8b-8192` model generates 10 SQL question blocks, each with table creation, inserts, and a natural-language SQL question.

2. **Model Answering**: Each question is passed to the local fine-tuned LLaMA model (via `llama-cpp-python`) to generate SQL queries and explanations.

3. **Automated Evaluation**: Groq's `gemma2-9b-it` model acts as an expert tutor, scoring each (question, answer) pair on correctness and completeness (1–10 scale) and providing feedback.

4. **Summary**: Average scores and detailed feedback for all questions are output.

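The summary step boils down to pulling the 1–10 marks out of the evaluator's replies and averaging them. Here is a minimal sketch; the `Correctness: N/10` reply layout is an assumption for illustration, and `Evaluate_LLM.ipynb` may parse a different format.

```python
import re

def parse_scores(reply):
    """Extract 'Correctness: N/10' style marks from an evaluator reply.

    The reply format matched here is hypothetical, chosen only to
    illustrate the parsing-and-averaging idea.
    """
    return {
        key.lower(): int(val)
        for key, val in re.findall(r"(Correctness|Completeness):\s*(\d+)\s*/\s*10", reply)
    }

def summarize(replies):
    """Average each metric across all evaluated (question, answer) pairs."""
    scores = [parse_scores(r) for r in replies]
    return {
        metric: sum(s[metric] for s in scores) / len(scores)
        for metric in ("correctness", "completeness")
    }

# Two hypothetical evaluator replies:
replies = [
    "Correctness: 8/10\nCompleteness: 9/10\nFeedback: solid JOIN usage.",
    "Correctness: 6/10\nCompleteness: 7/10\nFeedback: missing GROUP BY.",
]
print(summarize(replies))
```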
*Note:*
- Question generation and evaluation both use Groq's hosted models (Llama 3 8B for question generation, Gemma 2 9B for evaluation).
- The local fine-tuned LLaMA 3.2 1B model is only used for generating answers.
- I normally use Gemini for evaluation, but because Gemini was slow on the day of this run, I used Groq for both question generation and evaluation here.

## Why I'm Doing This

I want to build a model that can understand plain English and generate accurate SQL queries. This can be useful for tools that let people ask questions about their data without writing SQL themselves.

## Where to Find the Model & Notebooks

You can find the fine-tuned model, including a `.gguf` file for easy local use, in my Hugging Face repository:

👉 https://huggingface.co/Adhishtanaka/llama_3.2_1b_SQL/tree/main

The Jupyter notebooks used in this project are directly in this repository:

- `Evaluate_LLM.ipynb`: the evaluation pipeline for the fine-tuned model.
- `Llama3.2_1B-SQL.ipynb`: the main notebook for fine-tuning and experimentation.

👉 Browse these files in the [GitHub repository](https://github.com/Adhishtanaka/llama3.2_1.b-SQL) for full code and documentation.

---
license: mit
---