Update README.md
README.md
@@ -112,9 +112,13 @@ The model was fine-tuned on a concatenation of several publicly available text-t
 4. **[knowrohit07/know_sql](https://huggingface.co/datasets/knowrohit07/know_sql)**
 
 **Data Split:**
-
-
-
+
+| **Split**      | **Percentage** | **Number of Samples** |
+|----------------|----------------|-----------------------|
+| **Training**   | 85%            | **338,708**           |
+| **Validation** | 5%             | **19,925**            |
+| **Testing**    | 10%            | **39,848**            |
+
 
 ### Training Procedure
 
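The commit records the split sizes but not the code that produced them. Below is a minimal sketch of one way to reproduce an 85/5/10 split with the Hugging Face `datasets` library; the two-stage `train_test_split` calls, the seed, and loading a single source dataset (rather than the full concatenation) are assumptions:

```python
# Minimal sketch of an 85/5/10 split. The two-stage split and the seed are
# assumptions; the commit only records the resulting sizes
# (338,708 / 19,925 / 39,848).
from datasets import load_dataset

ds = load_dataset("knowrohit07/know_sql", split="train")  # one of the source datasets

# Carve off 15% for validation + test, then split that 15% into 5% / 10%.
stage1 = ds.train_test_split(test_size=0.15, seed=42)
stage2 = stage1["test"].train_test_split(test_size=2 / 3, seed=42)

train_ds = stage1["train"]  # ~85%
val_ds = stage2["train"]    # ~5%
test_ds = stage2["test"]    # ~10%
```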
@@ -164,6 +168,8 @@ and tokenized with a maximum length of 512 for inputs and 256 for responses usin
 
 <!-- This section describes the evaluation protocols and provides the results. -->
 
+The model was evaluated on 39,848 test samples, representing 10% of the dataset.
+
 #### Metrics
 
 <!-- These are the evaluation metrics being used, ideally with a description of why. -->
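A sketch of how the generation pass over the test split might look. The checkpoint path, the seq2seq model class, and the `question` column name are assumptions; only the 512-token input and 256-token response limits come from the training description above:

```python
# Hypothetical inference loop for scoring the test split. The model id and
# seq2seq architecture are assumptions; the 512/256 length limits come from
# the tokenization setup described in this README.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "path/to/fine-tuned-checkpoint"  # hypothetical identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

def generate_sql(question: str) -> str:
    inputs = tokenizer(question, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# `test_ds` from the split sketch above; "question" column is an assumption.
predictions = [generate_sql(ex["question"]) for ex in test_ds]
```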
@@ -178,14 +184,14 @@ Evaluation metrics used:
 
 The table below summarizes the evaluation metrics comparing the original base model with the fine-tuned model:
 
-| **Metric** | **Original Model** | **Fine-Tuned Model**
-
-| **ROUGE-1** | 0.03369 | **0.69143**
-| **ROUGE-2** | 0.00817 | **0.54533**
-| **ROUGE-L** | 0.03056 | **0.66429**
-| **BLEU Score** | 0.00367 | **0.31698**
-| **Fuzzy Match Score** | 11.31% | **81.98%**
-| **Exact Match Accuracy** | 0.00% | **16.39%**
+| **Metric** | **Original Model** | **Fine-Tuned Model** | **Comments** |
+|---------------------------|--------------------|----------------------|--------------|
+| **ROUGE-1** | 0.03369 | **0.69143** | Over 20× increase; indicates much better content capture. |
+| **ROUGE-2** | 0.00817 | **0.54533** | Nearly 67× improvement; higher n-gram quality. |
+| **ROUGE-L** | 0.03056 | **0.66429** | More than 21× increase; improved sequence similarity. |
+| **BLEU Score** | 0.00367 | **0.31698** | Approximately 86× increase; demonstrates significant fluency gains. |
+| **Fuzzy Match Score** | 11.31% | **81.98%** | Substantial improvement; generated SQL aligns much more closely with human responses. |
+| **Exact Match Accuracy** | 0.00% | **16.39%** | Non-zero accuracy achieved; critical for production readiness. |
 
 
 #### Summary
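The commit lists the final scores but not the tooling that computed them. A plausible sketch using the `evaluate` library for ROUGE and BLEU and `rapidfuzz` for the fuzzy score; both library choices and the exact metric definitions are assumptions:

```python
# Plausible reimplementation of the reported metrics. Library choices
# (`evaluate`, `rapidfuzz`) and the fuzzy/exact match definitions are
# assumptions; the commit lists only the final scores.
import evaluate
from rapidfuzz import fuzz

rouge = evaluate.load("rouge")  # reports rouge1, rouge2, rougeL
bleu = evaluate.load("bleu")

def score(predictions: list[str], references: list[str]) -> dict:
    rouge_scores = rouge.compute(predictions=predictions, references=references)
    bleu_scores = bleu.compute(predictions=predictions,
                               references=[[r] for r in references])
    # Fuzzy match: mean character-level similarity, reported as a percentage.
    fuzzy = sum(fuzz.ratio(p, r) for p, r in zip(predictions, references)) / len(predictions)
    # Exact match: percentage of predictions identical to the reference query.
    exact = 100 * sum(p.strip() == r.strip()
                      for p, r in zip(predictions, references)) / len(predictions)
    return {**rouge_scores, "bleu": bleu_scores["bleu"],
            "fuzzy_match": fuzzy, "exact_match": exact}
```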