Update README.md
README.md
@@ -112,9 +112,13 @@ The model was fine-tuned on a concatenation of several publicly available text-t
 4. **[knowrohit07/know_sql](https://huggingface.co/datasets/knowrohit07/know_sql)**
 
 **Data Split:**
-
-
-
+
+| **Split**      | **Percentage** | **Number of Samples** |
+|----------------|----------------|-----------------------|
+| **Training**   | 85%            | **338,708**           |
+| **Validation** | 5%             | **19,925**            |
+| **Testing**    | 10%            | **39,848**            |
+
 
 ### Training Procedure
 
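The commit records the split sizes but not the code that produced them. Below is a minimal sketch of one way to reproduce an 85/5/10 split with the Hugging Face `datasets` library; the two-stage `train_test_split` calls, the seed, and loading a single source dataset (rather than the full concatenation) are assumptions:

```python
# Minimal sketch of an 85/5/10 split. The two-stage split and the seed are
# assumptions; the commit only records the resulting sizes
# (338,708 / 19,925 / 39,848).
from datasets import load_dataset

ds = load_dataset("knowrohit07/know_sql", split="train")  # one of the source datasets

# Carve off 15% for validation + test, then split that 15% into 5% / 10%.
stage1 = ds.train_test_split(test_size=0.15, seed=42)
stage2 = stage1["test"].train_test_split(test_size=2 / 3, seed=42)

train_ds = stage1["train"]  # ~85%
val_ds = stage2["train"]    # ~5%
test_ds = stage2["test"]    # ~10%
```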
@@ -164,6 +168,8 @@ and tokenized with a maximum length of 512 for inputs and 256 for responses usin
 
 <!-- This section describes the evaluation protocols and provides the results. -->
 
+The model was evaluated on 39,848 test samples, representing 10% of the dataset.
+
 #### Metrics
 
 <!-- These are the evaluation metrics being used, ideally with a description of why. -->
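A sketch of how the generation pass over the test split might look. The checkpoint path, the seq2seq model class, and the `question` column name are assumptions; only the 512-token input and 256-token response limits come from the training description above:

```python
# Hypothetical inference loop for scoring the test split. The model id and
# seq2seq architecture are assumptions; the 512/256 length limits come from
# the tokenization setup described in this README.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "path/to/fine-tuned-checkpoint"  # hypothetical identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

def generate_sql(question: str) -> str:
    inputs = tokenizer(question, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# `test_ds` from the split sketch above; "question" column is an assumption.
predictions = [generate_sql(ex["question"]) for ex in test_ds]
```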
@@ -178,14 +184,14 @@ Evaluation metrics used:
 
 The table below summarizes the evaluation metrics comparing the original base model with the fine-tuned model:
 
-| **Metric** | **Original Model** | **Fine-Tuned Model**
-
-| **ROUGE-1** | 0.03369 | **0.69143**
-| **ROUGE-2** | 0.00817 | **0.54533**
-| **ROUGE-L** | 0.03056 | **0.66429**
-| **BLEU Score** | 0.00367 | **0.31698**
-| **Fuzzy Match Score** | 11.31% | **81.98%**
-| **Exact Match Accuracy** | 0.00% | **16.39%**
+| **Metric** | **Original Model** | **Fine-Tuned Model** | **Comments** |
+|---------------------------|--------------------|----------------------|--------------|
+| **ROUGE-1** | 0.03369 | **0.69143** | Over 20× increase; indicates much better content capture. |
+| **ROUGE-2** | 0.00817 | **0.54533** | Nearly 67× improvement; higher n-gram quality. |
+| **ROUGE-L** | 0.03056 | **0.66429** | More than 21× increase; improved sequence similarity. |
+| **BLEU Score** | 0.00367 | **0.31698** | Approximately 86× increase; demonstrates significant fluency gains. |
+| **Fuzzy Match Score** | 11.31% | **81.98%** | Substantial improvement; generated SQL aligns much more closely with human responses. |
+| **Exact Match Accuracy** | 0.00% | **16.39%** | Non-zero accuracy achieved; critical for production readiness. |
 
 
 #### Summary
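The commit lists the final scores but not the tooling that computed them. A plausible sketch using the `evaluate` library for ROUGE and BLEU and `rapidfuzz` for the fuzzy score; both library choices and the exact metric definitions are assumptions:

```python
# Plausible reimplementation of the reported metrics. Library choices
# (`evaluate`, `rapidfuzz`) and the fuzzy/exact match definitions are
# assumptions; the commit lists only the final scores.
import evaluate
from rapidfuzz import fuzz

rouge = evaluate.load("rouge")  # reports rouge1, rouge2, rougeL
bleu = evaluate.load("bleu")

def score(predictions: list[str], references: list[str]) -> dict:
    rouge_scores = rouge.compute(predictions=predictions, references=references)
    bleu_scores = bleu.compute(predictions=predictions,
                               references=[[r] for r in references])
    # Fuzzy match: mean character-level similarity, reported as a percentage.
    fuzzy = sum(fuzz.ratio(p, r) for p, r in zip(predictions, references)) / len(predictions)
    # Exact match: percentage of predictions identical to the reference query.
    exact = 100 * sum(p.strip() == r.strip()
                      for p, r in zip(predictions, references)) / len(predictions)
    return {**rouge_scores, "bleu": bleu_scores["bleu"],
            "fuzzy_match": fuzzy, "exact_match": exact}
```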