aarohanverma committed on
Commit 0002b39 · verified · 1 Parent(s): 5f4b5a6

Update README.md

Files changed (1)
  1. README.md +17 -11
README.md CHANGED
@@ -112,9 +112,13 @@ The model was fine-tuned on a concatenation of several publicly available text-t
  4. **[knowrohit07/know_sql](https://huggingface.co/datasets/knowrohit07/know_sql)**

  **Data Split:**
- - **Training:** 85%
- - **Validation:** 5%
- - **Testing:** 10%

  ### Training Procedure

@@ -164,6 +168,8 @@ and tokenized with a maximum length of 512 for inputs and 256 for responses usin

  <!-- This section describes the evaluation protocols and provides the results. -->

  #### Metrics

  <!-- These are the evaluation metrics being used, ideally with a description of why. -->
@@ -178,14 +184,14 @@ Evaluation metrics used:

  The table below summarizes the evaluation metrics comparing the original base model with the fine-tuned model:

- | **Metric** | **Original Model** | **Fine-Tuned Model** | **Improvement Commentary** |
- |---------------------------|-------------------------------|-------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------|
- | **ROUGE-1** | 0.03369 | **0.69143** | Over 20× increase; indicates much better content capture. |
- | **ROUGE-2** | 0.00817 | **0.54533** | Nearly 67× improvement; higher n-gram quality. |
- | **ROUGE-L** | 0.03056 | **0.66429** | More than 21× increase; improved sequence similarity. |
- | **BLEU Score** | 0.00367 | **0.31698** | Approximately 86× increase; demonstrates significant fluency gains. |
- | **Fuzzy Match Score** | 11.31% | **81.98%** | Substantial improvement; generated SQL aligns much closer with human responses. |
- | **Exact Match Accuracy** | 0.00% | **16.39%** | Non-zero accuracy achieved; critical for production-readiness. |


  #### Summary
 
  4. **[knowrohit07/know_sql](https://huggingface.co/datasets/knowrohit07/know_sql)**

  **Data Split:**
+
+ | **Split** | **Percentage** | **Number of Samples** |
+ |----------------------|--------------------|--------------------------|
+ | **Training** | 85% | **338,708** |
+ | **Validation** | 5% | **19,925** |
+ | **Testing** | 10% | **39,848** |
+
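As a quick consistency check on the split table above, the following sketch recomputes each split's share from the listed sample counts. The counts are copied from the table; the variable names are illustrative, and nothing here reflects how the author actually produced the split, which the README does not describe.

```python
# Sample counts copied from the Data Split table.
split_counts = {
    "Training": 338_708,
    "Validation": 19_925,
    "Testing": 39_848,
}

# Total number of samples across all three splits.
total = sum(split_counts.values())

# Share of each split as a percentage of the total.
split_shares = {name: 100 * count / total for name, count in split_counts.items()}

for name, share in split_shares.items():
    print(f"{name}: {split_counts[name]:,} samples ({share:.2f}%)")
```

Rounded to whole percentages, the computed shares should agree with the 85% / 5% / 10% figures in the table.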
 
  ### Training Procedure

 

  <!-- This section describes the evaluation protocols and provides the results. -->

+ The model was evaluated on 39,848 test samples, representing 10% of the dataset.
+
  #### Metrics

  <!-- These are the evaluation metrics being used, ideally with a description of why. -->
 

  The table below summarizes the evaluation metrics comparing the original base model with the fine-tuned model:

+ | **Metric** | **Original Model** | **Fine-Tuned Model** | **Comments** |
+ |---------------------------|-------------------------------|-------------------------|--------------------------------------------------------------------------------------|
+ | **ROUGE-1** | 0.03369 | **0.69143** | Over 20× increase; indicates much better content capture. |
+ | **ROUGE-2** | 0.00817 | **0.54533** | Nearly 67× improvement; higher n-gram quality. |
+ | **ROUGE-L** | 0.03056 | **0.66429** | More than 21× increase; improved sequence similarity. |
+ | **BLEU Score** | 0.00367 | **0.31698** | Approximately 86× increase; demonstrates significant fluency gains. |
+ | **Fuzzy Match Score** | 11.31% | **81.98%** | Substantial improvement; generated SQL aligns much closer with human responses. |
+ | **Exact Match Accuracy** | 0.00% | **16.39%** | Non-zero accuracy achieved; critical for production-readiness. |

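The improvement factors quoted in the comments column of the table above can be reproduced with a small sketch like the one below. The scores are copied from the table; the dictionary layout and names are illustrative assumptions, not part of the author's evaluation code.

```python
# (original, fine-tuned) scores copied from the comparison table.
scores = {
    "ROUGE-1": (0.03369, 0.69143),
    "ROUGE-2": (0.00817, 0.54533),
    "ROUGE-L": (0.03056, 0.66429),
    "BLEU": (0.00367, 0.31698),
}

# Improvement factor = fine-tuned score divided by original score.
factors = {metric: tuned / base for metric, (base, tuned) in scores.items()}

for metric, factor in factors.items():
    print(f"{metric}: {factor:.1f}x")
```

Exact Match Accuracy is left out of the sketch because the original model's score is 0.00%, so a ratio is undefined for that row.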
 
  #### Summary