---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- lc_quad
model-index:
- name: flan-t5-text2sparql-naive
results: []
---
# flan-t5-text2sparql-naive
This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the lc_quad dataset.
It achieves the following results on the evaluation set:
- Loss: 0.4105
## Model description
T5 has performed well at generating SPARQL queries from natural-language questions, but semi-automated preprocessing was necessary ([Banerjee et al.](https://dl.acm.org/doi/abs/10.1145/3477495.3531841)).
FLAN-T5 comes with the promise of outperforming T5 across all categories, so a re-evaluation is needed. Our goal is to find
out what kind of preprocessing is still necessary to retain good performance, and how to automate it fully.
This is the naive version of the fine-tuned LLM: the same tokenizer is applied blindly to both the natural-language question and the target SPARQL query.
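As an illustration of what "naive" means here, a minimal sketch of this preprocessing (the example question and target query are illustrative and not taken from the actual training script):
```python
from transformers import AutoTokenizer

# Naive preprocessing: the stock FLAN-T5 tokenizer is applied unchanged to both
# the natural-language question and the target SPARQL query.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

question = "Create SPARQL Query: What was the population of Clermont-Ferrand on 1-1-2013?"
target_sparql = "SELECT ?obj WHERE { ... }"  # illustrative placeholder target

model_inputs = tokenizer(question, max_length=512, truncation=True)
with tokenizer.as_target_tokenizer():  # labels are tokenized exactly like the input text
    labels = tokenizer(target_sparql, max_length=512, truncation=True)
model_inputs["labels"] = labels["input_ids"]
```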
## Intended uses & limitations
This model performs very badly; do not use it! We wanted to find out whether preprocessing is still necessary or whether T5 can figure things out on its own. As it turns out, preprocessing
is still needed, so this model merely serves as a baseline.
An example prompt:
```
Create SPARQL Query: What was the population of Clermont-Ferrand on 1-1-2013?
```
Model output:
```
'SELECT ?obj WHERE wd:Q2'}
```
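For reference, a sketch of how the example above can be reproduced with `transformers`; the repository id below is a placeholder for the full Hub id of this model:
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_ID = "<namespace>/flan-t5-text2sparql-naive"  # placeholder: replace with the full Hub id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

prompt = "Create SPARQL Query: What was the population of Clermont-Ferrand on 1-1-2013?"
inputs = tokenizer(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```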
## Training and evaluation data
LC-QuAD 2.0; see the dataset link in the sidebar.
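A quick way to load and inspect the data with the `datasets` library (a sketch; the Hub id matches the `lc_quad` tag above):
```python
from datasets import load_dataset

# Load the dataset linked in the sidebar (Hub id "lc_quad", i.e. LC-QuAD 2.0)
# and inspect its splits and fields.
dataset = load_dataset("lc_quad")
print(dataset)
print(dataset["train"][0])
```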
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- gradient_accumulation_steps: 4
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
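A sketch of how these settings map onto `Seq2SeqTrainingArguments`; the output directory is a placeholder, all unlisted arguments keep their defaults, and the Adam betas and epsilon above are the optimizer defaults:
```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-text2sparql-naive",  # placeholder output directory
    learning_rate=1e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=4,   # effective train batch size: 16 * 4 = 64
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
    evaluation_strategy="epoch",     # assumption: validation loss reported once per epoch
)
```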
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 301 | 0.5173 |
| 0.6515 | 2.0 | 602 | 0.4861 |
| 0.6515 | 3.0 | 903 | 0.4639 |
| 0.4954 | 4.0 | 1204 | 0.4478 |
| 0.4627 | 5.0 | 1505 | 0.4340 |
| 0.4627 | 6.0 | 1806 | 0.4247 |
| 0.4404 | 7.0 | 2107 | 0.4177 |
| 0.4404 | 8.0 | 2408 | 0.4139 |
| 0.429 | 9.0 | 2709 | 0.4115 |
| 0.4201 | 10.0 | 3010 | 0.4105 |
### Framework versions
- Transformers 4.18.0
- Pytorch 1.10.2+cu102
- Datasets 2.4.0
- Tokenizers 0.12.1