Update README.md
README.md CHANGED
@@ -9,8 +9,6 @@ model-index:
   results: []
 ---
 
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 
 # flan-t5-text2sparql-naive
 
@@ -20,15 +18,20 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
-More information needed
+T5 has performed well at generating SPARQL queries from natural language, but semi-automated preprocessing was necessary [Banerjee et al.](https://dl.acm.org/doi/abs/10.1145/3477495.3531841).
+FLAN-T5 comes with the promise of being better than T5 across all categories, so a re-evaluation is needed. Our goal is to find
+out what kind of preprocessing is still necessary to retain good performance, as well as how to automate it fully.
+
+This is the naive version of the fine-tuned LLM: the same tokenizer is blindly applied to both the natural-language question and the target SPARQL query.
 
 ## Intended uses & limitations
 
-More information needed
+This model performs very badly; do not use it! We wanted to find out whether preprocessing is still necessary or whether T5 can figure things out on its own. As it turns out, preprocessing
+is still needed, so this model will just serve as a baseline.
 
 ## Training and evaluation data
 
-More information needed
+LC-QuAD 2.0, see sidebar.
 
 ## Training procedure
 
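Below is a minimal sketch of the "naive" preprocessing described under *Model description* above: one FLAN-T5 tokenizer applied unchanged to both the question and the target SPARQL query. The checkpoint name and the (question, query) pair are illustrative assumptions, not taken from this repository's training code.

```python
# Sketch of the naive preprocessing: the same tokenizer handles both the
# natural-language question and the target SPARQL query. The checkpoint and
# the example pair below are illustrative, not from this repository.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")

question = "What is the capital of Germany?"
sparql = "SELECT ?capital WHERE { wd:Q183 wdt:P36 ?capital . }"

# Encode the input question and the target query with the same tokenizer.
model_inputs = tokenizer(question, max_length=128, truncation=True)
labels = tokenizer(text_target=sparql, max_length=128, truncation=True)
model_inputs["labels"] = labels["input_ids"]

# Inspecting the label tokens hints at why this baseline is weak: characters
# missing from T5's SentencePiece vocabulary (curly braces are a known
# example) degrade to <unk>, so the model never sees them verbatim.
print(tokenizer.convert_ids_to_tokens(labels["input_ids"]))
```

This illustrates the kind of information loss that vocabulary-aware preprocessing, as discussed in Banerjee et al., is meant to avoid, and why the card above treats this naive model purely as a baseline.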