felixb85 committed
Commit 3c11e49
1 Parent(s): 0bbe640

Update README.md

Files changed (1)
  1. README.md +47 -2
README.md CHANGED
@@ -20,11 +20,44 @@ It achieves the following results on the evaluation set:

 ## Model description

- More information needed
+ This model uses the T5 tokenizer only for the input and a [custom one](https://huggingface.co/InfAI/sparql-tokenizer) for the SPARQL queries. This
+ has led to a dramatic improvement in performance, although the model is not quite usable yet.
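+
+ One way to see the effect of the custom tokenizer is to compare how each tokenizer splits the same query. The following is only a rough sketch (the query string is taken from the example further down, and the exact tokens depend on each tokenizer's vocabulary):
+
+ ```python
+ from transformers import AutoTokenizer
+
+ # Gold query from the example below
+ query = "SELECT ?obj WHERE { wd:Q42168 p:P1082 ?s . ?s ps:P1082 ?obj . ?s pq:P585 ?x filter(contains(YEAR(?x),'2013')) }"
+
+ # Standard FLAN-T5 subword tokenizer vs. the custom SPARQL tokenizer
+ t5_tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-base")
+ sparql_tokenizer = AutoTokenizer.from_pretrained("InfAI/sparql-tokenizer")
+
+ print(len(t5_tokenizer.tokenize(query)), t5_tokenizer.tokenize(query)[:10])
+ print(len(sparql_tokenizer.tokenize(query)), sparql_tokenizer.tokenize(query)[:10])
+ ```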

 ## Intended uses & limitations

+ Because we used two different tokenizers, you cannot simply use this model in a pipeline. Use the following Python code as a starting point:
+
+ ```python
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
+
+ model_checkpoint = "InfAI/flan-t5-text2sparql-custom-tokenizer"
+ question = "What was the population of Clermont-Ferrand on 1-1-2013?"
+ gold_answer = "SELECT ?obj WHERE { wd:Q42168 p:P1082 ?s . ?s ps:P1082 ?obj . ?s pq:P585 ?x filter(contains(YEAR(?x),'2013')) }"
+
+ model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)
+
+ # The input is tokenized with the original FLAN-T5 tokenizer,
+ # the generated query is decoded with the custom SPARQL tokenizer.
+ tokenizer_in = AutoTokenizer.from_pretrained("google/flan-t5-base")
+ tokenizer_out = AutoTokenizer.from_pretrained("InfAI/sparql-tokenizer")
+
+ sample = f"Create SPARQL Query: {question}"
+
+ inputs = tokenizer_in([sample], return_tensors="pt")
+ outputs = model.generate(**inputs)
+
+ print(f"Gold answer: {gold_answer}")
+ print(" " + tokenizer_out.decode(outputs[0]))
+ ```
+
+ ```
+ Gold answer: SELECT ?obj WHERE { wd:Q42168 p:P1082 ?s . ?s ps:P1082 ?obj . ?s pq:P585 ?x filter(contains(YEAR(?x),'2013')) }
+ <pad> SELECT?obj WHERE { wd:Q4754 p:P1082?s.?s ps:P1082?obj.?s pq:P585?x filter(contains(YEAR(?x),'2013')) }
+ ```
+
+ Common errors include (the first two can be cleaned up in post-processing, as sketched below):
+
+ - Adding a `<pad>` token at the beginning
+ - A stray closing curly brace at the end
+ - One of subject / predicate / object is wrong, while the other two are correct
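+
+ A minimal sketch of that cleanup, continuing from the snippet above (`clean_query` is just an illustrative helper and only addresses the first two errors):
+
+ ```python
+ def clean_query(decoded: str) -> str:
+     # Strip leftover special tokens such as the leading <pad>.
+     query = decoded.replace("<pad>", "").strip()
+     # Drop a stray closing brace if the braces are unbalanced.
+     if query.endswith("}") and query.count("}") > query.count("{"):
+         query = query[:-1].rstrip()
+     return query
+
+ print(clean_query(tokenizer_out.decode(outputs[0])))
+ ```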

 ## Training and evaluation data

@@ -32,6 +65,18 @@ More information needed

 ## Training procedure

+ We trained the model for 50 epochs, which turned out to be far more than necessary. The loss stagnates after about 25 epochs, and manually inspecting
+ some examples from the validation set showed us that the queries do not improve beyond this point with these hyperparameters.
+ We were aware that the number of epochs was probably too high, but our goal was to find out how many epochs were actually beneficial
+ to the performance.
+
+ There are two avenues we will explore to get rid of these errors:
+
+ - Continue training with different hyperparameters
+ - Apply more preprocessing to the dataset
+
+ The results will be uploaded to this repo.
+
 ### Training hyperparameters

 The following hyperparameters were used during training: