Output:
```
{'cat': ['mammal', 'animal'],
 'dog': ['hound', 'animal'],
 'economics and sociology': ['both fields of study'],
 'public company': ['company']}
```
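For context, the snippet below is a minimal sketch of how such output might be produced with `transformers`. The hub id `flexudy/flexudy-conceptor-t5-base` and the plain-term input format are assumptions for illustration, not confirmed by this README:

```python
# Minimal usage sketch. The model id and the input format are assumptions;
# check the model card for the actual ones.
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "flexudy/flexudy-conceptor-t5-base"  # hypothetical hub id
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

term = "cat"  # the term whose parent concepts we want
inputs = tokenizer(term, return_tensors="pt", truncation=True, max_length=64)
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
# the decoded string could then be parsed into {'cat': ['mammal', 'animal']}
```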
### How was it trained?

1. Using Google's T5-base and T5-small. Both models are released on the Hugging Face Hub.
2. T5-base was trained for only two epochs, while T5-small was trained for five.

## Where did you get the data?
1. I extracted and curated a fragment of [ConceptNet](https://conceptnet.io/).
2. In particular, only the IsA relation was used.
3. Note that one thing can belong to multiple concepts (which is pretty cool if you think about [Fuzzy Description Logics](https://lat.inf.tu-dresden.de/~stefborg/Talks/QuantLAWorkshop2013.pdf)).
Multiple inheritance, however, means that some terms belong to very many concepts. Hence, I decided to randomly drop some of them because of the **maximum length limitation**.

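As a rough illustration of that extraction step, here is a minimal sketch that keeps only IsA edges between nodes in the chosen languages. It assumes the standard ConceptNet assertions TSV dump layout (`uri, relation, start, end, json_info`); the actual curation script is not published here:

```python
# Sketch: keep only IsA edges whose endpoints are in en/de/fr.
# Follows the standard ConceptNet assertions TSV layout; the
# file name and curation details are assumptions.
import csv
import gzip

LANGS = {"en", "de", "fr"}

def node_lang(uri: str) -> str:
    # e.g. "/c/en/cat" -> "en"
    parts = uri.split("/")
    return parts[2] if len(parts) > 2 else ""

edges = []
with gzip.open("conceptnet-assertions-5.7.0.csv.gz", "rt", encoding="utf-8") as f:
    for row in csv.reader(f, delimiter="\t"):
        uri, rel, start, end = row[0], row[1], row[2], row[3]
        if rel == "/r/IsA" and node_lang(start) in LANGS and node_lang(end) in LANGS:
            edges.append((start, end))  # IsA(start, end)
```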
### Setup
1. In the end, I allowed only `2` to `4` concepts, chosen at random, for each term. This means there is still great potential to make the models generalise better 🚀.
2. I used a total of `279884` training examples and `1260` for testing. Edges -- i.e. `IsA(concept u, concept v)` -- in both sets are disjoint.
3. Trained for `15K` steps, with the learning rate decaying linearly at each step, starting at `0.001`.
4. Used the `RAdam` optimiser with `weight_decay = 0.01` and `batch_size = 36` (see the sketch after this list).
5. Source and target max lengths were both `64`.

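Putting those settings together, here is a minimal sketch of the optimiser and schedule setup in PyTorch; it illustrates the stated hyperparameters and is not the author's actual training script:

```python
# Sketch of the stated settings: RAdam, weight decay 0.01, batch size 36,
# 15K steps with linear learning-rate decay starting at 0.001.
import torch
from transformers import T5ForConditionalGeneration, get_linear_schedule_with_warmup

model = T5ForConditionalGeneration.from_pretrained("t5-base")

optimizer = torch.optim.RAdam(model.parameters(), lr=0.001, weight_decay=0.01)
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=0, num_training_steps=15_000
)

BATCH_SIZE = 36
MAX_LENGTH = 64  # both source and target

# inside the training loop, after each forward/backward pass:
#   optimizer.step(); scheduler.step(); optimizer.zero_grad()
```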
### Multilingual Models
1. The "conceptor" model is multilingual: English, German and French are supported.
2. [ConceptNet](https://conceptnet.io/) supports many languages, but I chose those three because they are the ones I speak.

### Metrics for flexudy-conceptor-t5-base
| Metric | Score |
| ------------- |:-------------:|
| Exact Match | 36.67 |
| F1 | 43.08 |
| Smoothed loss | 1.214 |

Unfortunately, we no longer have the metrics for flexudy-conceptor-t5-small. If I recall correctly, the base model was just slightly better on the test set (by ca. `2%` F1).

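For reference, the sketch below shows one common way such scores are computed for generated text; whether this evaluation used exactly this definition (SQuAD-style token-level F1) is an assumption:

```python
# Sketch: exact match and token-level F1 between a prediction and a
# reference string; SQuAD-style token F1 is assumed, not confirmed.
from collections import Counter

def exact_match(pred: str, gold: str) -> float:
    return float(pred.strip() == gold.strip())

def token_f1(pred: str, gold: str) -> float:
    pred_tokens, gold_tokens = pred.split(), gold.split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```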
## Why not just use the data if you have it structured already?
ConceptNet is very large. Even if you only load a fragment into RAM, say `100K` edges, that is still a large graph.
Especially if you think about how to store the node embeddings efficiently for querying.
If you prefer this approach, [Milvus](https://github.com/milvus-io/pymilvus) can be of great help.
You can compute query embeddings and try to find the best match. From there (after matching), you can navigate through the graph at `100%` precision.
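To make that concrete, here is a small sketch of the match-then-traverse idea: embed the query, find the nearest node (the step a vector store like Milvus would handle at scale), then follow IsA edges exactly. The `embed` function and the toy graph are placeholders for illustration:

```python
# Sketch of "match once, then traverse": nearest-neighbour lookup over
# node embeddings, followed by exact IsA edge traversal.
# embed() and the toy graph below are placeholders.
import numpy as np

def embed(text: str) -> np.ndarray:
    # placeholder -- in practice, a sentence-embedding model
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    return rng.standard_normal(8)

is_a = {"cat": ["mammal"], "mammal": ["animal"], "animal": []}
nodes = list(is_a)
matrix = np.stack([embed(n) for n in nodes])

def nearest_node(query: str) -> str:
    q = embed(query)
    sims = matrix @ q / (np.linalg.norm(matrix, axis=1) * np.linalg.norm(q))
    return nodes[int(np.argmax(sims))]

def ancestors(node: str) -> list[str]:
    out, stack = [], list(is_a.get(node, []))
    while stack:
        cur = stack.pop()
        out.append(cur)
        stack.extend(is_a.get(cur, []))
    return out

start = nearest_node("kitten")        # fuzzy step (vector search)
print(start, "->", ancestors(start))  # exact step (graph traversal)
```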