File size: 6,049 Bytes
8128172
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
a17cb10
8128172
 
 
a17cb10
8128172
 
 
 
99c6104
8128172
 
99c6104
8128172
 
 
 
 
 
 
a17cb10
 
 
 
 
 
 
 
 
 
 
 
 
 
8128172
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
---
license: mit
pipeline_tag: graph-ml
tags:
- graphs
- ultra
- knowledge graph
---
 
## Description
ULTRA is a foundation model for knowledge graph (KG) reasoning. A single pre-trained ULTRA model performs link prediction tasks on **any** multi-relational graph with any entity / relation vocabulary. Performance-wise averaged on 50+ KGs, a single pre-trained ULTRA model is better in the **0-shot** inference mode than many SOTA models trained specifically on each graph. Following the pretrain-finetune paradigm of foundation models, you can run a pre-trained ULTRA checkpoint **immediately in the zero-shot manner** on any graph as well as **use more fine-tuning**.

ULTRA provides **unified, learnable, transferable** representations for any KG. Under the hood, ULTRA employs graph neural networks and modified versions of NBFNet. ULTRA does not learn any entity and relation embeddings specific to a downstream graph but instead obtains relative relation representations based on interactions between relations.

arxiv:  https://arxiv.org/abs/2310.04562  
GitHub: https://github.com/DeepGraphLearning/ULTRA  


## Checkpoints
Here on HuggingFace, we provide 3 pre-trained ULTRA checkpoints (all ~169k params) varying by the amount of pre-training data.

| Model | Training KGs |
| ------| --------------|
| [ultra_3g](https://huggingface.co/mgalkin/ultra_3g) | 3 graphs |
| [ultra_4g](https://huggingface.co/mgalkin/ultra_4g) | 4 graphs |
| [ultra_50g](https://huggingface.co/mgalkin/ultra_50g) | 50 graphs |

* [ultra_3g](https://huggingface.co/mgalkin/ultra_3g) and [ultra_4g](https://huggingface.co/mgalkin/ultra_4g) are the PyG models reported in the github repo;  
* [ultra_50g](https://huggingface.co/mgalkin/ultra_50g) is a new ULTRA checkpoint pre-trained on 50 different KGs (transductive and inductive) for 1M steps to maximize the performance on any unseen downstream KG.

## ⚡️ Your Superpowers 

ULTRA performs **link prediction** (KG completion aka reasoning): given a query `(head, relation, ?)`, it ranks all nodes in the graph as potential `tails`. 


1. Install the dependencies as listed in the Installation instructions on the [GitHub repo](https://github.com/DeepGraphLearning/ULTRA#installation).
2. Clone this model repo to find the `UltraForKnowledgeGraphReasoning` class in `modeling.py` and load the checkpoint (all the necessary model code is in this model repo as well).

* Run **zero-shot inference** on any graph:

```python
from modeling import UltraForKnowledgeGraphReasoning
from ultra.datasets import CoDExSmall
from ultra.eval import test
model = UltraForKnowledgeGraphReasoning.from_pretrained("mgalkin/ultra_4g")
dataset = CoDExSmall(root="./datasets/")
test(model, mode="test", dataset=dataset, gpus=None)
# Expected results for ULTRA 4g
# mrr:      0.464
# hits@10:  0.666
```

Or with `AutoModel`:

```python
from transformers import AutoModel
from ultra.datasets import CoDExSmall
from ultra.eval import test
model = AutoModel.from_pretrained("mgalkin/ultra_4g", trust_remote_code=True)
dataset = CoDExSmall(root="./datasets/")
test(model, mode="test", dataset=dataset, gpus=None)
# Expected results for ULTRA 4g
# mrr:      0.464
# hits@10:  0.666
```

* You can also **fine-tune** ULTRA on each graph, please refer to the [github repo](https://github.com/DeepGraphLearning/ULTRA#run-inference-and-fine-tuning) for more details on training / fine-tuning   
* The model code contains 57 different KGs, please refer to the [github repo](https://github.com/DeepGraphLearning/ULTRA#datasets) for more details on what's available.   

## Performance

**Averaged zero-shot performance of ultra-3g and ultra-4g**
<table>
    <tr>
        <th rowspan=2 align="center">Model </th>
        <th colspan=2 align="center">Inductive (e) (18 graphs)</th>
        <th colspan=2 align="center">Inductive (e,r) (23 graphs)</th>
        <th colspan=2 align="center">Transductive (16 graphs)</th>
    </tr>
    <tr>
        <th align="center"> Avg MRR</th>
        <th align="center"> Avg Hits@10</th>
        <th align="center"> Avg MRR</th>
        <th align="center"> Avg Hits@10</th>
        <th align="center"> Avg MRR</th>
        <th align="center"> Avg Hits@10</th>
    </tr>
    <tr>
        <th>ULTRA (3g) PyG</th>
        <td align="center">0.420</td>
        <td align="center">0.562</td>
        <td align="center">0.344</td>
        <td align="center">0.511</td>
        <td align="center">0.329</td>
        <td align="center">0.479</td>
    </tr>
    <tr>
        <th>ULTRA (4g) PyG</th>
        <td align="center">0.444</td>
        <td align="center">0.588</td>
        <td align="center">0.344</td>
        <td align="center">0.513</td>
        <td align="center">WIP</td>
        <td align="center">WIP</td>
    </tr>
   <tr>
        <th>ULTRA (50g) PyG (pre-trained on 50 KGs)</th>
        <td align="center">0.444</td>
        <td align="center">0.580</td>
        <td align="center">0.395</td>
        <td align="center">0.554</td>
        <td align="center">0.389</td>
        <td align="center">0.549</td>
    </tr>
</table>
Fine-tuning ULTRA on specific graphs brings, on average, further 10% relative performance boost both in MRR and Hits@10. See the paper for more comparisons.

**ULTRA 50g Performance**

ULTRA 50g was pre-trained on 50 graphs, so we can't really apply the zero-shot evaluation protocol to the graphs. 
However, we can compare with Supervised SOTA models trained from scratch on each dataset:

| Model | Avg MRR, Transductive graphs (16)| Avg Hits@10, Transductive graphs (16)|
| ----- | ---------------------------------| -------------------------------------|
| Supervised SOTA models   | 0.371 | 0.511 |
| ULTRA 50g (single model) | **0.389** | **0.549** |

That is, instead of training a big KG embedding model on your graph, you might want to consider running ULTRA (any of the checkpoints) as its performance might already be higher 🚀

## Useful links

Please report the issues in the [official GitHub repo of ULTRA](https://github.com/DeepGraphLearning/ULTRA)