kevinkrahn committed
Commit: 83c5c5a
Parent(s): c11b0cc
Update README.md

README.md CHANGED
````diff
@@ -7,7 +7,11 @@ tags:
 - sentence-similarity
 - transformers
 - semantic-search
-
+- character-transformer
+- hierarchical-transformer
+language:
+- en
+- grc
 ---
 
 # shlm-grc-en
@@ -16,9 +20,9 @@ tags:
 
 This model creates sentence embeddings in a shared vector space for Ancient Greek and English text.
 
-The base model uses a modified version of the HLM architecture described in [Heidelberg-Boston @ SIGTYP 2024 Shared Task: Enhancing Low-Resource Language Analysis With Character-Aware Hierarchical Transformers](https://aclanthology.org/2024.sigtyp-1.16/)
+The base model uses a modified version of the HLM architecture described in [Heidelberg-Boston @ SIGTYP 2024 Shared Task: Enhancing Low-Resource Language Analysis With Character-Aware Hierarchical Transformers](https://aclanthology.org/2024.sigtyp-1.16/) ([arXiv](https://arxiv.org/abs/2405.20145))
 
-This model is trained to produce sentence embeddings using the multilingual knowledge distillation method and datasets described in [Sentence Embedding Models for Ancient Greek Using Multilingual Knowledge Distillation](https://aclanthology.org/2023.alp-1.2/).
+This model is trained to produce sentence embeddings using the multilingual knowledge distillation method and datasets described in [Sentence Embedding Models for Ancient Greek Using Multilingual Knowledge Distillation](https://aclanthology.org/2023.alp-1.2/) ([arXiv](https://arxiv.org/abs/2308.13116)).
 
 This model was distilled from `BAAI/bge-base-en-v1.5` for embedding English and Ancient Greek text.
 
@@ -78,6 +82,8 @@ print(sentence_embeddings)
 
 ## Citing & Authors
 
+If you use this model please cite the following papers:
+
 ```
 @inproceedings{riemenschneider-krahn-2024-heidelberg,
     title = "Heidelberg-Boston @ {SIGTYP} 2024 Shared Task: Enhancing Low-Resource Language Analysis With Character-Aware Hierarchical Transformers",
````