jaspercatapang
committed on
Commit f177d8c
1 Parent(s): d58d533
Update README.md
README.md
CHANGED
@@ -12,7 +12,7 @@ inference: false
 <img src="logo.png" width=25%>
 
 # Model Description
-RoBERTA ReRanker for Retrieved Results or **R*** (pronounced R-star) is an advanced model designed to enhance search results' relevance and accuracy through reranking. By integrating the retrieval capabilities of **R*** with generative models, this hybrid approach significantly enhances the relevance and contextual depth of search results. Based on the [RoBERTa tiny](https://huggingface.co/haisongzhang/roberta-tiny-cased) architecture, **R*** is specialized in distinguishing relevant from irrelevant query-passage pairs, thereby refining the output of LLMs in retrieval and generative tasks.
+RoBERTA ReRanker for Retrieved Results or **R*** (pronounced R-star) is an advanced model designed to enhance search results' relevance and accuracy through reranking. By integrating the retrieval capabilities of **R*** with generative models, this hybrid approach significantly enhances the relevance and contextual depth of search results. Based on the [RoBERTa tiny](https://huggingface.co/haisongzhang/roberta-tiny-cased) architecture, **R*** is specialized in distinguishing relevant from irrelevant query-passage pairs, thereby refining the output of LLMs in retrieval and generative tasks. This model is an experiment featured and presented in [PACLIC 38 (2024)](https://sites.google.com/view/paclic38), which would be published in the ACL Anthology.
 
 ## Training Data
 R* was trained on a dataset derived from the MS MARCO passage ranking dataset, consisting of 2.5 million query-positive passage pairs and an equal number of query-negative passage pairs, totaling 5 million query-passage pairs. This ensures a balanced training approach, exposing R* to both relevant and irrelevant examples equally.
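
The reranking step that the README describes (scoring each query-passage pair and reordering retrieved passages by relevance) might look roughly like the sketch below. This is only an illustration under stated assumptions: `toy_score_pair` and `rerank` are hypothetical names, and the scorer uses simple token overlap as a stand-in for the R* model, whose loading code and repository id are not given in this diff.

```python
from typing import Callable, List, Optional, Tuple


def toy_score_pair(query: str, passage: str) -> float:
    """Hypothetical stand-in for a relevance model such as R*.

    Returns the fraction of whitespace-separated query tokens that
    also occur in the passage (case-insensitive). A real reranker
    would replace this with a learned relevance score.
    """
    query_tokens = set(query.lower().split())
    passage_tokens = set(passage.lower().split())
    if not query_tokens:
        return 0.0
    return len(query_tokens & passage_tokens) / len(query_tokens)


def rerank(
    query: str,
    passages: List[str],
    score_pair: Callable[[str, str], float] = toy_score_pair,
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score every (query, passage) pair and sort by descending score."""
    scored = [(passage, score_pair(query, passage)) for passage in passages]
    # Python's sort is stable, so ties keep their original retrieval order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored if top_k is None else scored[:top_k]


if __name__ == "__main__":
    retrieved = [
        "Bananas are yellow.",
        "Paris is the capital of France.",
    ]
    for passage, score in rerank("capital of france", retrieved):
        print(f"{score:.2f}  {passage}")
```

Swapping `score_pair` for a function that calls an actual sequence-classification model would leave the sorting logic unchanged, which is why the scorer is passed in as a parameter here.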