Update README.md
README.md
CHANGED
@@ -7,4 +7,17 @@ sdk: static
 pinned: false
 ---
 
-
+To test your RAG solution, it helps to have a dataset that consists of a text corpus,
+correct responses to queries (e.g. question-answer pairs) to test the solution end-to-end, and ideally a set of relevant passages
+from the text corpus for each query to test the retrieval component separately as well.
+We call this a question-answer-passages dataset.
+
+There are plenty of large-scale datasets of this kind, such as [Google's Natural Questions](https://ai.google.com/research/NaturalQuestions/).
+
+Still, we lack datasets of this kind that are **small-scale** and **narrow-domain**, for testing a RAG solution quickly or seeing how it performs
+in a certain domain context.
+
+We created this space to build a collection of such datasets to boost the development of RAG solutions.
+
+Datasets consist of:
+* asdf