zstanjj committed on
Commit
953999f
•
1 Parent(s): f4c3753

Update README.md

  1. README.md +3 -8
README.md CHANGED
@@ -7,17 +7,12 @@ license: apache-2.0
  ---
 
 
-
- ## ✨ Latest News
-
- - [11/06/2024]: Our paper is available on arXiv. You can access it [here](https://arxiv.org/abs/2411.02959).
- - [11/05/2024]: The open-source toolkit and models are released. You can apply HtmlRAG in your own RAG systems now.
-
-
  ## Model Information
 
+ We release the HTML pruner model used in **HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieval Results in RAG Systems**.
+
  <p align="left">
- • 📝 <a href="https://arxiv.org/abs/2411.02959" target="_blank">Paper</a> • 🤗 <a href="https://huggingface.co/zstanjj/SlimPLM-Query-Rewriting/" target="_blank">Hugging Face</a> • 🧩 <a href="https://github.com/plageon/SlimPLM" target="_blank">Github</a>
+ Useful links: 📝 <a href="https://arxiv.org/abs/2411.02959" target="_blank">Paper</a> • 🤗 <a href="https://huggingface.co/zstanjj/SlimPLM-Query-Rewriting/" target="_blank">Hugging Face</a> • 🧩 <a href="https://github.com/plageon/SlimPLM" target="_blank">Github</a>
  </p>
 
  We propose HtmlRAG, which uses HTML instead of plain text as the format of external knowledge in RAG systems. To tackle the long context brought by HTML, we propose **Lossless HTML Cleaning** and **Two-Step Block-Tree-Based HTML Pruning**.