Ascol57 · Commit 8f5d160 (verified) · Parent: 9afd745

Update README.md

Files changed (1): README.md (+5, −141)
# Give your local LLM the ability to search the web!

This project gives local LLMs the ability to search the web by outputting a specific
command. Once the command has been found in the model output using a regular expression,
[duckduckgo-search](https://pypi.org/project/duckduckgo-search/)
is used to search the web and return a number of result pages. Finally, an
ensemble of LangChain's [Contextual compression](https://python.langchain.com/docs/modules/data_connection/retrievers/contextual_compression/) and
[Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25) (or, alternatively, [SPLADE](https://github.com/naver/splade))
is used to extract the relevant parts (if any) of each web page in the search results,
and the results are appended to the model's output.

![llm_websearch](https://github.com/mamei16/LLM_Web_search/assets/25900898/f9d2d83c-e3cf-4f69-91c2-e9c3fe0b7d89)

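To make the pipeline concrete, here is a minimal sketch of the same idea using the `duckduckgo_search` and `rank_bm25` packages. It is an illustration only, not the extension's actual implementation; the example command, result count, and whitespace tokenization are assumptions:

```python
import re

from duckduckgo_search import DDGS  # pip install duckduckgo-search
from rank_bm25 import BM25Okapi     # pip install rank-bm25

model_output = 'Let me check. Search_web("okapi bm25 ranking function")'

# 1. Extract the search query from the model output via the capture group.
match = re.search(r'Search_web\("(.*)"\)', model_output)
if match:
    query = match.group(1)

    # 2. Retrieve a handful of result snippets from DuckDuckGo.
    results = DDGS().text(query, max_results=5)
    documents = [r["body"] for r in results]

    # 3. Rank the snippets against the query with Okapi BM25.
    bm25 = BM25Okapi([doc.lower().split() for doc in documents])
    top_docs = bm25.get_top_n(query.lower().split(), documents, n=3)

    # 4. Append the most relevant snippets to the model's output.
    print(model_output + "\n\nSearch results:\n" + "\n".join(top_docs))
```
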
* **[Table of Contents](#table-of-contents)**
  * [Installation](#installation)
  * [Usage](#usage)
    + [Using a custom regular expression](#using-a-custom-regular-expression)
    + [Reading web pages](#reading-web-pages)
  * [Search backends](#search-backends)
    + [DuckDuckGo](#duckduckgo)
    + [SearXNG](#searxng)
    + [Search parameters](#search-parameters)
  * [Keyword retrievers](#keyword-retrievers)
    + [Okapi BM25](#okapi-bm25)
    + [SPLADE](#splade)
  * [Recommended models](#recommended-models)

## Installation
1. Go to the "Session" tab of the web UI and use "Install or update an extension"
to download the latest code for this extension.
2. To install the extension's dependencies, you have two options:
   1. **The easy way:** Run the appropriate `update_wizard` script inside the text-generation-webui folder
      and choose `Install/update extensions requirements`. This installs everything using `pip`,
      which means using the unofficial `faiss-cpu` package. Therefore, it is not guaranteed to
      work with your system (see [the official disclaimer](https://github.com/facebookresearch/faiss/wiki/Installing-Faiss#why-dont-you-support-installing-via-xxx-)).
   2. **The safe way:** Manually update the conda environment in which you installed the dependencies of
      [oobabooga's text-generation-webui](https://github.com/oobabooga/text-generation-webui).
      Open the subfolder `text-generation-webui/extensions/LLM_Web_search` in a terminal or conda shell.
      If you used the one-click install method, run the command
      `conda env update -p <path_to_your_environment> --file environment.yml`,
      where you need to replace `<path_to_your_environment>` with the path to the
      `/installer_files/env` subfolder within the text-generation-webui folder.
      Otherwise, if you made your own environment, use
      `conda env update -n <name_of_your_environment> --file environment.yml`.
      (NB: Solving the environment can take a while.)
3. Launch the web UI with:
```python server.py --extension LLM_Web_search```

If the installation was successful and the extension was loaded, a new tab with the
title "LLM Web Search" should be visible in the web UI.

See https://github.com/oobabooga/text-generation-webui/wiki/07-%E2%80%90-Extensions for more
information about extensions.

## Usage

Search queries are extracted from the model's output using a regular expression. This is made easier by prompting the model
to use a fixed search command (see `system_prompts/` for example prompts; a minimal hypothetical example follows the workflow below).
Currently, only a single search query per model chat message is supported.

An example workflow for using this extension:
1. Load a model
2. Load a matching instruction template
3. Head over to the "LLM Web search" tab
4. Load a custom system message/prompt
5. Ensure that the query part of the command mentioned in the system message
   can be matched using the current "Search command regex string"
   (see "Using a custom regular expression" below)
6. Pick a hyperparameter generation preset that works well for you
7. Choose "chat-instruct" or "instruct" mode and start chatting

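As a reference point, a system message along the following lines would pair with the default search command regex. This is a hypothetical example, not one of the prompts shipped in `system_prompts/`:

```
You can search the web. To do so, emit a line of the form
Search_web("your query here")
and then stop and wait for the search results.
```
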
### Using a custom regular expression
The default regular expression is:
```regexp
Search_web\("(.*)"\)
```
Here, `Search_web` is the search command, and everything between the quotation marks
inside the parentheses will be used as the search query. Every custom regular expression must use a
[capture group](https://www.regular-expressions.info/brackets.html) to extract the search
query. I recommend https://www.debuggex.com/ for trying out custom regular expressions. If a regex
fulfills the requirement above, the search query should be matched by "Group 1" in Debuggex.

Here is an example of a more flexible, but more complex, regex that works for several
different models:
```regexp
[Ss]earch_web\((?:["'])(.*)(?:["'])\)
```
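
You can also test a custom pattern outside the web UI with Python's `re` module. This snippet (illustrative only) shows that the query lands in capture group 1 for the flexible pattern above:

```python
import re

# The more flexible pattern from above; note the single capture group.
pattern = r'''[Ss]earch_web\((?:["'])(.*)(?:["'])\)'''

model_output = "I'll look that up: search_web('SearXNG JSON API')"

match = re.search(pattern, model_output)
if match:
    print(match.group(1))  # -> SearXNG JSON API
```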
### Reading web pages
Experimental support exists for extracting the full text content from a web page. The default regex to use this
functionality is:
```regexp
Open_url\("(.*)"\)
```
**Note**: The full content of a web page is likely to exceed the maximum context length of your average local LLM.
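
For a rough idea of what this step involves, here is an illustrative sketch using `requests` and `BeautifulSoup` rather than the extension's actual extraction code, with a crude character cutoff to respect the context-length caveat above:

```python
import re

import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

model_output = 'Open_url("https://en.wikipedia.org/wiki/Okapi_BM25")'
url = re.search(r'Open_url\("(.*)"\)', model_output).group(1)

html = requests.get(url, timeout=10).text
text = BeautifulSoup(html, "html.parser").get_text(separator="\n", strip=True)

# Crude guard: keep only the first few thousand characters so the page
# has a chance of fitting into a small context window.
print(text[:4000])
```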
## Search backends

### DuckDuckGo
This is the default web search backend.

### SearXNG

Rudimentary support exists for SearXNG. To use a local or remote
SearXNG instance instead of DuckDuckGo, simply paste the URL into the
"SearXNG URL" text field of the "LLM Web Search" settings tab. The instance must support
returning results in JSON format (for self-hosted instances, this typically means adding
`json` to the `formats` list in the instance's `settings.yml`).

#### Search parameters
To modify the categories, engines, languages, etc. that should be used for a
specific query, the query must follow the
[SearXNG search syntax](https://docs.searxng.org/user/search-syntax.html). Currently,
automatic redirect and Special Queries are not supported.

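For example, assuming the default search command regex, a model output like the following (hypothetical query; available engines and shortcuts depend on your instance) would restrict the search to English-language results from the Wikipedia engine:

```
Search_web(":en !wikipedia okapi")
```
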
## Keyword retrievers
### Okapi BM25
This extension comes out of the box with
[Okapi BM25](https://en.wikipedia.org/wiki/Okapi_BM25) enabled, which is widely used and very popular
for keyword-based document retrieval. It runs on the CPU and,
for the purpose of this extension, it is fast.
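
For reference, the standard Okapi BM25 scoring function ranks a document $D$ against a query $Q$ containing terms $q_1, \ldots, q_n$ as:

```math
\mathrm{score}(D, Q) = \sum_{i=1}^{n} \mathrm{IDF}(q_i) \cdot \frac{f(q_i, D) \cdot (k_1 + 1)}{f(q_i, D) + k_1 \cdot \left(1 - b + b \cdot \frac{|D|}{\mathrm{avgdl}}\right)}
```

where $f(q_i, D)$ is the term frequency of $q_i$ in $D$, $|D|$ is the document length in words, $\mathrm{avgdl}$ is the average document length in the corpus, and $k_1$ and $b$ are free parameters (commonly $k_1 \in [1.2, 2.0]$ and $b = 0.75$).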
### SPLADE
If you don't run the extension in "CPU only" mode and have some VRAM to spare,
you can also select [SPLADE](https://github.com/naver/splade) in the "Advanced settings" section
as an alternative. It has been [shown](https://arxiv.org/pdf/2207.03834.pdf) to outperform BM25 in multiple benchmarks,
and it uses a technique called "query expansion" to add additional contextually relevant words
to the original query. However, it is slower than BM25. You can read more about it [here](https://www.pinecone.io/learn/splade/).

To use SPLADE, you have to install the additional dependency [qdrant-client](https://github.com/qdrant/qdrant-client).
Simply activate the conda environment of the main web UI and run
`pip install qdrant-client`.

To improve performance, documents are embedded in batches and in parallel. Increasing the
"SPLADE batch size" setting improves performance up to a certain point,
but VRAM usage ramps up quickly with increasing batch size. A batch size of 8 appears
to be a good trade-off, but the default value is 2 to avoid running out of memory on smaller
GPUs.

## Recommended models
If you (like me) have ≤ 12 GB VRAM, I recommend using
[Llama-3-8B-instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct).
You can find a matching instruction template in the extension's `instruction_templates`
folder. Simply copy it to the main web UI's `instruction-templates` folder.

**Note:** Several existing GGUF versions have a stop token issue, which can be solved by [editing the file's
metadata](https://www.reddit.com/r/LocalLLaMA/comments/1c7dkxh/tutorial_how_to_make_llama3instruct_ggufs_less/). A GGUF version where this issue has already been fixed can be found
[here](https://huggingface.co/AI-Engine/Meta-Llama-3-8B-Instruct-GGUF/blob/main/Meta-Llama-3-8B-Instruct.Q5_k_m_with_temp_stop_token_fix.gguf).
 