GGUF
conversational
krasserm commited on
Commit
2cca08d
·
verified ·
1 Parent(s): b05e5a4

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +91 -3
README.md CHANGED
@@ -1,3 +1,91 @@
1
- ---
2
- license: apache-2.0
3
- ---
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - krasserm/gba-trajectories
5
+ ---
6
+ A planner LLM [fine-tuned on synthetic trajectories](https://krasserm.github.io/2024/05/31/planner-fine-tuning/) from an agent simulation. It can be used in [ReAct](https://arxiv.org/abs/2210.03629)-style LLM agents where [planning is separated from function calling](https://krasserm.github.io/2024/03/06/modular-agent/). Trajectory generation and planner fine-tuning are described in the [bot-with-plan](https://github.com/krasserm/bot-with-plan) project.
7
+
8
+ The planner has been fine-tuned on the [krasserm/gba-trajectories](https://huggingface.co/datasets/krasserm/gba-trajectories) dataset with a [loss over the completion only](https://github.com/krasserm/bot-with-plan/tree/master/train#gba-planner-7b-v02) (i.e. no loss over the prompt). The original QLoRA model is available at [krasserm/gba-planner-7B-completion-only-v0.2](https://huggingface.co/krasserm/gba-planner-7B-completion-only-v0.2).
9
+
10
+ ## Server setup
11
+
12
+ Download model:
13
+
14
+ ```shell
15
+ mkdir -p models
16
+
17
+ wget https://huggingface.co/krasserm/gba-planner-7B-completion-only-v0.2-GGUF/resolve/main/gba-planner-7B-completion-only-v0.2-Q8_0.gguf?download=true \
18
+ -O models/gba-planner-7B-completion-only-v0.2-Q8_0.gguf
19
+ ```
20
+
21
+ Start llama.cpp server:
22
+
23
+ ```shell
24
+ docker run --gpus all --rm -p 8082:8080 -v $(realpath models):/models ghcr.io/ggerganov/llama.cpp:server-cuda--b1-17b291a \
25
+ -m /models/gba-planner-7B-completion-only-v0.2-Q8_0.gguf -c 1024 --n-gpu-layers 33 --host 0.0.0.0 --port 8080
26
+ ```
27
+
28
+ ## Usage example
29
+
30
+ Create a `planner` instance on the client side.
31
+
32
+ ```python
33
+ import json
34
+ from gba.client import ChatClient, LlamaCppClient, MistralInstruct
35
+ from gba.planner import FineTunedPlanner
36
+ from gba.utils import Scratchpad
37
+
38
+ llm = LlamaCppClient(url="http://localhost:8082/completion")
39
+ model = MistralInstruct(llm=llm)
40
+ client = ChatClient(model=model)
41
+ planner = FineTunedPlanner(client=client)
42
+ ```
43
+
44
+ Define a user `request` and past task-observation pairs (`scratchpad`) of the current trajectory.
45
+
46
+ ```python
47
+ request = "Get the average Rotten Tomatoes scores for DreamWorks' last 5 movies."
48
+ scratchpad = Scratchpad()
49
+ scratchpad.add(
50
+ task="Find the last 5 movies released by DreamWorks.",
51
+ result="The last five movies released by DreamWorks are \"The Bad Guys\" (2022), \"Boss Baby: Family Business\" (2021), \"Trolls World Tour\" (2020), \"Abominable\" (2019), and \"How to Train Your Dragon: The Hidden World\" (2019).")
52
+ scratchpad.add(
53
+ task="Search the internet for the Rotten Tomatoes score of \"The Bad Guys\" (2022)",
54
+ result="The Rotten Tomatoes score of \"The Bad Guys\" (2022) is 88%.",
55
+ )
56
+ ```
57
+
58
+ Then generate a plan for the next step in the trajectory.
59
+
60
+ ```python
61
+ result = planner.plan(request=request, scratchpad=scratchpad)
62
+ print(json.dumps(result.to_dict(), indent=2))
63
+ ```
64
+
65
+ ```json
66
+ {
67
+ "context_information_summary": "The last five movies released by DreamWorks are \"The Bad Guys\" (2022), \"Boss Baby: Family Business\" (2021), \"Trolls World Tour\" (2020), \"Abominable\" (2019), and \"How to Train Your Dragon: The Hidden World\" (2019). The Rotten Tomatoes score of \"The Bad Guys\" (2022) is 88%.",
68
+ "thoughts": "Since we already have the Rotten Tomatoes score for \"The Bad Guys\", the next logical step is to find the scores for the remaining movies in the list, starting with \"Boss Baby: Family Business\".",
69
+ "task": "Search the internet for the Rotten Tomatoes score of \"Boss Baby: Family Business\" (2021).",
70
+ "selected_tool": "search_internet"
71
+ }
72
+ ```
73
+
74
+ The planner selects a tool and generates a task for the next step. The task is tool-specific and executed by the tool, in this case the [search_internet](https://github.com/krasserm/bot-with-plan/tree/master/gba/tools/search#search-internet-tool) tool, which results in the next observation on the trajectory. If the `final_answer` tool is selected, a final answer is available or can be generated from the trajectory. The output JSON schema is enforced by the `planner` via [constrained decoding](https://krasserm.github.io/2023/12/18/llm-json-mode/) on the llama.cpp server.
75
+
76
+ ## Tools
77
+
78
+ The planner learned a (static) set of available tools during fine-tuning. These are:
79
+
80
+ | Tool name | Tool description |
81
+ |--------------------|-------------------------------------------------------------------------------------------|
82
+ | `ask_user` | Useful for asking user about information missing in the request. |
83
+ | `calculate_number` | Useful for numerical tasks that result in a single number. |
84
+ | `create_event` | Useful for adding a single entry to my calendar at given date and time. |
85
+ | `search_wikipedia` | Useful for searching factual information in Wikipedia. |
86
+ | `search_internet` | Useful for up-to-date information on the internet. |
87
+ | `send_email` | Useful for sending an email to a single recipient. |
88
+ | `use_bash` | Useful for executing commands in a Linux bash. |
89
+ | `final_answer` | Useful for providing the final answer to a request. Must always be used in the last step. |
90
+
91
+ The framework provided by the [bot-with-plan](https://github.com/krasserm/bot-with-plan) project can easily be adjusted to a different set of tools for specialization to other application domains.