steffenc committed
Commit 1f6940d · unverified · 2 parents: 89b6d9e dbcaf0f

Merge pull request #3 from macrocosm-os/features/random-stream
README.md CHANGED
@@ -1,18 +1,125 @@
- # chattensor-backend
- Backend for Chattensor app
-
- To run, you will need a bittensor wallet which is registered to the relevant subnet (1@mainnet or 61@testnet).
-
+ <picture>
+     <source srcset="./assets/macrocosmos-white.png" media="(prefers-color-scheme: dark)">
+     <img src="./assets/macrocosmos-white.png">
+ </picture>
+
+ <picture>
+     <source srcset="./assets/macrocosmos-black.png" media="(prefers-color-scheme: light)">
+     <img src="./assets/macrocosmos-black.png">
+ </picture>
+
+ <br/>
+ <br/>
+ <br/>
+
+ # Subnet 1 API
+ > Note: This project is still in development and is not yet ready for production use.
+
+ The official REST API for Bittensor's flagship subnet 1 ([prompting](https://github.com/opentensor/prompting)), built by [Macrocosmos](https://macrocosmos.ai).
+
+ Subnet 1 is a decentralized, open-source network containing around 1,000 highly capable LLM agents. These agents can perform a wide range of tasks, from simple math problems to complex natural language processing. As subnet 1 is constantly evolving, its capabilities are always expanding. Our goal is to provide a world-class inference engine, to be used by developers and researchers alike.
+
+ This API is designed to power applications and facilitate interaction between subnets by providing a simple, easy-to-use interface for developers which enables:
+ 1. **Conversation**: Chatting with the network (streaming and non-streaming)
+ 2. **Data cleaning**: Filtering empty and otherwise useless responses
+ 3. **Advanced inference**: Providing enhanced responses using SOTA ensembling techniques (WIP)
+
+ Validators can use this API to interact with the network and perform various tasks.
+ To run an API server, you will need a bittensor wallet which is registered as a validator on the relevant subnet (1@mainnet or 61@testnet).
+
+ NOTE: At present, miners are choosing not to stream their responses to the network. This means that the server cannot provide a streamed response to the client until the miner has finished processing the request. This is a temporary measure and will be resolved in the future.
+
+ ## How it works
+ The API server is a RESTful API that provides endpoints for interacting with the network. It is a simple [wrapper](./validators/sn1_validator_wrapper.py) around your subnet 1 validator, which uses the dendrite to query miners.
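+
+ In rough outline, a request flows like this (a simplified sketch, not the exact code; the `QueryValidatorParams(**...)` mapping is hypothetical and stands in for the request parsing done in `server.py` and the middlewares):
+
+ ```python
+ # Sketch: server.py wires aiohttp routes to the validator wrapper,
+ # and the wrapper queries miners through the validator's dendrite.
+ from aiohttp import web
+ from validators import S1ValidatorAPI, QueryValidatorParams
+
+ validator = S1ValidatorAPI()  # wraps your running subnet 1 validator
+
+ async def chat(request: web.Request) -> web.StreamResponse:
+     # Hypothetical mapping; the real field handling lives in server.py
+     params = QueryValidatorParams(**(await request.json()))
+     return await validator.query_validator(params)
+ ```
+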
  ## Install
- Create a new python environment and install the dependencies with the command
+ Create a new python environment and install the dependencies with the following commands.
+
+ (First time only)
  ```bash
+ python3.10 -m venv env
+ source env/bin/activate
  pip install -r requirements.txt
  ```
+
+ > Note: This project requires Python >= 3.10.
  > Note: Currently the prompting library is only installable on machines with CUDA devices (NVIDIA GPU).

+ ## Run
+
+ First, activate the virtual environment:
+
+ ```bash
+ source env/bin/activate
+ ```
+
+ Then run an API server on subnet 1 with the following command:
+
+ ```bash
+ EXPECTED_ACCESS_KEY=<ACCESS_KEY> python server.py --wallet.name <WALLET_NAME> --wallet.hotkey <WALLET_HOTKEY> --netuid <NETUID> --neuron.model_id mock --neuron.tasks math --neuron.task_p 1 --neuron.device cpu
+ ```
+
+ These flags ensure that the server uses no GPU memory and does not load the large models used by the incentive mechanism.
+
+ > Note: This command is subject to change as the project evolves.
+
+ We recommend running the server with a process manager like PM2, which keeps the server running and restarts it if it crashes.
+
+ ```bash
+ EXPECTED_ACCESS_KEY=<ACCESS_KEY> pm2 start server.py --interpreter python3 --name sn1-api -- --wallet.name <WALLET_NAME> --wallet.hotkey <WALLET_HOTKEY> --netuid <NETUID> --neuron.model_id mock --neuron.tasks math --neuron.task_p 1 --neuron.device cpu
+ ```
+
+ ## API Usage
+ At present, the API provides two endpoints: `/chat` (live) and `/echo` (test).
+
+ `/chat` is used to chat with the network and receive a response. The endpoint requires a JSON payload with the following fields:
+ - `k: int`: The number of responses to return
+ - `timeout: float`: The time in seconds to wait for a response
+ - `roles: List[str]`: The roles of the agents to query
+ - `messages: List[str]`: The messages to send to the network
+ - `prefer: str`: The preferred response to use as the default view. Should be one of `{'longest', 'shortest'}`
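+
+ For example, a request body might look like this (values are illustrative):
+
+ ```json
+ {
+     "k": 5,
+     "timeout": 15,
+     "roles": ["user"],
+     "messages": ["What is the capital of Texas?"],
+     "prefer": "longest"
+ }
+ ```
+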
+ Responses from the `/chat` endpoint are streamed back to the client as they are received from the network. Upon completion, the server appends a final JSON chunk to the stream with the following fields:
+ - `streamed_chunks: List[str]`: The streamed responses from the network
+ - `streamed_chunks_timings: List[float]`: The time taken to receive each streamed response
+ - `synapse: StreamPromptingSynapse`: The synapse used to query the network. This contains full context and metadata about the query.
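+
+ For example, a minimal Python client might consume the stream like this (a sketch only; it assumes the server is running locally on port 10000 and that `<ACCESS_KEY>` matches the server's `EXPECTED_ACCESS_KEY`):
+
+ ```python
+ import requests
+
+ payload = {"k": 5, "timeout": 15, "roles": ["user"], "messages": ["Hello!"]}
+
+ # Stream the response: plain-text chunks arrive first, followed by the
+ # final JSON chunk described above.
+ with requests.post(
+     "http://0.0.0.0:10000/chat/",
+     headers={"api_key": "<ACCESS_KEY>"},
+     json=payload,
+     stream=True,
+ ) as resp:
+     for chunk in resp.iter_content(chunk_size=None, decode_unicode=True):
+         print(chunk, end="", flush=True)
+ ```
+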
+ ## Testing
+
+ To test the API locally, you can use the following curl command:
+
+ ```bash
+ curl --no-buffer -X POST http://0.0.0.0:10000/chat/ -H "api_key: <ACCESS_KEY>" -d '{"k": 5, "timeout": 15, "roles": ["user"], "messages": ["What is today'\''s date?"]}'
+ ```
+ > Note: Use the `--no-buffer` flag to ensure that the response is streamed back to the client.
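+
+ The final JSON chunk follows the `TextStreamResponse` schema defined in `responses.py`; illustratively (all values made up), the tail of a streamed response looks something like:
+
+ ```json
+ {"streamed_chunks": ["Today ", "is ", "..."], "streamed_chunks_timings": [0.21, 0.35, 0.48], "uid": 42, "completion": "Today is ...", "timing": 3.2}
+ ```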
+
+ After verifying that the server responds to requests locally, you can test it from a remote machine.
+
+ ### Troubleshooting
+
+ If you do not receive a response from the server, check that the server is running and that the port is open on the server. You can open the port with the following command:
+
+ ```bash
+ sudo ufw allow 10000/tcp
+ ```
+
+ ---
+
+ ## Contributing
+ If you would like to contribute to the project, please read the [CONTRIBUTING.md](CONTRIBUTING.md) file for more information.
+
+ You can find out more about the project by visiting the [Macrocosmos website](https://macrocosmos.ai) or by joining us on our social channels:
+
+ ![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?style=for-the-badge&logo=discord&logoColor=white)
+ [![Substack](https://img.shields.io/badge/Substack-%23006f5c.svg?style=for-the-badge&logo=substack&logoColor=FF6719)](https://substack.com/@macrocosmosai)
+ [![Twitter](https://img.shields.io/badge/Twitter-%231DA1F2.svg?style=for-the-badge&logo=twitter&logoColor=white)](https://twitter.com/MacrocosmosAI)
+ [![X](https://img.shields.io/badge/X-%23000000.svg?style=for-the-badge&logo=X&logoColor=white)](https://twitter.com/MacrocosmosAI)
+ [![LinkedIn](https://img.shields.io/badge/LinkedIn-0077B5?logo=linkedin&logoColor=white)](https://www.linkedin.com/in/MacrocosmosAI)
+ [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

assets/macrocosmos-black.png ADDED
assets/macrocosmos-white.png ADDED
responses.py ADDED
@@ -0,0 +1,27 @@
+ from pydantic import BaseModel, Field
+ from typing import List, Dict, Any
+
+
+ class TextStreamResponse(BaseModel):
+     streamed_chunks: List[str] = Field(
+         default_factory=list, description="List of streamed chunks."
+     )
+     streamed_chunks_timings: List[float] = Field(
+         default_factory=list, description="List of streamed chunk timings, in seconds."
+     )
+     uid: int = Field(0, description="UID of the queried miner.")
+     completion: str = Field(
+         "", description="The final completed string from the stream."
+     )
+     timing: float = Field(
+         0, description="Total timing of the request, in seconds."
+     )
+
+     def to_dict(self):
+         return {
+             "streamed_chunks": self.streamed_chunks,
+             "streamed_chunks_timings": self.streamed_chunks_timings,
+             "uid": self.uid,
+             "completion": self.completion,
+             "timing": self.timing,
+         }
server.py CHANGED
@@ -2,41 +2,11 @@ import asyncio
  import utils
  import bittensor as bt
  from aiohttp import web
- from aiohttp.web_response import Response
  from validators import S1ValidatorAPI, QueryValidatorParams, ValidatorAPI
  from middlewares import api_key_middleware, json_parsing_middleware

- """
- # test
- ```
- curl -X POST http://0.0.0.0:10000/chat/ -H "api_key: hello" -d '{"k": 5, "timeout": 3, "roles": ["user"], "messages": ["hello world"]}'
-
- curl -X POST http://0.0.0.0:10000/chat/ -H "api_key: hey-michal" -d '{"k": 5, "timeout": 3, "roles": ["user"], "messages": ["on what exact date did the 21st century begin?"]}'
-
- # stream
- curl --no-buffer -X POST http://129.146.127.82:10000/echo/ -H "api_key: hey-michal" -d '{"k": 3, "timeout": 0.2, "roles": ["user"], "messages": ["i need to tell you something important but first"]}'
- ```
-
- TROUBLESHOOT
- check if port is open
- ```
- sudo ufw allow 10000/tcp
- sudo ufw allow 10000/tcp
- ```
- # run
- ```
- EXPECTED_ACCESS_KEY="hey-michal" pm2 start app.py --interpreter python3 --name app -- --neuron.model_id mock --wallet.name sn1 --wallet.hotkey v1 --netuid 1 --neuron.tasks math --neuron.task_p 1 --neuron.device cpu
- ```
-
- basic testing
- ```
- EXPECTED_ACCESS_KEY="hey-michal" python app.py --neuron.model_id mock --wallet.name sn1 --wallet.hotkey v1 --netuid 1 --neuron.tasks math --neuron.task_p 1 --neuron.device cpu
- ```
- add --mock to test the echo stream
- """
-
-
- async def chat(request: web.Request) -> Response:
+
+ async def chat(request: web.Request) -> web.StreamResponse:
      """
      Chat endpoint for the validator.
      """
@@ -49,9 +19,8 @@ async def chat(request: web.Request) -> Response:
      return response


- async def echo_stream(request, request_data):
-     request_data = request["data"]
-     return await utils.echo_stream(request_data)
+ async def echo_stream(request: web.Request) -> web.StreamResponse:
+     return await utils.echo_stream(request)


  class ValidatorApplication(web.Application):
utils.py CHANGED
@@ -1,8 +1,10 @@
  import re
- import bittensor as bt
  import time
  import json
+ import asyncio
+ import bittensor as bt
  from aiohttp import web
+ from responses import TextStreamResponse
  from collections import Counter
  from prompting.rewards import DateRewardModel, FloatDiffModel

@@ -134,47 +136,45 @@ def guess_task_name(challenge: str):
      return "qa"


- async def echo_stream(request_data: dict):
+ async def echo_stream(request: web.Request) -> web.StreamResponse:
+     request_data = request["data"]
      k = request_data.get("k", 1)
-     exclude = request_data.get("exclude", [])
-     timeout = request_data.get("timeout", 0.2)
      message = "\n\n".join(request_data["messages"])

      # Create a StreamResponse
      response = web.StreamResponse(
-         status=200, reason="OK", headers={"Content-Type": "text/plain"}
+         status=200, reason="OK", headers={"Content-Type": "application/json"}
      )
-     await response.prepare()
+     await response.prepare(request)

      completion = ""
+     chunks = []
+     chunks_timings = []
+     start_time = time.time()
      # Echo the message k times with a timeout between each chunk
      for _ in range(k):
          for word in message.split():
              chunk = f"{word} "
              await response.write(chunk.encode("utf-8"))
              completion += chunk
-             time.sleep(timeout)
+             await asyncio.sleep(0.3)
              bt.logging.info(f"Echoed: {chunk}")

+             chunks.append(chunk)
+             chunks_timings.append(time.time() - start_time)
+
      completion = completion.strip()

      # Prepare final JSON chunk
-     json_chunk = json.dumps(
-         {
-             "uids": [0],
-             "completion": completion,
-             "completions": [completion.strip()],
-             "timings": [0],
-             "status_messages": ["Went well!"],
-             "status_codes": [200],
-             "completion_is_valid": [True],
-             "task_name": "echo",
-             "ensemble_result": {},
-         }
-     )
+     response_data = TextStreamResponse(
+         streamed_chunks=chunks,
+         streamed_chunks_timings=chunks_timings,
+         completion=completion,
+         timing=time.time() - start_time,
+     ).to_dict()

      # Send the final JSON as part of the stream
-     await response.write(f"\n\nJSON_RESPONSE_BEGIN:\n{json_chunk}".encode("utf-8"))
+     await response.write(json.dumps(response_data).encode("utf-8"))

      # Finalize the response
      await response.write_eof()
validators/base.py CHANGED
@@ -1,7 +1,7 @@
  from abc import ABC, abstractmethod
  from typing import List
  from dataclasses import dataclass
- from aiohttp.web import Response, Request
+ from aiohttp.web import Response, Request, StreamResponse


  @dataclass
@@ -31,10 +31,10 @@ class QueryValidatorParams:

  class ValidatorAPI(ABC):
      @abstractmethod
-     async def query_validator(self, params: QueryValidatorParams) -> Response:
+     async def query_validator(self, params: QueryValidatorParams) -> StreamResponse:
          pass


  class MockValidator(ValidatorAPI):
-     async def query_validator(self, params: QueryValidatorParams) -> Response:
+     async def query_validator(self, params: QueryValidatorParams) -> StreamResponse:
          ...
validators/sn1_validator_wrapper.py CHANGED
@@ -2,7 +2,8 @@ import json
  import utils
  import torch
  import traceback
- import asyncio
+ import time
+ import random
  import bittensor as bt
  from typing import Awaitable
  from prompting.validator import Validator
@@ -12,6 +13,16 @@ from prompting.dendrite import DendriteResponseEvent
  from .base import QueryValidatorParams, ValidatorAPI
  from aiohttp.web_response import Response, StreamResponse
  from deprecated import deprecated
+ from dataclasses import dataclass
+ from typing import List
+ from responses import TextStreamResponse
+
+
+ @dataclass
+ class ProcessedStreamResponse:
+     streamed_chunks: List[str]
+     streamed_chunks_timings: List[float]
+     synapse: StreamPromptingSynapse


  class S1ValidatorAPI(ValidatorAPI):
@@ -75,27 +86,39 @@ class S1ValidatorAPI(ValidatorAPI):
          return Response(status=500, reason="Internal error")

      async def process_response(
-         self, response: StreamResponse, uid: int, async_generator: Awaitable
-     ):
+         self, response: StreamResponse, async_generator: Awaitable
+     ) -> ProcessedStreamResponse:
          """Process a single response asynchronously."""
-         try:
-             chunk = None  # Initialize chunk with a default value
-             async for chunk in async_generator:  # most important loop, as this is where we acquire the final synapse.
-                 bt.logging.debug(f"\nchunk for uid {uid}: {chunk}")
-
-             # TODO: SET PROPER IMPLEMENTATION TO RETURN CHUNK
-             if chunk is not None:
-                 json_data = json.dumps(chunk)
-                 await response.write(json_data.encode("utf-8"))
-
-         except Exception as e:
-             bt.logging.error(
-                 f"Encountered an error in {self.__class__.__name__}:get_stream_response:\n{traceback.format_exc()}"
-             )
-             response.set_status(500, reason="Internal error")
-             await response.write(json.dumps({"error": str(e)}).encode("utf-8"))
-         finally:
-             await response.write_eof()  # Ensure to close the response properly
+         # Initialize chunk with a default value
+         chunk = None
+         # Accumulate streamed chunks and their timings
+         chunks = []
+         chunks_timings = []
+
+         start_time = time.time()
+         last_sent_index = 0
+         async for chunk in async_generator:
+             if isinstance(chunk, list):
+                 # Chunks are currently returned as string arrays, so we need to concatenate them
+                 concatenated_chunks = "".join(chunk)
+                 new_data = concatenated_chunks[last_sent_index:]
+
+                 if new_data:
+                     await response.write(new_data.encode("utf-8"))
+                     bt.logging.info(f"Received new chunk from miner: {chunk}")
+                     last_sent_index += len(new_data)
+                     chunks.extend(chunk)
+                     chunks_timings.append(time.time() - start_time)
+
+         if chunk is not None and isinstance(chunk, StreamPromptingSynapse):
+             # The last value yielded should be a synapse with the completion filled
+             return ProcessedStreamResponse(
+                 synapse=chunk,
+                 streamed_chunks=chunks,
+                 streamed_chunks_timings=chunks_timings,
+             )
+         else:
+             raise ValueError("The last chunk is not a StreamPromptingSynapse")

      async def get_stream_response(self, params: QueryValidatorParams) -> StreamResponse:
          response = StreamResponse(status=200, reason="OK")
@@ -105,7 +128,7 @@ class S1ValidatorAPI(ValidatorAPI):

          try:
              # Guess the task name of current request
-             task_name = utils.guess_task_name(params.messages[-1])
+             # task_name = utils.guess_task_name(params.messages[-1])

              # Get the list of uids to query for this step.
              uids = get_random_uids(
@@ -115,6 +138,8 @@ class S1ValidatorAPI(ValidatorAPI):

              # Make calls to the network with the prompt.
              bt.logging.info(f"Calling dendrite")
+             start_time = time.time()
+
              streams_responses = await self.validator.dendrite(
                  axons=axons,
                  synapse=StreamPromptingSynapse(
@@ -125,13 +150,24 @@ class S1ValidatorAPI(ValidatorAPI):
                  streaming=True,
              )

-             tasks = [
-                 self.process_response(uid, res)
-                 for uid, res in dict(zip(uids, streams_responses))
-             ]
-             results = await asyncio.gather(*tasks, return_exceptions=True)
-
-             # TODO: Continue implementation, business decision needs to be made on how to handle the results
+             uid_stream_dict = dict(zip(uids, streams_responses))
+
+             # Pick a single random miner stream to forward to the client
+             random_uid, random_stream = random.choice(list(uid_stream_dict.items()))
+             processed_response = await self.process_response(response, random_stream)
+
+             # Prepare final JSON chunk
+             response_data = TextStreamResponse(
+                 streamed_chunks=processed_response.streamed_chunks,
+                 streamed_chunks_timings=processed_response.streamed_chunks_timings,
+                 uid=random_uid,
+                 completion=processed_response.synapse.completion,
+                 timing=time.time() - start_time,
+             ).to_dict()
+
+             # Send the final JSON as part of the stream
+             await response.write(json.dumps(response_data).encode("utf-8"))
          except Exception as e:
              bt.logging.error(
                  f"Encountered an error in {self.__class__.__name__}:get_stream_response:\n{traceback.format_exc()}"
@@ -144,11 +180,4 @@ class S1ValidatorAPI(ValidatorAPI):
          return response

      async def query_validator(self, params: QueryValidatorParams) -> Response:
-         # TODO: SET STREAM AS DEFAULT
-         stream = params.request.get("stream", False)
-
-         if stream:
-             return await self.get_stream_response(params)
-         else:
-             # DEPRECATED
-             return await self.get_response(params)
+         return await self.get_stream_response(params)