dalyzhou committed
Commit 32c9a94 · 1 Parent(s): df77864

Add application files

Files changed (4)
  1. Dockerfile +16 -0
  2. README.md +50 -7
  3. app.py +1783 -0
  4. requirements.txt +6 -0
Dockerfile ADDED
@@ -0,0 +1,16 @@
+ # Read the doc: https://huggingface.co/docs/hub/spaces-sdks-docker
+ # you will also find guides on how best to write your Dockerfile
+
+ FROM python:3.9
+
+ RUN useradd -m -u 1000 user
+ USER user
+ ENV PATH="/home/user/.local/bin:$PATH"
+
+ WORKDIR /app
+
+ COPY --chown=user ./requirements.txt requirements.txt
+ RUN pip install --no-cache-dir --upgrade -r requirements.txt
+
+ COPY --chown=user . /app
+ CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "7860"]
README.md CHANGED
@@ -1,11 +1,54 @@
  ---
- title: Ki2api
- emoji: 👁
- colorFrom: pink
- colorTo: yellow
  sdk: docker
- pinned: false
- license: mit
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: Ki2API - Claude Sonnet 4 OpenAI Compatible API
+ emoji: 🤖
+ colorFrom: blue
+ colorTo: purple
  sdk: docker
+ app_port: 7860
  ---

+ # Ki2API - Claude Sonnet 4 OpenAI Compatible API
+
+ OpenAI-compatible API for Claude Sonnet 4 via AWS CodeWhisperer. The service provides streaming, tool calls, and access to multiple models through the familiar OpenAI API interface.
+
+ ## Features
+
+ - 🔄 **Streaming Support**: Real-time response streaming
+ - 🛠️ **Tool Calls**: Function calling capabilities
+ - 🎯 **Multiple Models**: Support for Claude Sonnet 4 and Claude 3.5 Haiku
+ - 🔧 **XML Tool Parsing**: Advanced tool call parsing
+ - 🔄 **Auto Token Refresh**: Automatic authentication token management
+ - 🛡️ **Null Content Handling**: Robust message processing
+ - 🔍 **Tool Call Deduplication**: Prevents duplicate function calls
+
+ ## API Endpoints
+
+ - `GET /v1/models` - List available models
+ - `POST /v1/chat/completions` - Create chat completions
+ - `GET /health` - Health check
+ - `GET /` - Service information
+
+ ## Environment Variables
+
+ Required environment variables (see the example run below):
+ - `API_KEY` - Bearer token for API authentication (default: ki2api-key-2024)
+ - `KIRO_ACCESS_TOKEN` - Kiro access token
+ - `KIRO_REFRESH_TOKEN` - Kiro refresh token
+
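+ A minimal local-run sketch (the image tag and token values here are placeholders, not part of this repo):
+
+ ```bash
+ docker build -t ki2api .
+ docker run -p 7860:7860 \
+   -e API_KEY=ki2api-key-2024 \
+   -e KIRO_ACCESS_TOKEN=your-access-token \
+   -e KIRO_REFRESH_TOKEN=your-refresh-token \
+   ki2api
+ ```
+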
+ ## Usage
+
+ ```bash
+ curl -X POST https://your-space-url/v1/chat/completions \
+   -H "Authorization: Bearer ki2api-key-2024" \
+   -H "Content-Type: application/json" \
+   -d '{
+     "model": "claude-sonnet-4-20250514",
+     "messages": [
+       {"role": "user", "content": "Hello!"}
+     ]
+   }'
+ ```
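+
+ Because the API is OpenAI-compatible, the official OpenAI Python client should also work against it (a minimal sketch; the base URL is a placeholder for your Space URL):
+
+ ```python
+ from openai import OpenAI
+
+ # Point the client at the Space instead of api.openai.com
+ client = OpenAI(
+     base_url="https://your-space-url/v1",
+     api_key="ki2api-key-2024",
+ )
+
+ response = client.chat.completions.create(
+     model="claude-sonnet-4-20250514",
+     messages=[{"role": "user", "content": "Hello!"}],
+ )
+ print(response.choices[0].message.content)
+ ```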
+
+ ## Supported Models
+
+ - `claude-sonnet-4-20250514` - Claude Sonnet 4 (Latest)
+ - `claude-3-5-haiku-20241022` - Claude 3.5 Haiku
+
+ Built with FastAPI and optimized for Hugging Face Spaces deployment.
app.py ADDED
@@ -0,0 +1,1783 @@
1
+ import os
2
+ import json
3
+ import time
4
+ import uuid
5
+ import httpx
6
+ import re
7
+ import asyncio
8
+ import xml.etree.ElementTree as ET
9
+ import logging
10
+ import struct
11
+ import base64
12
+ import copy
13
+ from fastapi import FastAPI, HTTPException, Request, Header, Depends
14
+ from fastapi.responses import StreamingResponse
15
+ from pydantic import BaseModel, Field
16
+ from typing import List, Optional, Dict, Any, Union
17
+ from dotenv import load_dotenv
18
+ from json_repair import repair_json
19
+
20
+ # Configure logging
21
+ # logging.basicConfig(level=logging.INFO) # for dev
22
+ logging.basicConfig(level=logging.WARNING)
23
+ logger = logging.getLogger(__name__)
24
+
25
+ # Load environment variables
26
+ load_dotenv()
27
+
28
+ # Initialize FastAPI app
29
+ app = FastAPI(
30
+ title="Ki2API - Claude Sonnet 4 OpenAI Compatible API",
31
+ description="OpenAI-compatible API for Claude Sonnet 4 via AWS CodeWhisperer",
32
+ version="3.0.1"
33
+ )
34
+
35
+ # Configuration
36
+ API_KEY = os.getenv("API_KEY", "ki2api-key-2024")
37
+ KIRO_ACCESS_TOKEN = os.getenv("KIRO_ACCESS_TOKEN")
38
+ KIRO_REFRESH_TOKEN = os.getenv("KIRO_REFRESH_TOKEN")
39
+ KIRO_BASE_URL = "https://codewhisperer.us-east-1.amazonaws.com/generateAssistantResponse"
40
+ PROFILE_ARN = "arn:aws:codewhisperer:us-east-1:699475941385:profile/EHGA3GRVQMUK"
41
+
42
+ # Model mapping
43
+ MODEL_MAP = {
44
+ "claude-sonnet-4-20250514": "CLAUDE_SONNET_4_20250514_V1_0",
45
+ "claude-3-5-haiku-20241022": "CLAUDE_3_7_SONNET_20250219_V1_0",
46
+ }
47
+ DEFAULT_MODEL = "claude-sonnet-4-20250514"
48
+
49
+ # Pydantic models for OpenAI compatibility
50
+ class ImageUrl(BaseModel):
51
+ url: str
52
+ detail: Optional[str] = "auto"
53
+
54
+ class ContentPart(BaseModel):
55
+ type: str
56
+ text: Optional[str] = None
57
+ image_url: Optional[ImageUrl] = None
58
+
59
+ class ToolCall(BaseModel):
60
+ id: str
61
+ type: str = "function"
62
+ function: Dict[str, Any]
63
+ class ChatMessage(BaseModel):
64
+ role: str
65
+ content: Union[str, List[ContentPart], None]
66
+ tool_calls: Optional[List[ToolCall]] = None
67
+ tool_call_id: Optional[str] = None # used for messages with the "tool" role
68
+
69
+ def get_content_text(self) -> str:
70
+ """Extract text content from either string or content parts"""
71
+ # Handle None content
72
+ if self.content is None:
73
+ logger.warning(f"Message with role '{self.role}' has None content")
74
+ return ""
75
+
76
+ if isinstance(self.content, str):
77
+ return self.content
78
+ elif isinstance(self.content, list):
79
+ text_parts = []
80
+ for part in self.content:
81
+ if isinstance(part, dict):
82
+ if part.get("type") == "text" and "text" in part:
83
+ text_parts.append(part.get("text", ""))
84
+ elif part.get("type") == "tool_result" and "content" in part:
85
+ text_parts.append(part.get("content", ""))
86
+ elif hasattr(part, 'text') and part.text:
87
+ text_parts.append(part.text)
88
+ return "".join(text_parts)
89
+ else:
90
+ logger.warning(f"Unexpected content type: {type(self.content)}")
91
+ return str(self.content) if self.content else ""
92
+
93
+ class Function(BaseModel):
94
+ name: str
95
+ description: Optional[str] = None
96
+ parameters: Optional[Dict[str, Any]] = None
97
+
98
+ class Tool(BaseModel):
99
+ type: str = "function"
100
+ function: Function
101
+
102
+
103
+
104
+ class ChatCompletionRequest(BaseModel):
105
+ model: str
106
+ messages: List[ChatMessage]
107
+ temperature: Optional[float] = 0.7
108
+ max_tokens: Optional[int] = 4000
109
+ stream: Optional[bool] = False
110
+ top_p: Optional[float] = 1.0
111
+ frequency_penalty: Optional[float] = 0.0
112
+ presence_penalty: Optional[float] = 0.0
113
+ stop: Optional[Union[str, List[str]]] = None
114
+ user: Optional[str] = None
115
+ tools: Optional[List[Tool]] = None
116
+ tool_choice: Optional[Union[str, Dict[str, Any]]] = "auto"
117
+
118
+ class Usage(BaseModel):
119
+ prompt_tokens: int
120
+ completion_tokens: int
121
+ total_tokens: int
122
+ prompt_tokens_details: Optional[Dict[str, int]] = Field(default_factory=lambda: {"cached_tokens": 0})
123
+ completion_tokens_details: Optional[Dict[str, int]] = Field(default_factory=lambda: {"reasoning_tokens": 0})
124
+
125
+ class ResponseMessage(BaseModel):
126
+ role: str
127
+ content: Optional[str] = None
128
+ tool_calls: Optional[List[ToolCall]] = None
129
+
130
+ class Choice(BaseModel):
131
+ index: int
132
+ message: ResponseMessage
133
+ logprobs: Optional[Any] = None
134
+ finish_reason: str
135
+
136
+ class StreamChoice(BaseModel):
137
+ index: int
138
+ delta: Dict[str, Any]
139
+ logprobs: Optional[Any] = None
140
+ finish_reason: Optional[str] = None
141
+
142
+ class ChatCompletionResponse(BaseModel):
143
+ id: str = Field(default_factory=lambda: f"chatcmpl-{uuid.uuid4()}")
144
+ object: str = "chat.completion"
145
+ created: int = Field(default_factory=lambda: int(time.time()))
146
+ model: str
147
+ system_fingerprint: Optional[str] = "fp_ki2api_v3"
148
+ choices: List[Choice]
149
+ usage: Usage
150
+
151
+ class ChatCompletionStreamResponse(BaseModel):
152
+ id: str = Field(default_factory=lambda: f"chatcmpl-{uuid.uuid4()}")
153
+ object: str = "chat.completion.chunk"
154
+ created: int = Field(default_factory=lambda: int(time.time()))
155
+ model: str
156
+ system_fingerprint: Optional[str] = "fp_ki2api_v3"
157
+ choices: List[StreamChoice]
158
+ usage: Optional[Usage] = None
159
+
160
+ class ErrorResponse(BaseModel):
161
+ error: Dict[str, Any]
162
+
163
+ # Authentication
164
+ async def verify_api_key(authorization: str = Header(None)):
165
+ if not authorization:
166
+ raise HTTPException(
167
+ status_code=401,
168
+ detail={
169
+ "error": {
170
+ "message": "You didn't provide an API key.",
171
+ "type": "invalid_request_error",
172
+ "param": None,
173
+ "code": "invalid_api_key"
174
+ }
175
+ }
176
+ )
177
+
178
+ if not authorization.startswith("Bearer "):
179
+ raise HTTPException(
180
+ status_code=401,
181
+ detail={
182
+ "error": {
183
+ "message": "Invalid API key format. Expected 'Bearer <key>'",
184
+ "type": "invalid_request_error",
185
+ "param": None,
186
+ "code": "invalid_api_key"
187
+ }
188
+ }
189
+ )
190
+
191
+ api_key = authorization.replace("Bearer ", "")
192
+ if api_key != API_KEY:
193
+ raise HTTPException(
194
+ status_code=401,
195
+ detail={
196
+ "error": {
197
+ "message": "Invalid API key provided",
198
+ "type": "invalid_request_error",
199
+ "param": None,
200
+ "code": "invalid_api_key"
201
+ }
202
+ }
203
+ )
204
+ return api_key
205
+
206
+ # Token management
207
+ class TokenManager:
208
+ def __init__(self):
209
+ self.access_token = KIRO_ACCESS_TOKEN
210
+ self.refresh_token = KIRO_REFRESH_TOKEN
211
+ self.refresh_url = "https://prod.us-east-1.auth.desktop.kiro.dev/refreshToken"
212
+ self.last_refresh_time = 0
213
+ self.refresh_lock = asyncio.Lock()
214
+
215
+ async def refresh_tokens(self):
216
+ """刷新token,使用锁防止并发刷新请求"""
217
+ if not self.refresh_token:
218
+ logger.error("没有刷新token,无法刷新访问token")
219
+ return None
220
+
221
+ async with self.refresh_lock:
222
+ # Skip the refresh if one happened within the last few seconds
223
+ current_time = time.time()
224
+ if current_time - self.last_refresh_time < 5:
225
+ logger.info("最近已刷新token,使用现有token")
226
+ return self.access_token
227
+
228
+ try:
229
+ logger.info("开始刷新token...")
230
+ async with httpx.AsyncClient() as client:
231
+ response = await client.post(
232
+ self.refresh_url,
233
+ json={"refreshToken": self.refresh_token},
234
+ timeout=30
235
+ )
236
+ response.raise_for_status()
237
+
238
+ data = response.json()
239
+ if "accessToken" not in data:
240
+ logger.error(f"刷新token响应中没有accessToken: {data}")
241
+ return None
242
+
243
+ self.access_token = data.get("accessToken")
244
+ self.last_refresh_time = current_time
245
+ logger.info("token刷新成功")
246
+
247
+ # Update the environment variable
248
+ os.environ["KIRO_ACCESS_TOKEN"] = self.access_token
249
+
250
+ return self.access_token
251
+ except Exception as e:
252
+ logger.error(f"token刷新失败: {str(e)}")
253
+ return None
254
+
255
+ def get_token(self):
256
+ return self.access_token
257
+
258
+ token_manager = TokenManager()
259
+
260
+ # XML Tool Call Parser (from version 1)
261
+ def parse_xml_tool_calls(response_text: str) -> Optional[List[ToolCall]]:
262
+ """解析CodeWhisperer返回的XML格式工具调用,转换为OpenAI格式"""
263
+ if not response_text:
264
+ return None
265
+
266
+ tool_calls = []
267
+
268
+ logger.info(f"🔍 开始解析XML工具调用,响应文本长度: {len(response_text)}")
269
+
270
+ # Method 1: parse the <tool_use> tag format
271
+ tool_use_pattern = r'<tool_use>\s*<tool_name>([^<]+)</tool_name>\s*<tool_parameter_name>([^<]+)</tool_parameter_name>\s*<tool_parameter_value>([^<]*)</tool_parameter_value>\s*</tool_use>'
272
+ matches = re.finditer(tool_use_pattern, response_text, re.DOTALL | re.IGNORECASE)
273
+
274
+ for match in matches:
275
+ function_name = match.group(1).strip()
276
+ param_name = match.group(2).strip()
277
+ param_value = match.group(3).strip()
278
+
279
+ arguments = {param_name: param_value}
280
+ tool_call_id = f"call_{uuid.uuid4().hex[:8]}"
281
+
282
+ tool_call = ToolCall(
283
+ id=tool_call_id,
284
+ type="function",
285
+ function={
286
+ "name": function_name,
287
+ "arguments": json.dumps(arguments, ensure_ascii=False)
288
+ }
289
+ )
290
+ tool_calls.append(tool_call)
291
+ logger.info(f"✅ 解析到工具调用: {function_name} with {param_name}={param_value}")
292
+
293
+ # Method 2: parse the simple <tool_name> format
294
+ if not tool_calls:
295
+ simple_pattern = r'<tool_name>([^<]+)</tool_name>\s*<tool_parameter_name>([^<]+)</tool_parameter_name>\s*<tool_parameter_value>([^<]*)</tool_parameter_value>'
296
+ matches = re.finditer(simple_pattern, response_text, re.DOTALL | re.IGNORECASE)
297
+
298
+ for match in matches:
299
+ function_name = match.group(1).strip()
300
+ param_name = match.group(2).strip()
301
+ param_value = match.group(3).strip()
302
+
303
+ arguments = {param_name: param_value}
304
+ tool_call_id = f"call_{uuid.uuid4().hex[:8]}"
305
+
306
+ tool_call = ToolCall(
307
+ id=tool_call_id,
308
+ type="function",
309
+ function={
310
+ "name": function_name,
311
+ "arguments": json.dumps(arguments, ensure_ascii=False)
312
+ }
313
+ )
314
+ tool_calls.append(tool_call)
315
+ logger.info(f"✅ 解析到简单工具调用: {function_name} with {param_name}={param_value}")
316
+
317
+ # Method 3: handle the case where only the tool name is present
318
+ if not tool_calls:
319
+ name_only_pattern = r'<tool_name>([^<]+)</tool_name>'
320
+ matches = re.finditer(name_only_pattern, response_text, re.IGNORECASE)
321
+
322
+ for match in matches:
323
+ function_name = match.group(1).strip()
324
+ tool_call_id = f"call_{uuid.uuid4().hex[:8]}"
325
+
326
+ tool_call = ToolCall(
327
+ id=tool_call_id,
328
+ type="function",
329
+ function={
330
+ "name": function_name,
331
+ "arguments": "{}"
332
+ }
333
+ )
334
+ tool_calls.append(tool_call)
335
+ logger.info(f"✅ 解析到无参数工具调用: {function_name}")
336
+
337
+ if tool_calls:
338
+ logger.info(f"🎉 总共解析出 {len(tool_calls)} 个工具调用")
339
+ return tool_calls
340
+ else:
341
+ logger.info("❌ 未发现任何XML格式的工具调用")
342
+ return None
343
+
344
+ def find_matching_bracket(text: str, start_pos: int) -> int:
345
+ """找到匹配的结束括号位置"""
346
+ logger.info(f"🔧 FIND BRACKET: text length={len(text)}, start_pos={start_pos}")
347
+ logger.info(f"🔧 FIND BRACKET: First 100 chars: >>>{text[:100]}<<<")
348
+
349
+ if not text or start_pos >= len(text) or text[start_pos] != '[':
350
+ logger.info(f"🔧 FIND BRACKET: Early return -1, text[start_pos]={text[start_pos] if start_pos < len(text) else 'OOB'}")
351
+ return -1
352
+
353
+ bracket_count = 1
354
+ in_string = False
355
+ escape_next = False
356
+
357
+ logger.info(f"🔧 FIND BRACKET: Starting search from position {start_pos + 1}")
358
+
359
+ for i in range(start_pos + 1, len(text)):
360
+ char = text[i]
361
+
362
+ if escape_next:
363
+ escape_next = False
364
+ continue
365
+
366
+ if char == '\\' and in_string:
367
+ escape_next = True
368
+ continue
369
+
370
+ if char == '"' and not escape_next:
371
+ in_string = not in_string
372
+ logger.info(f"🔧 FIND BRACKET: Toggle string mode at {i}, in_string={in_string}")
373
+ continue
374
+
375
+ if not in_string:
376
+ if char == '[':
377
+ bracket_count += 1
378
+ logger.info(f"🔧 FIND BRACKET: [ at {i}, bracket_count={bracket_count}")
379
+ elif char == ']':
380
+ bracket_count -= 1
381
+ logger.info(f"🔧 FIND BRACKET: ] at {i}, bracket_count={bracket_count}")
382
+ if bracket_count == 0: # only square brackets are matched here; braces are ignored
383
+ logger.info(f"🔧 FIND BRACKET: Found matching ] at position {i}")
384
+ logger.info(f"🔧 FIND BRACKET: Complete match: >>>{text[start_pos:i+1]}<<<")
385
+ return i
386
+
387
+ logger.info(f"🔧 FIND BRACKET: No matching bracket found, returning -1")
388
+ logger.info(f"🔧 FIND BRACKET: Final bracket_count={bracket_count}")
389
+ return -1
390
+
391
+ def parse_single_tool_call_professional(tool_call_text: str) -> Optional[ToolCall]:
392
+ """专业的工具调用解析器 - 使用json_repair库"""
393
+ logger.info(f"🔧 开始解析工具调用文本 (长度: {len(tool_call_text)})")
394
+
395
+ # Step 1: extract the function name
396
+ name_pattern = r'\[Called\s+(\w+)\s+with\s+args:'
397
+ name_match = re.search(name_pattern, tool_call_text, re.IGNORECASE)
398
+
399
+ if not name_match:
400
+ logger.warning("⚠️ 无法从文本中提取函数名")
401
+ return None
402
+
403
+ function_name = name_match.group(1).strip()
404
+ logger.info(f"✅ 提取到函数名: {function_name}")
405
+
406
+ # Step 2: extract the JSON arguments section
407
+ # Find the position right after "with args:"
408
+ args_start_marker = "with args:"
409
+ args_start_pos = tool_call_text.lower().find(args_start_marker.lower())
410
+ if args_start_pos == -1:
411
+ logger.error("❌ 找不到 'with args:' 标记")
412
+ return None
413
+
414
+ # 从 "with args:" 后开始
415
+ args_start = args_start_pos + len(args_start_marker)
416
+
417
+ # Find the last ']'
418
+ args_end = tool_call_text.rfind(']')
419
+ if args_end <= args_start:
420
+ logger.error("❌ 找不到结束的 ']'")
421
+ return None
422
+
423
+ # Extract the portion that may contain JSON
424
+ json_candidate = tool_call_text[args_start:args_end].strip()
425
+ logger.info(f"📝 提取的JSON候选文本长度: {len(json_candidate)}")
426
+
427
+ # Step 3: repair and parse the JSON
428
+ try:
429
+ # Use json_repair to fix potentially broken JSON
430
+ repaired_json = repair_json(json_candidate)
431
+ logger.info(f"🔧 JSON修复完成,修复后长度: {len(repaired_json)}")
432
+
433
+ # Parse the repaired JSON
434
+ arguments = json.loads(repaired_json)
435
+
436
+ # Verify that the parsed result is a dict
437
+ if not isinstance(arguments, dict):
438
+ logger.error(f"❌ 解析结果不是字典类型: {type(arguments)}")
439
+ return None
440
+
441
+ # Build the tool call object
442
+ tool_call_id = f"call_{uuid.uuid4().hex[:8]}"
443
+ tool_call = ToolCall(
444
+ id=tool_call_id,
445
+ type="function",
446
+ function={
447
+ "name": function_name,
448
+ "arguments": json.dumps(arguments, ensure_ascii=False)
449
+ }
450
+ )
451
+
452
+ logger.info(f"✅ 成功创建工具调用: {function_name} (参数键: {list(arguments.keys())})")
453
+ return tool_call
454
+
455
+ except Exception as e:
456
+ logger.error(f"❌ JSON修复/解析失败: {type(e).__name__}: {str(e)}")
457
+
458
+ # Fallback: try a more aggressive repair
459
+ try:
460
+ # Find the first '{' and the last '}'
461
+ first_brace = json_candidate.find('{')
462
+ last_brace = json_candidate.rfind('}')
463
+
464
+ if first_brace != -1 and last_brace > first_brace:
465
+ core_json = json_candidate[first_brace:last_brace + 1]
466
+
467
+ # Try the repair again
468
+ repaired_core = repair_json(core_json)
469
+ arguments = json.loads(repaired_core)
470
+
471
+ if isinstance(arguments, dict):
472
+ tool_call_id = f"call_{uuid.uuid4().hex[:8]}"
473
+ tool_call = ToolCall(
474
+ id=tool_call_id,
475
+ type="function",
476
+ function={
477
+ "name": function_name,
478
+ "arguments": json.dumps(arguments, ensure_ascii=False)
479
+ }
480
+ )
481
+ logger.info(f"✅ 备用方案成功: {function_name}")
482
+ return tool_call
483
+
484
+ except Exception as backup_error:
485
+ logger.error(f"❌ 备用方案也失败了: {backup_error}")
486
+
487
+ return None
488
+
489
+ def parse_bracket_tool_calls_professional(response_text: str) -> Optional[List[ToolCall]]:
490
+ """专业的批量工具调用解析器"""
491
+ if not response_text or "[Called" not in response_text:
492
+ logger.info("📭 响应文本中没有工具调用标记")
493
+ return None
494
+
495
+ tool_calls = []
496
+ errors = []
497
+
498
+ # Method 1: use the improved splitting approach
499
+ try:
500
+ # Find every occurrence of "[Called"
501
+ call_positions = []
502
+ start = 0
503
+ while True:
504
+ pos = response_text.find("[Called", start)
505
+ if pos == -1:
506
+ break
507
+ call_positions.append(pos)
508
+ start = pos + 1
509
+
510
+ logger.info(f"🔍 找到 {len(call_positions)} 个潜在的工具调用")
511
+
512
+ for i, start_pos in enumerate(call_positions):
513
+ # Determine where this tool call ends:
514
+ # either at the next "[Called" or at the end of the text
515
+ if i + 1 < len(call_positions):
516
+ end_search_limit = call_positions[i + 1]
517
+ else:
518
+ end_search_limit = len(response_text)
519
+
520
+ # Look for the closing ']' within this range
521
+ segment = response_text[start_pos:end_search_limit]
522
+
523
+ # Find the matching closing bracket
524
+ bracket_count = 0
525
+ end_pos = -1
526
+
527
+ for j, char in enumerate(segment):
528
+ if char == '[':
529
+ bracket_count += 1
530
+ elif char == ']':
531
+ bracket_count -= 1
532
+ if bracket_count == 0:
533
+ end_pos = start_pos + j
534
+ break
535
+
536
+ if end_pos == -1:
537
+ # If no matching bracket was found, fall back to the last ']'
538
+ last_bracket = segment.rfind(']')
539
+ if last_bracket != -1:
540
+ end_pos = start_pos + last_bracket
541
+ else:
542
+ logger.warning(f"⚠️ 工具调用 {i+1} 没有找到结束括号")
543
+ continue
544
+
545
+ # Extract the tool call text
546
+ tool_call_text = response_text[start_pos:end_pos + 1]
547
+ logger.info(f"📋 提取工具调用 {i+1}, 长度: {len(tool_call_text)}")
548
+
549
+ # Parse the individual tool call
550
+ parsed_call = parse_single_tool_call_professional(tool_call_text)
551
+ if parsed_call:
552
+ tool_calls.append(parsed_call)
553
+ else:
554
+ errors.append(f"工具调用 {i+1} 解析失败")
555
+
556
+ except Exception as e:
557
+ logger.error(f"❌ 批量解析过程出错: {type(e).__name__}: {str(e)}")
558
+ import traceback
559
+ traceback.print_exc()
560
+
561
+ # Log the results
562
+ if tool_calls:
563
+ logger.info(f"🎉 成功解析 {len(tool_calls)} 个工具调用")
564
+ for tc in tool_calls:
565
+ logger.info(f" ✓ {tc.function['name']} (ID: {tc.id})")
566
+
567
+ if errors:
568
+ logger.warning(f"⚠️ 有 {len(errors)} 个解析失败:")
569
+ for error in errors:
570
+ logger.warning(f" ✗ {error}")
571
+
572
+ return tool_calls if tool_calls else None
573
+
574
+ # Keep the original function names for backward compatibility
575
+ def parse_bracket_tool_calls(response_text: str) -> Optional[List[ToolCall]]:
576
+ """向后兼容的函数名"""
577
+ return parse_bracket_tool_calls_professional(response_text)
578
+
579
+ def parse_single_tool_call(tool_call_text: str) -> Optional[ToolCall]:
580
+ """向后兼容的函数名"""
581
+ return parse_single_tool_call_professional(tool_call_text)
582
+
583
+ # Add deduplication function
584
+ def deduplicate_tool_calls(tool_calls: List[Union[Dict, ToolCall]]) -> List[ToolCall]:
585
+ """Deduplicate tool calls based on function name and arguments"""
586
+ seen = set()
587
+ unique_tool_calls = []
588
+
589
+ for tool_call in tool_calls:
590
+ # Convert to ToolCall if it's a dict
591
+ if isinstance(tool_call, dict):
592
+ tc = ToolCall(
593
+ id=tool_call.get("id", f"call_{uuid.uuid4().hex[:8]}"),
594
+ type=tool_call.get("type", "function"),
595
+ function=tool_call.get("function", {})
596
+ )
597
+ else:
598
+ tc = tool_call
599
+
600
+ # Create unique key based on function name and arguments
601
+ key = (
602
+ tc.function.get("name", ""),
603
+ tc.function.get("arguments", "")
604
+ )
605
+
606
+ if key not in seen:
607
+ seen.add(key)
608
+ unique_tool_calls.append(tc)
609
+ else:
610
+ logger.info(f"🔄 Skipping duplicate tool call: {tc.function.get('name', 'unknown')}")
611
+
612
+ return unique_tool_calls
613
+
614
+ def build_codewhisperer_request(request: ChatCompletionRequest):
615
+ codewhisperer_model = MODEL_MAP.get(request.model, MODEL_MAP[DEFAULT_MODEL])
616
+ conversation_id = str(uuid.uuid4())
617
+
618
+ # Extract system prompt and user messages
619
+ system_prompt = ""
620
+ conversation_messages = []
621
+
622
+ for msg in request.messages:
623
+ if msg.role == "system":
624
+ system_prompt = msg.get_content_text()
625
+ elif msg.role in ["user", "assistant", "tool"]:
626
+ conversation_messages.append(msg)
627
+
628
+ if not conversation_messages:
629
+ raise HTTPException(
630
+ status_code=400,
631
+ detail={
632
+ "error": {
633
+ "message": "No conversation messages found",
634
+ "type": "invalid_request_error",
635
+ "param": "messages",
636
+ "code": "invalid_request"
637
+ }
638
+ }
639
+ )
640
+
641
+ # Build history - only include user/assistant pairs
642
+ history = []
643
+
644
+ # Process history messages (all except the last one)
645
+ if len(conversation_messages) > 1:
646
+ history_messages = conversation_messages[:-1]
647
+
648
+ # Build user messages list (combining tool results with user messages)
649
+ processed_messages = []
650
+ i = 0
651
+ while i < len(history_messages):
652
+ msg = history_messages[i]
653
+
654
+ if msg.role == "user":
655
+ content = msg.get_content_text() or "Continue"
656
+ processed_messages.append(("user", content))
657
+ i += 1
658
+ elif msg.role == "assistant":
659
+ # Check if this assistant message contains tool calls
660
+ if hasattr(msg, 'tool_calls') and msg.tool_calls:
661
+ # Build a description of the tool calls
662
+ tool_descriptions = []
663
+ for tc in msg.tool_calls:
664
+ func_name = tc.function.get("name", "unknown") if isinstance(tc.function, dict) else "unknown"
665
+ args = tc.function.get("arguments", "{}") if isinstance(tc.function, dict) else "{}"
666
+ tool_descriptions.append(f"[Called {func_name} with args: {args}]")
667
+ content = " ".join(tool_descriptions)
668
+ logger.info(f"📌 Processing assistant message with tool calls: {content}")
669
+ else:
670
+ content = msg.get_content_text() or "I understand."
671
+ processed_messages.append(("assistant", content))
672
+ i += 1
673
+ elif msg.role == "tool":
674
+ # Combine tool results into the next user message
675
+ tool_content = msg.get_content_text() or "[Tool executed]"
676
+ tool_call_id = getattr(msg, 'tool_call_id', 'unknown')
677
+
678
+ # Format tool result with ID for tracking
679
+ formatted_tool_result = f"[Tool result for {tool_call_id}]: {tool_content}"
680
+
681
+ # Look ahead to see if there's a user message
682
+ if i + 1 < len(history_messages) and history_messages[i + 1].role == "user":
683
+ user_content = history_messages[i + 1].get_content_text() or ""
684
+ combined_content = f"{formatted_tool_result}\n{user_content}".strip()
685
+ processed_messages.append(("user", combined_content))
686
+ i += 2
687
+ else:
688
+ # Tool result without following user message - add as user message
689
+ processed_messages.append(("user", formatted_tool_result))
690
+ i += 1
691
+ else:
692
+ i += 1
693
+
694
+ # Build history pairs
695
+ i = 0
696
+ while i < len(processed_messages):
697
+ role, content = processed_messages[i]
698
+
699
+ if role == "user":
700
+ history.append({
701
+ "userInputMessage": {
702
+ "content": content,
703
+ "modelId": codewhisperer_model,
704
+ "origin": "AI_EDITOR"
705
+ }
706
+ })
707
+
708
+ # Look for assistant response
709
+ if i + 1 < len(processed_messages) and processed_messages[i + 1][0] == "assistant":
710
+ _, assistant_content = processed_messages[i + 1]
711
+ history.append({
712
+ "assistantResponseMessage": {
713
+ "content": assistant_content
714
+ }
715
+ })
716
+ i += 2
717
+ else:
718
+ # No assistant response, add a placeholder
719
+ history.append({
720
+ "assistantResponseMessage": {
721
+ "content": "I understand."
722
+ }
723
+ })
724
+ i += 1
725
+ elif role == "assistant":
726
+ # Orphaned assistant message
727
+ history.append({
728
+ "userInputMessage": {
729
+ "content": "Continue",
730
+ "modelId": codewhisperer_model,
731
+ "origin": "AI_EDITOR"
732
+ }
733
+ })
734
+ history.append({
735
+ "assistantResponseMessage": {
736
+ "content": content
737
+ }
738
+ })
739
+ i += 1
740
+ else:
741
+ i += 1
742
+
743
+ # Build current message
744
+ current_message = conversation_messages[-1]
745
+
746
+ # Handle images in the last message
747
+ images = []
748
+ if isinstance(current_message.content, list):
749
+ for part in current_message.content:
750
+ if part.type == "image_url" and part.image_url:
751
+ try:
752
+ # Log the first 50 characters of the original URL for debugging
753
+ logger.info(f"🔍 处理图片 URL: {part.image_url.url[:50]}...")
754
+
755
+ # Check that the URL format is correct
756
+ if not part.image_url.url.startswith("data:image/"):
757
+ logger.error(f"❌ 图片 URL 格式不正确,应该以 'data:image/' 开头")
758
+ continue
759
+
760
+ # Correctly parse the data URI
761
+ # format: data:image/jpeg;base64,{base64_string}
762
+ header, encoded_data = part.image_url.url.split(",", 1)
763
+
764
+ # Correctly parse the image format from the mime type
765
+ # "data:image/jpeg;base64" -> "jpeg"
766
+ # Use regex to reliably extract image format, e.g., "jpeg" from "data:image/jpeg;base64"
767
+ match = re.search(r'image/(\w+)', header)
768
+ if match:
769
+ image_format = match.group(1)
770
+ # Verify that the Base64 encoding is valid
771
+ try:
772
+ base64.b64decode(encoded_data)
773
+ logger.info("✅ Base64 编码验证通过")
774
+ except Exception as e:
775
+ logger.error(f"❌ Base64 编码无效: {e}")
776
+ continue
777
+
778
+ images.append({
779
+ "format": image_format,
780
+ "source": {"bytes": encoded_data}
781
+ })
782
+ logger.info(f"🖼️ 成功处理图片,格式: {image_format}, 大小: {len(encoded_data)} 字符")
783
+ else:
784
+ logger.warning(f"⚠️ 无法从头部确定图片格式: {header}")
785
+ except Exception as e:
786
+ logger.error(f"❌ 处理图片 URL 失败: {str(e)}")
787
+
788
+ current_content = current_message.get_content_text()
789
+
790
+ # Handle different roles for current message
791
+ if current_message.role == "tool":
792
+ # For tool results, format them properly and mark as completed
793
+ tool_result = current_content or '[Tool executed]'
794
+ tool_call_id = getattr(current_message, 'tool_call_id', 'unknown')
795
+ current_content = f"[Tool execution completed for {tool_call_id}]: {tool_result}"
796
+
797
+ # Check if this tool result follows a tool call in history
798
+ if len(conversation_messages) > 1:
799
+ prev_message = conversation_messages[-2]
800
+ if prev_message.role == "assistant" and hasattr(prev_message, 'tool_calls') and prev_message.tool_calls:
801
+ # Find the corresponding tool call
802
+ for tc in prev_message.tool_calls:
803
+ if tc.id == tool_call_id:
804
+ func_name = tc.function.get("name", "unknown") if isinstance(tc.function, dict) else "unknown"
805
+ current_content = f"[Completed execution of {func_name}]: {tool_result}"
806
+ break
807
+ elif current_message.role == "assistant":
808
+ # If last message is from assistant with tool calls, format it appropriately
809
+ if hasattr(current_message, 'tool_calls') and current_message.tool_calls:
810
+ tool_descriptions = []
811
+ for tc in current_message.tool_calls:
812
+ func_name = tc.function.get("name", "unknown") if isinstance(tc.function, dict) else "unknown"
813
+ tool_descriptions.append(f"Continue after calling {func_name}")
814
+ current_content = "; ".join(tool_descriptions)
815
+ else:
816
+ current_content = "Continue the conversation"
817
+
818
+ # Ensure current message has content
819
+ if not current_content:
820
+ current_content = "Continue"
821
+
822
+ # Add system prompt to current message
823
+ if system_prompt:
824
+ current_content = f"{system_prompt}\n\n{current_content}"
825
+
826
+ # Build request
827
+ codewhisperer_request = {
828
+ "profileArn": PROFILE_ARN,
829
+ "conversationState": {
830
+ "chatTriggerType": "MANUAL",
831
+ "conversationId": conversation_id,
832
+ "currentMessage": {
833
+ "userInputMessage": {
834
+ "content": current_content,
835
+ "modelId": codewhisperer_model,
836
+ "origin": "AI_EDITOR"
837
+ }
838
+ },
839
+ "history": history
840
+ }
841
+ }
842
+
843
+ # Add context for tools
844
+ user_input_message_context = {}
845
+ if request.tools:
846
+ user_input_message_context["tools"] = [
847
+ {
848
+ "toolSpecification": {
849
+ "name": tool.function.name,
850
+ "description": tool.function.description or "",
851
+ "inputSchema": {"json": tool.function.parameters or {}}
852
+ }
853
+ } for tool in request.tools
854
+ ]
855
+
856
+ # Per the docs, images should be a direct field of userInputMessage, not part of userInputMessageContext
857
+ if images:
858
+ # Attach directly to userInputMessage
859
+ codewhisperer_request["conversationState"]["currentMessage"]["userInputMessage"]["images"] = images
860
+ logger.info(f"📊 添加了 {len(images)} 个图片到 userInputMessage 中")
861
+ for i, img in enumerate(images):
862
+ logger.info(f" - 图片 {i+1}: 格式={img['format']}, 大小={len(img['source']['bytes'])} 字符")
863
+ # 记录图片数据的前20个字符,用于调试
864
+ logger.info(f" - 图片数据前20字符: {img['source']['bytes'][:20]}...")
865
+ logger.info(f"✅ 成功添加 images 到 userInputMessage 中")
866
+
867
+ if user_input_message_context:
868
+ codewhisperer_request["conversationState"]["currentMessage"]["userInputMessage"]["userInputMessageContext"] = user_input_message_context
869
+ logger.info(f"✅ 成功添加 userInputMessageContext 到请求中")
870
+
871
+ # Make a copy of the request for logging so the full image data is not logged
872
+ log_request = copy.deepcopy(codewhisperer_request)
873
+ # Check whether images are present in userInputMessage
874
+ if "images" in log_request.get("conversationState", {}).get("currentMessage", {}).get("userInputMessage", {}):
875
+ for img in log_request["conversationState"]["currentMessage"]["userInputMessage"]["images"]:
876
+ if "bytes" in img.get("source", {}):
877
+ img["source"]["bytes"] = img["source"]["bytes"][:20] + "..." # 只记录前20个字符
878
+
879
+ logger.info(f"🔄 COMPLETE CODEWHISPERER REQUEST: {json.dumps(log_request, indent=2)}")
880
+ return codewhisperer_request
881
+ # AWS Event Stream Parser (from version 2)
882
+ class CodeWhispererStreamParser:
883
+ def __init__(self):
884
+ self.buffer = b''
885
+ self.error_count = 0
886
+ self.max_errors = 5
887
+
888
+ def parse(self, chunk: bytes) -> List[Dict[str, Any]]:
889
+ """解析AWS事件流格式的数据块"""
890
+ self.buffer += chunk
891
+ logger.debug(f"Parser received {len(chunk)} bytes. Buffer size: {len(self.buffer)}")
892
+ events = []
893
+
894
+ if len(self.buffer) < 12:
895
+ return []
896
+
897
+ while len(self.buffer) >= 12:
898
+ try:
899
+ header_bytes = self.buffer[0:8]
900
+ total_len, header_len = struct.unpack('>II', header_bytes)
901
+
902
+ # Sanity check
903
+ if total_len > 2000000 or header_len > 2000000:
904
+ logger.error(f"Unreasonable header values: total_len={total_len}, header_len={header_len}")
905
+ self.buffer = self.buffer[8:]
906
+ self.error_count += 1
907
+ if self.error_count > self.max_errors:
908
+ logger.error("Too many parsing errors, clearing buffer")
909
+ self.buffer = b''
910
+ continue
911
+
912
+ # Wait for a complete frame
913
+ if len(self.buffer) < total_len:
914
+ break
915
+
916
+ # Extract the complete frame
917
+ frame = self.buffer[:total_len]
918
+ self.buffer = self.buffer[total_len:]
919
+
920
+ # Extract the payload
921
+ payload_start = 8 + header_len
922
+ payload_end = total_len - 4 # strip the trailing CRC
923
+
924
+ if payload_start >= payload_end or payload_end > len(frame):
925
+ logger.error(f"Invalid payload bounds")
926
+ continue
927
+
928
+ payload = frame[payload_start:payload_end]
929
+
930
+ # Decode the payload
931
+ try:
932
+ payload_str = payload.decode('utf-8', errors='ignore')
933
+
934
+ # Try to parse JSON
935
+ json_start_index = payload_str.find('{')
936
+ if json_start_index != -1:
937
+ json_payload = payload_str[json_start_index:]
938
+ event_data = json.loads(json_payload)
939
+ events.append(event_data)
940
+ logger.debug(f"Successfully parsed event: {event_data}")
941
+ except json.JSONDecodeError as e:
942
+ logger.error(f"JSON decode error: {e}")
943
+ continue
944
+
945
+ except struct.error as e:
946
+ logger.error(f"Struct unpack error: {e}")
947
+ self.buffer = self.buffer[1:]
948
+ self.error_count += 1
949
+ if self.error_count > self.max_errors:
950
+ logger.error("Too many parsing errors, clearing buffer")
951
+ self.buffer = b''
952
+ except Exception as e:
953
+ logger.error(f"Unexpected error during parsing: {str(e)}")
954
+ self.buffer = self.buffer[1:]
955
+ self.error_count += 1
956
+ if self.error_count > self.max_errors:
957
+ logger.error("Too many parsing errors, clearing buffer")
958
+ self.buffer = b''
959
+
960
+ if events:
961
+ self.error_count = 0
962
+
963
+ return events
964
+
965
+ # Simple fallback parser for basic responses
966
+ class SimpleResponseParser:
967
+ @staticmethod
968
+ def parse_event_stream_to_json(raw_data: bytes) -> Dict[str, Any]:
969
+ """Simple parser for fallback (from version 1)"""
970
+ try:
971
+ if isinstance(raw_data, bytes):
972
+ raw_str = raw_data.decode('utf-8', errors='ignore')
973
+ else:
974
+ raw_str = str(raw_data)
975
+
976
+ # Method 1: Look for JSON objects with content field
977
+ json_pattern = r'\{[^{}]*"content"[^{}]*\}'
978
+ matches = re.findall(json_pattern, raw_str, re.DOTALL)
979
+
980
+ if matches:
981
+ content_parts = []
982
+ for match in matches:
983
+ try:
984
+ data = json.loads(match)
985
+ if 'content' in data and data['content']:
986
+ content_parts.append(data['content'])
987
+ except json.JSONDecodeError:
988
+ continue
989
+ if content_parts:
990
+ full_content = ''.join(content_parts)
991
+ return {
992
+ "content": full_content,
993
+ "tokens": len(full_content.split())
994
+ }
995
+
996
+ # Method 2: Extract readable text
997
+ clean_text = re.sub(r'[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]', '', raw_str)
998
+ clean_text = re.sub(r':event-type[^:]*:[^:]*:[^:]*:', '', clean_text)
999
+ clean_text = re.sub(r':content-type[^:]*:[^:]*:[^:]*:', '', clean_text)
1000
+
1001
+ meaningful_text = re.sub(r'[^\w\s\u4e00-\u9fff\u3040-\u309f\u30a0-\u30ff.,!?;:()"\'-]', '', clean_text)
1002
+ meaningful_text = re.sub(r'\s+', ' ', meaningful_text).strip()
1003
+
1004
+ if meaningful_text and len(meaningful_text) > 5:
1005
+ return {
1006
+ "content": meaningful_text,
1007
+ "tokens": len(meaningful_text.split())
1008
+ }
1009
+
1010
+ return {"content": "No readable content found", "tokens": 0}
1011
+
1012
+ except Exception as e:
1013
+ return {"content": f"Error parsing response: {str(e)}", "tokens": 0}
1014
+
1015
+ # API call to CodeWhisperer
1016
+ async def call_kiro_api(request: ChatCompletionRequest):
1017
+ """Make API call to Kiro/CodeWhisperer with token refresh handling"""
1018
+ token = token_manager.get_token()
1019
+ if not token:
1020
+ raise HTTPException(
1021
+ status_code=401,
1022
+ detail={
1023
+ "error": {
1024
+ "message": "No access token available",
1025
+ "type": "authentication_error",
1026
+ "param": None,
1027
+ "code": "invalid_api_key"
1028
+ }
1029
+ }
1030
+ )
1031
+
1032
+ request_data = build_codewhisperer_request(request)
1033
+
1034
+ headers = {
1035
+ "Authorization": f"Bearer {token}",
1036
+ "Content-Type": "application/json",
1037
+ "Accept": "text/event-stream" if request.stream else "application/json"
1038
+ }
1039
+
1040
+ try:
1041
+ async with httpx.AsyncClient() as client:
1042
+ response = await client.post(
1043
+ KIRO_BASE_URL,
1044
+ headers=headers,
1045
+ json=request_data,
1046
+ timeout=120
1047
+ )
1048
+
1049
+ logger.info(f"📤 RESPONSE STATUS: {response.status_code}")
1050
+
1051
+ if response.status_code == 403:
1052
+ logger.info("收到403响应,尝试刷新token...")
1053
+ new_token = await token_manager.refresh_tokens()
1054
+ if new_token:
1055
+ headers["Authorization"] = f"Bearer {new_token}"
1056
+ response = await client.post(
1057
+ KIRO_BASE_URL,
1058
+ headers=headers,
1059
+ json=request_data,
1060
+ timeout=120
1061
+ )
1062
+ logger.info(f"📤 RETRY RESPONSE STATUS: {response.status_code}")
1063
+ else:
1064
+ raise HTTPException(status_code=401, detail="Token refresh failed")
1065
+
1066
+ if response.status_code == 429:
1067
+ raise HTTPException(
1068
+ status_code=429,
1069
+ detail={
1070
+ "error": {
1071
+ "message": "Rate limit exceeded",
1072
+ "type": "rate_limit_error",
1073
+ "param": None,
1074
+ "code": "rate_limit_exceeded"
1075
+ }
1076
+ }
1077
+ )
1078
+
1079
+ response.raise_for_status()
1080
+ return response
1081
+
1082
+ except httpx.HTTPStatusError as e:
1083
+ logger.error(f"HTTP ERROR: {e.response.status_code} - {e.response.text}")
1084
+ raise HTTPException(
1085
+ status_code=503,
1086
+ detail={
1087
+ "error": {
1088
+ "message": f"API call failed: {str(e)}",
1089
+ "type": "api_error",
1090
+ "param": None,
1091
+ "code": "api_error"
1092
+ }
1093
+ }
1094
+ )
1095
+ except Exception as e:
1096
+ logger.error(f"API call failed: {str(e)}")
1097
+ raise HTTPException(
1098
+ status_code=503,
1099
+ detail={
1100
+ "error": {
1101
+ "message": f"API call failed: {str(e)}",
1102
+ "type": "api_error",
1103
+ "param": None,
1104
+ "code": "api_error"
1105
+ }
1106
+ }
1107
+ )
1108
+
1109
+ # Utility functions
1110
+ def estimate_tokens(text: str) -> int:
1111
+ """Rough token estimation"""
1112
+ return max(1, len(text) // 4)
1113
+
1114
+ def create_usage_stats(prompt_text: str, completion_text: str) -> Usage:
1115
+ """Create usage statistics"""
1116
+ prompt_tokens = estimate_tokens(prompt_text)
1117
+ completion_tokens = estimate_tokens(completion_text)
1118
+ return Usage(
1119
+ prompt_tokens=prompt_tokens,
1120
+ completion_tokens=completion_tokens,
1121
+ total_tokens=prompt_tokens + completion_tokens
1122
+ )
1123
+
1124
+ # API endpoints
1125
+ @app.get("/v1/models")
1126
+ async def list_models(api_key: str = Depends(verify_api_key)):
1127
+ """List available models"""
1128
+ return {
1129
+ "object": "list",
1130
+ "data": [
1131
+ {
1132
+ "id": model_id,
1133
+ "object": "model",
1134
+ "created": int(time.time()),
1135
+ "owned_by": "ki2api"
1136
+ }
1137
+ for model_id in MODEL_MAP.keys()
1138
+ ]
1139
+ }
1140
+
1141
+ @app.post("/v1/chat/completions")
1142
+ async def create_chat_completion(
1143
+ request: ChatCompletionRequest,
1144
+ api_key: str = Depends(verify_api_key)
1145
+ ):
1146
+ """Create a chat completion"""
1147
+ logger.info(f"📥 COMPLETE REQUEST: {request.model_dump_json(indent=2)}")
1148
+
1149
+ # Validate messages have content
1150
+ for i, msg in enumerate(request.messages):
1151
+ if msg.content is None and msg.role != "assistant":
1152
+ logger.warning(f"Message {i} with role '{msg.role}' has None content")
1153
+
1154
+ if request.model not in MODEL_MAP:
1155
+ raise HTTPException(
1156
+ status_code=400,
1157
+ detail={
1158
+ "error": {
1159
+ "message": f"The model '{request.model}' does not exist or you do not have access to it.",
1160
+ "type": "invalid_request_error",
1161
+ "param": "model",
1162
+ "code": "model_not_found"
1163
+ }
1164
+ }
1165
+ )
1166
+
1167
+ # Always build a non-streaming response, then return it in the format the request asked for
1168
+ response = await create_non_streaming_response(request)
1169
+
1170
+ if request.stream:
1171
+ # Convert the non-streaming response into streaming format
1172
+ return await convert_to_streaming_response(response)
1173
+ else:
1174
+ return response
1175
+
1176
+
1177
+ async def convert_to_streaming_response(response: ChatCompletionResponse):
1178
+ """将非流式响应转换为流式格式返回"""
1179
+ async def generate_stream():
1180
+ # Reuse the original response's ID and timestamp
1181
+ response_id = response.id
1182
+ created = response.created
1183
+ model = response.model
1184
+
1185
+ # Send the initial chunk carrying the role
1186
+ initial_chunk = ChatCompletionStreamResponse(
1187
+ id=response_id,
1188
+ model=model,
1189
+ created=created,
1190
+ choices=[StreamChoice(
1191
+ index=0,
1192
+ delta={"role": "assistant"},
1193
+ finish_reason=None
1194
+ )]
1195
+ )
1196
+ yield f"data: {initial_chunk.model_dump_json(exclude_none=True)}\n\n"
1197
+
1198
+ # Get the response message
1199
+ if response.choices and len(response.choices) > 0:
1200
+ message = response.choices[0].message
1201
+
1202
+ # If there are tool calls, send them
1203
+ if message.tool_calls:
1204
+ for i, tool_call in enumerate(message.tool_calls):
1205
+ # Send the complete tool call as a single chunk
1206
+ tool_chunk = ChatCompletionStreamResponse(
1207
+ id=response_id,
1208
+ model=model,
1209
+ created=created,
1210
+ choices=[StreamChoice(
1211
+ index=0,
1212
+ delta={
1213
+ "tool_calls": [{
1214
+ "index": i,
1215
+ "id": tool_call.id,
1216
+ "type": tool_call.type,
1217
+ "function": tool_call.function
1218
+ }]
1219
+ },
1220
+ finish_reason=None
1221
+ )]
1222
+ )
1223
+ yield f"data: {tool_chunk.model_dump_json(exclude_none=True)}\n\n"
1224
+
1225
+ # Otherwise, if there is content, send it in chunks
1226
+ elif message.content:
1227
+ # Split the content into smaller chunks to simulate streaming
1228
+ content = message.content
1229
+ chunk_size = 50 # characters per chunk
1230
+
1231
+ for i in range(0, len(content), chunk_size):
1232
+ chunk_text = content[i:i + chunk_size]
1233
+ content_chunk = ChatCompletionStreamResponse(
1234
+ id=response_id,
1235
+ model=model,
1236
+ created=created,
1237
+ choices=[StreamChoice(
1238
+ index=0,
1239
+ delta={"content": chunk_text},
1240
+ finish_reason=None
1241
+ )]
1242
+ )
1243
+ yield f"data: {content_chunk.model_dump_json(exclude_none=True)}\n\n"
1244
+ # Small delay to mimic real streaming
1245
+ await asyncio.sleep(0.01)
1246
+
1247
+ # Send the final chunk
1248
+ finish_reason = response.choices[0].finish_reason
1249
+ end_chunk = ChatCompletionStreamResponse(
1250
+ id=response_id,
1251
+ model=model,
1252
+ created=created,
1253
+ choices=[StreamChoice(
1254
+ index=0,
1255
+ delta={},
1256
+ finish_reason=finish_reason
1257
+ )]
1258
+ )
1259
+ yield f"data: {end_chunk.model_dump_json(exclude_none=True)}\n\n"
1260
+
1261
+ # Send the end-of-stream marker
1262
+ yield "data: [DONE]\n\n"
1263
+
1264
+ return StreamingResponse(
1265
+ generate_stream(),
1266
+ media_type="text/event-stream",
1267
+ headers={
1268
+ "Cache-Control": "no-cache",
1269
+ "Connection": "keep-alive",
1270
+ "Content-Type": "text/event-stream"
1271
+ }
1272
+ )
1273
+
1274
+ async def create_non_streaming_response(request: ChatCompletionRequest):
1275
+ """
1276
+ Handles non-streaming chat completion requests.
1277
+ It fetches the complete response from CodeWhisperer, parses it using
1278
+ CodeWhispererStreamParser, and constructs a single OpenAI-compatible
1279
+ ChatCompletionResponse. This version correctly handles tool calls by
1280
+ parsing both structured event data and bracket format in text.
1281
+ """
1282
+ try:
1283
+ logger.info("🚀 开始非流式响应生成...")
1284
+ response = await call_kiro_api(request)
1285
+
1286
+ # Log the raw response in detail
1287
+ logger.info(f"📤 CodeWhisperer response status code: {response.status_code}")
1288
+ logger.info(f"📤 Response headers: {dict(response.headers)}")
1289
+ logger.info(f"📤 Raw response body length: {len(response.content)} bytes")
1290
+
1291
+ # Get the raw response text for tool call detection
1292
+ raw_response_text = ""
1293
+ try:
1294
+ raw_response_text = response.content.decode('utf-8', errors='ignore')
1295
+ logger.info(f"🔍 原始响应文本长度: {len(raw_response_text)}")
1296
+ logger.info(f"🔍 原始响应预览(前1000字符): {raw_response_text[:1000]}")
1297
+
1298
+ # Check for tool call markers
1299
+ if "[Called" in raw_response_text:
1300
+ logger.info("✅ 原始响应中发现 [Called 标记")
1301
+ called_positions = [m.start() for m in re.finditer(r'\[Called', raw_response_text)]
1302
+ logger.info(f"🎯 [Called 出现位置: {called_positions}")
1303
+ else:
1304
+ logger.info("❌ 原始响应中未发现 [Called 标记")
1305
+
1306
+ except Exception as e:
1307
+ logger.error(f"❌ 解码原始响应失败: {e}")
1308
+
1309
+ # Parse the entire response body in one pass with CodeWhispererStreamParser
1310
+ parser = CodeWhispererStreamParser()
1311
+ events = parser.parse(response.content)
1312
+
1313
+ full_response_text = ""
1314
+ tool_calls = []
1315
+ current_tool_call_dict = None
1316
+
1317
+ logger.info(f"🔄 解析到 {len(events)} 个事件,开始处理...")
1318
+
1319
+ # Log details of every event
1320
+ for i, event in enumerate(events):
1321
+ logger.info(f"📋 事件 {i}: {event}")
1322
+
1323
+ for event in events:
1324
+ # Handle structured tool call events first
1325
+ if "name" in event and "toolUseId" in event:
1326
+ logger.info(f"🔧 发现结构化工具调用事件: {event}")
1327
+ # Initialize if this is a new tool call
1328
+ if not current_tool_call_dict:
1329
+ current_tool_call_dict = {
1330
+ "id": event.get("toolUseId"),
1331
+ "type": "function",
1332
+ "function": {
1333
+ "name": event.get("name"),
1334
+ "arguments": ""
1335
+ }
1336
+ }
1337
+ logger.info(f"🆕 开始解析工具调用: {current_tool_call_dict['function']['name']}")
1338
+
1339
+ # Accumulate arguments
1340
+ if "input" in event:
1341
+ current_tool_call_dict["function"]["arguments"] += event.get("input", "")
1342
+ logger.info(f"📝 累积参数: {event.get('input', '')}")
1343
+
1344
+ # Tool call finished
1345
+ if event.get("stop"):
1346
+ logger.info(f"✅ 完成工具调用: {current_tool_call_dict['function']['name']}")
1347
+ # 验证并标准化参数为JSON字符串
1348
+ try:
1349
+ args = json.loads(current_tool_call_dict["function"]["arguments"])
1350
+ current_tool_call_dict["function"]["arguments"] = json.dumps(args, ensure_ascii=False)
1351
+ logger.info(f"✅ 工具调用参数验证成功")
1352
+ except json.JSONDecodeError as e:
1353
+ logger.warning(f"⚠️ 工具调用的参数不是有效的JSON: {current_tool_call_dict['function']['arguments']}")
1354
+ logger.warning(f"⚠️ JSON错误: {e}")
1355
+
1356
+ tool_calls.append(ToolCall(**current_tool_call_dict))
1357
+ current_tool_call_dict = None # reset for the next tool call
1358
+
1359
+ # Handle plain text content events
1360
+ elif "content" in event:
1361
+ content = event.get("content", "")
1362
+ full_response_text += content
1363
+ logger.info(f"📄 添加文本内容: {content[:100]}...")
1364
+
1365
+ # If the stream ended unexpectedly in the middle of a tool call, add it anyway
1366
+ if current_tool_call_dict:
1367
+ logger.warning("⚠️ 响应流在工具调用结束前终止,仍尝试添加。")
1368
+ tool_calls.append(ToolCall(**current_tool_call_dict))
1369
+
1370
+ logger.info(f"📊 事件处理完成 - 文本长度: {len(full_response_text)}, 结构化工具调用: {len(tool_calls)}")
1371
+
1372
+ # Check the parsed text for bracket-format tool calls
1374
+ logger.info("🔍 Checking the parsed text for bracket-format tool calls...")
1374
+ bracket_tool_calls = parse_bracket_tool_calls(full_response_text)
1375
+ if bracket_tool_calls:
1376
+ logger.info(f"✅ 在解析后文本中发现 {len(bracket_tool_calls)} 个 bracket 格式工具调用")
1377
+ tool_calls.extend(bracket_tool_calls)
1378
+
1379
+ # Remove the tool call text from the response text
1380
+ for tc in bracket_tool_calls:
1381
+ # Build a precise regex to match this specific tool call
1382
+ func_name = tc.function.get("name", "unknown")
1383
+ # Escape special characters in the function name
1384
+ escaped_name = re.escape(func_name)
1385
+ # Match [Called FunctionName with args: {...}]
1386
+ pattern = r'\[Called\s+' + escaped_name + r'\s+with\s+args:\s*\{[^}]*(?:\{[^}]*\}[^}]*)*\}\s*\]'
1387
+ full_response_text = re.sub(pattern, '', full_response_text, flags=re.DOTALL)
1388
+
1389
+ # Clean up extra whitespace
1390
+ full_response_text = re.sub(r'\s+', ' ', full_response_text).strip()
1391
+
1392
+ # Key fix: also check the raw response for bracket-format tool calls
1394
+ logger.info("🔍 Checking the raw response for bracket-format tool calls...")
1394
+ raw_bracket_tool_calls = parse_bracket_tool_calls(raw_response_text)
1395
+ if raw_bracket_tool_calls and isinstance(raw_bracket_tool_calls, list):
1396
+ logger.info(f"✅ 在原始响应中发现 {len(raw_bracket_tool_calls)} 个 bracket 格式工具调用")
1397
+ tool_calls.extend(raw_bracket_tool_calls)
1398
+ else:
1399
+ logger.info("❌ 原始响应中未发现bracket格式工具调用")
1400
+
1401
+ # Deduplicate tool calls
1402
+ logger.info(f"🔄 Tool calls before deduplication: {len(tool_calls)}")
1403
+ unique_tool_calls = deduplicate_tool_calls(tool_calls)
1404
+ logger.info(f"🔄 去重后工具调用数量: {len(unique_tool_calls)}")
1405
+
1406
+ # Build the response depending on whether there are tool calls
1407
+ if unique_tool_calls:
1408
+ logger.info(f"🔧 构建工具调用响应,包含 {len(unique_tool_calls)} 个工具调用")
1409
+ for i, tc in enumerate(unique_tool_calls):
1410
+ logger.info(f"🔧 工具调用 {i}: {tc.function.get('name', 'unknown')}")
1411
+
1412
+ response_message = ResponseMessage(
1413
+ role="assistant",
1414
+ content=None, # per the OpenAI spec, content must be None when tool_calls are present
1415
+ tool_calls=unique_tool_calls
1416
+ )
1417
+ finish_reason = "tool_calls"
1418
+ else:
1419
+ logger.info("📄 构建普通文本响应")
1420
+ # 如果没有工具调用,使用清理后的文本
1421
+ content = full_response_text.strip() if full_response_text.strip() else "I understand."
1422
+ logger.info(f"📄 最终文本内容: {content[:200]}...")
1423
+
1424
+ response_message = ResponseMessage(
1425
+ role="assistant",
1426
+ content=content
1427
+ )
1428
+ finish_reason = "stop"
1429
+
1430
+ choice = Choice(
1431
+ index=0,
1432
+ message=response_message,
1433
+ finish_reason=finish_reason
1434
+ )
1435
+
1436
+ usage = create_usage_stats(
1437
+ prompt_text=" ".join([msg.get_content_text() for msg in request.messages]),
1438
+ completion_text=full_response_text if not unique_tool_calls else ""
1439
+ )
1440
+
1441
+ chat_response = ChatCompletionResponse(
1442
+ model=request.model,
1443
+ choices=[choice],
1444
+ usage=usage
1445
+ )
1446
+
1447
+ logger.info(f"📤 最终非流式响应构建完成")
1448
+ logger.info(f"📤 响应类型: {'工具调用' if unique_tool_calls else '文本内容'}")
1449
+ logger.info(f"📤 完整响应: {chat_response.model_dump_json(indent=2, exclude_none=True)}")
1450
+ return chat_response
1451
+
1452
+ except HTTPException:
1453
+ raise
1454
+ except Exception as e:
1455
+ logger.error(f"❌ 非流式响应处理出错: {e}")
1456
+ import traceback
1457
+ traceback.print_exc()
1458
+ raise HTTPException(
1459
+ status_code=500,
1460
+ detail={
1461
+ "error": {
1462
+ "message": f"Internal server error: {str(e)}",
1463
+ "type": "internal_server_error",
1464
+ "param": None,
1465
+ "code": "internal_error"
1466
+ }
1467
+ }
1468
+ )
1469
+
1470
+ async def create_streaming_response(request: ChatCompletionRequest):
1471
+ """
1472
+ Handles streaming chat completion requests.
1473
+ This function iteratively processes the binary event stream from CodeWhisperer,
1474
+ parsing events on the fly. It maintains state to correctly identify and
1475
+ stream text content or tool calls in the OpenAI-compatible format.
1476
+ """
1477
+ try:
1478
+ logger.info("开始流式响应生成...")
1479
+ response = await call_kiro_api(request)
1480
+
1481
+ async def generate_stream():
1482
+ response_id = f"chatcmpl-{uuid.uuid4()}"
1483
+ created = int(time.time())
1484
+ parser = CodeWhispererStreamParser()
1485
+
1486
+ # --- State variables ---
1487
+ is_in_tool_call = False
1488
+ sent_role = False
1489
+ current_tool_call_index = 0
1490
+ streamed_tool_calls_count = 0
1491
+ content_buffer = "" # accumulates text content
1492
+ incomplete_tool_call = "" # accumulates an incomplete tool call
1493
+
1494
+ async for chunk in response.aiter_bytes():
1495
+ events = parser.parse(chunk)
1496
+
1497
+ for event in events:
1498
+ # --- Handle structured tool call events ---
1499
+ if "name" in event and "toolUseId" in event:
1500
+ logger.info(f"🎯 STREAM: Found structured tool call event: {event}")
1501
+ # Start a new tool call
1502
+ if not is_in_tool_call:
1503
+ is_in_tool_call = True
1504
+
1505
+ # Send the chunk that opens the tool call
1506
+ delta_start = {
1507
+ "tool_calls": [{
1508
+ "index": current_tool_call_index,
1509
+ "id": event.get("toolUseId"),
1510
+ "type": "function",
1511
+ "function": {"name": event.get("name"), "arguments": ""}
1512
+ }]
1513
+ }
1514
+ # The first data chunk must include role: assistant
1515
+ if not sent_role:
1516
+ delta_start["role"] = "assistant"
1517
+ sent_role = True
1518
+
1519
+ start_chunk = ChatCompletionStreamResponse(
1520
+ id=response_id, model=request.model, created=created,
1521
+ choices=[StreamChoice(index=0, delta=delta_start)]
1522
+ )
1523
+ yield f"data: {start_chunk.model_dump_json(exclude_none=True)}\n\n"
1524
+
1525
+ # Accumulate the tool-call arguments
1526
+ if "input" in event:
1527
+ arg_chunk_str = event.get("input", "")
1528
+ if arg_chunk_str:
1529
+ arg_chunk_delta = {
1530
+ "tool_calls": [{
1531
+ "index": current_tool_call_index,
1532
+ "function": {"arguments": arg_chunk_str}
1533
+ }]
1534
+ }
1535
+ arg_chunk_resp = ChatCompletionStreamResponse(
1536
+ id=response_id, model=request.model, created=created,
1537
+ choices=[StreamChoice(index=0, delta=arg_chunk_delta)]
1538
+ )
1539
+ yield f"data: {arg_chunk_resp.model_dump_json(exclude_none=True)}\n\n"
1540
+
1541
+ # Finish the current tool call
1542
+ if event.get("stop"):
1543
+ is_in_tool_call = False
1544
+ current_tool_call_index += 1
1545
+ streamed_tool_calls_count += 1
1546
+
1547
+ # --- Handle plain text content events ---
1548
+ elif "content" in event and not is_in_tool_call:
1549
+ content_text = event.get("content", "")
1550
+ if content_text:
1551
+ content_buffer += content_text
1552
+ logger.info(f"📝 STREAM DEBUG: Buffer updated. Length: {len(content_buffer)}. Content: >>>{content_buffer}<<<")
1553
+ logger.info(f"📝 STREAM DEBUG: incomplete_tool_call: >>>{incomplete_tool_call}<<<")
1554
+
1555
+ # Handle bracket-format tool calls
1556
+ while True:
1557
+ # Find where "[Called" starts
1558
+ called_start = content_buffer.find("[Called")
1559
+ logger.info(f"🔍 BRACKET DEBUG: Searching for [Called in buffer (length={len(content_buffer)})")
1560
+ logger.info(f"🔍 BRACKET DEBUG: called_start={called_start}")
1561
+ logger.info(f"🔍 BRACKET DEBUG: Full buffer content: >>>{content_buffer}<<<")
1562
+
1563
+ if called_start == -1:
1564
+ # No tool call; send all buffered content
1565
+ logger.info(f"🔍 BRACKET DEBUG: No [Called found, sending buffer as content")
1566
+ logger.info(f"🔍 BRACKET DEBUG: incomplete_tool_call status: {bool(incomplete_tool_call)}")
1567
+ if content_buffer and not incomplete_tool_call:
1568
+ delta_content = {"content": content_buffer}
1569
+ if not sent_role:
1570
+ delta_content["role"] = "assistant"
1571
+ sent_role = True
1572
+
1573
+ logger.info(f"📤 STREAM: Sending content chunk: {delta_content}")
1574
+ content_chunk = ChatCompletionStreamResponse(
1575
+ id=response_id, model=request.model, created=created,
1576
+ choices=[StreamChoice(index=0, delta=delta_content)]
1577
+ )
1578
+ yield f"data: {content_chunk.model_dump_json(exclude_none=True)}\n\n"
1579
+ content_buffer = ""
1580
+ break
1581
+
1582
+ logger.info(f"🔍 BRACKET DEBUG: Found [Called at position {called_start}")
1583
+
1584
+ # Send the text that precedes "[Called"
1585
+ if called_start > 0:
1586
+ text_before = content_buffer[:called_start]
1587
+ logger.info(f"🔍 BRACKET DEBUG: Text before [Called: >>>{text_before}<<<")
1588
+ if text_before.strip():
1589
+ delta_content = {"content": text_before}
1590
+ if not sent_role:
1591
+ delta_content["role"] = "assistant"
1592
+ sent_role = True
1593
+
1594
+ content_chunk = ChatCompletionStreamResponse(
1595
+ id=response_id, model=request.model, created=created,
1596
+ choices=[StreamChoice(index=0, delta=delta_content)]
1597
+ )
1598
+ yield f"data: {content_chunk.model_dump_json(exclude_none=True)}\n\n"
1599
+
1600
+ # Find the matching closing "]"
1601
+ remaining_text = content_buffer[called_start:]
1602
+ logger.info(f"🔍 BRACKET DEBUG: Looking for matching ] in: >>>{remaining_text[:100]}...<<<")
1603
+ bracket_end = find_matching_bracket(remaining_text, 0)
1604
+ logger.info(f"🔍 BRACKET DEBUG: bracket_end={bracket_end}")
1605
+
1606
+ if bracket_end == -1:
1607
+ # Tool call is incomplete; keep it in the buffer
1608
+ logger.info(f"🔍 BRACKET DEBUG: Tool call incomplete, saving to incomplete_tool_call")
1609
+ logger.info(f"🔍 BRACKET DEBUG: Incomplete content: >>>{remaining_text}<<<")
1610
+ incomplete_tool_call = remaining_text
1611
+ content_buffer = ""
1612
+ break
1613
+
1614
+ # Extract the complete tool call
1615
+ tool_call_text = remaining_text[:bracket_end + 1]
1616
+ logger.info(f"🔍 BRACKET DEBUG: Extracting tool call: >>>{tool_call_text}<<<")
1617
+ parsed_call = parse_single_tool_call(tool_call_text)
1618
+ logger.info(f"🔍 BRACKET DEBUG: Parsed call result: {parsed_call}")
1619
+
1620
+ if parsed_call:
1621
+ # Send the tool call
1622
+ delta_tool = {
1623
+ "tool_calls": [{
1624
+ "index": current_tool_call_index,
1625
+ "id": parsed_call.id,
1626
+ "type": "function",
1627
+ "function": {
1628
+ "name": parsed_call.function["name"],
1629
+ "arguments": parsed_call.function["arguments"]
1630
+ }
1631
+ }]
1632
+ }
1633
+ if not sent_role:
1634
+ delta_tool["role"] = "assistant"
1635
+ sent_role = True
1636
+
1637
+ logger.info(f"📤 STREAM: Sending tool call chunk: {delta_tool}")
1638
+ tool_chunk = ChatCompletionStreamResponse(
1639
+ id=response_id, model=request.model, created=created,
1640
+ choices=[StreamChoice(index=0, delta=delta_tool)]
1641
+ )
1642
+ yield f"data: {tool_chunk.model_dump_json(exclude_none=True)}\n\n"
1643
+ current_tool_call_index += 1
1644
+ streamed_tool_calls_count += 1
1645
+ else:
1646
+ logger.error(f"❌ BRACKET DEBUG: Failed to parse tool call")
1647
+
1648
+ # Update the buffer
1649
+ content_buffer = remaining_text[bracket_end + 1:]
1650
+ incomplete_tool_call = ""
1651
+ logger.info(f"🔍 BRACKET DEBUG: Updated buffer after tool call: >>>{content_buffer}<<<")
1652
+
1653
+ # Process any remaining content
1654
+ logger.info(f"📊 STREAM END: Processing remaining content")
1655
+ logger.info(f"📊 STREAM END: incomplete_tool_call: >>>{incomplete_tool_call}<<<")
1656
+ logger.info(f"📊 STREAM END: content_buffer: >>>{content_buffer}<<<")
1657
+
1658
+ if incomplete_tool_call:
1659
+ # Try to parse the incomplete tool call again (it may be complete by now)
1660
+ logger.info(f"🔄 STREAM END: Attempting to parse incomplete tool call")
1661
+ content_buffer = incomplete_tool_call + content_buffer
1662
+ incomplete_tool_call = ""
1663
+
1664
+ # Repeat the parsing logic from above
1665
+ called_start = content_buffer.find("[Called")
1666
+ if called_start == 0:
1667
+ bracket_end = find_matching_bracket(content_buffer, 0)
1668
+ logger.info(f"🔄 STREAM END: bracket_end for incomplete={bracket_end}")
1669
+ if bracket_end != -1:
1670
+ tool_call_text = content_buffer[:bracket_end + 1]
1671
+ parsed_call = parse_single_tool_call(tool_call_text)
1672
+
1673
+ if parsed_call:
1674
+ delta_tool = {
1675
+ "tool_calls": [{
1676
+ "index": current_tool_call_index,
1677
+ "id": parsed_call.id,
1678
+ "type": "function",
1679
+ "function": {
1680
+ "name": parsed_call.function["name"],
1681
+ "arguments": parsed_call.function["arguments"]
1682
+ }
1683
+ }]
1684
+ }
1685
+ if not sent_role:
1686
+ delta_tool["role"] = "assistant"
1687
+ sent_role = True
1688
+
1689
+ logger.info(f"📤 STREAM END: Sending final tool call: {delta_tool}")
1690
+ tool_chunk = ChatCompletionStreamResponse(
1691
+ id=response_id, model=request.model, created=created,
1692
+ choices=[StreamChoice(index=0, delta=delta_tool)]
1693
+ )
1694
+ yield f"data: {tool_chunk.model_dump_json(exclude_none=True)}\n\n"
1695
+ current_tool_call_index += 1
1696
+ streamed_tool_calls_count += 1
1697
+
1698
+ content_buffer = content_buffer[bracket_end + 1:]
1699
+
1700
+ # Send any remaining content
1701
+ if content_buffer.strip():
1702
+ logger.info(f"📤 STREAM END: Sending remaining content: >>>{content_buffer}<<<")
1703
+ delta_content = {"content": content_buffer}
1704
+ if not sent_role:
1705
+ delta_content["role"] = "assistant"
1706
+ sent_role = True
1707
+
1708
+ content_chunk = ChatCompletionStreamResponse(
1709
+ id=response_id, model=request.model, created=created,
1710
+ choices=[StreamChoice(index=0, delta=delta_content)]
1711
+ )
1712
+ yield f"data: {content_chunk.model_dump_json(exclude_none=True)}\n\n"
1713
+
1714
+ # --- End of stream ---
1715
+ finish_reason = "tool_calls" if streamed_tool_calls_count > 0 else "stop"
1716
+ logger.info(f"🏁 STREAM FINISH: streamed_tool_calls_count={streamed_tool_calls_count}, finish_reason={finish_reason}")
1717
+ end_chunk = ChatCompletionStreamResponse(
1718
+ id=response_id, model=request.model, created=created,
1719
+ choices=[StreamChoice(index=0, delta={}, finish_reason=finish_reason)]
1720
+ )
1721
+ yield f"data: {end_chunk.model_dump_json(exclude_none=True)}\n\n"
1722
+
1723
+ yield "data: [DONE]\n\n"
1724
+
1725
+ return StreamingResponse(
1726
+ generate_stream(),
1727
+ media_type="text/event-stream",
1728
+ headers={
1729
+ "Cache-Control": "no-cache",
1730
+ "Connection": "keep-alive",
1731
+ "Content-Type": "text/event-stream"
1732
+ }
1733
+ )
1734
+
1735
+ except Exception as e:
1736
+ logger.error(f"❌ Streaming response generation failed: {str(e)}")
1737
+ import traceback
1738
+ traceback.print_exc()
1739
+ raise HTTPException(
1740
+ status_code=500,
1741
+ detail={
1742
+ "error": {
1743
+ "message": f"Stream generation failed: {str(e)}",
1744
+ "type": "internal_server_error",
1745
+ "param": None,
1746
+ "code": "stream_error"
1747
+ }
1748
+ }
1749
+ )
1750
+
1751
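The generator above emits tool calls in the standard three-phase delta pattern: an opening chunk carrying the id and function name with empty arguments, then argument fragments, then a final chunk whose finish_reason is "tool_calls". A minimal client-side sketch for consuming that stream with httpx is shown below; it assumes a local instance on port 7860 with the default API key, and the prompt and tool handling are purely illustrative.

```python
import json
import httpx

# Minimal SSE consumer sketch (assumes a local instance and the default API key).
url = "http://localhost:7860/v1/chat/completions"
headers = {"Authorization": "Bearer ki2api-key-2024", "Content-Type": "application/json"}
payload = {
    "model": "claude-sonnet-4-20250514",
    "stream": True,
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
}

text_parts = []
tool_calls = {}  # index -> {"id": ..., "name": ..., "arguments": ...}

with httpx.stream("POST", url, headers=headers, json=payload, timeout=120) as response:
    for line in response.iter_lines():
        if not line.startswith("data: "):
            continue
        data = line[len("data: "):]
        if data == "[DONE]":
            break
        delta = json.loads(data)["choices"][0]["delta"]
        # Plain text arrives as incremental "content" fragments.
        if delta.get("content"):
            text_parts.append(delta["content"])
        # Tool-call deltas are keyed by index; arguments arrive as string fragments.
        for tc in delta.get("tool_calls", []):
            slot = tool_calls.setdefault(tc["index"], {"id": "", "name": "", "arguments": ""})
            slot["id"] = tc.get("id") or slot["id"]
            fn = tc.get("function", {})
            slot["name"] = fn.get("name") or slot["name"]
            slot["arguments"] += fn.get("arguments", "")

print("text:", "".join(text_parts))
print("tool calls:", tool_calls)
```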
+ @app.get("/health")
1752
+ async def health_check():
1753
+ """Health check endpoint"""
1754
+ return {"status": "healthy", "service": "Ki2API", "version": "3.0.1"}
1755
+
1756
+ @app.get("/")
1757
+ async def root():
1758
+ """Root endpoint with service information"""
1759
+ return {
1760
+ "service": "Ki2API",
1761
+ "description": "OpenAI-compatible API for Claude Sonnet 4 via AWS CodeWhisperer",
1762
+ "version": "3.0.1",
1763
+ "endpoints": {
1764
+ "models": "/v1/models",
1765
+ "chat": "/v1/chat/completions",
1766
+ "health": "/health"
1767
+ },
1768
+ "features": {
1769
+ "streaming": True,
1770
+ "tools": True,
1771
+ "multiple_models": True,
1772
+ "xml_tool_parsing": True,
1773
+ "auto_token_refresh": True,
1774
+ "null_content_handling": True,
1775
+ "tool_call_deduplication": True
1776
+ }
1777
+ }
1778
+
1779
+ if __name__ == "__main__":
1780
+ import uvicorn
1781
+ import os
1782
+ port = int(os.getenv("PORT", 7860))
1783
+ uvicorn.run(app, host="0.0.0.0", port=port)
requirements.txt ADDED
@@ -0,0 +1,6 @@
 
 
 
 
 
 
 
1
+ fastapi==0.104.1
2
+ uvicorn[standard]==0.24.0
3
+ httpx==0.25.2
4
+ python-dotenv==1.0.0
5
+ pydantic==2.5.0
6
+ json_repair==0.48.0
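
Because the service exposes the standard /v1 paths, any OpenAI-compatible client can talk to it. A hedged sketch with the official openai Python package follows; note that openai is a client-side dependency and is intentionally not listed in requirements.txt, and the base URL and key below assume a local instance with the default API_KEY.

```python
# Client-side sketch only; install the `openai` package in the calling environment.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:7860/v1", api_key="ki2api-key-2024")

completion = client.chat.completions.create(
    model="claude-sonnet-4-20250514",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(completion.choices[0].message.content)
```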