bibibi12345 committed
Commit ebf9765 · 1 Parent(s): 4118a69

updated readme

Files changed (1): README.md (+63 −121)

README.md CHANGED
@@ -4,55 +4,73 @@ emoji: 🔄☁️
  colorFrom: blue
  colorTo: green
  sdk: docker
- app_port: 7860
  ---

  # OpenAI to Gemini Adapter

- This service provides an OpenAI-compatible API that translates requests to Google's Vertex AI Gemini models, allowing you to use Gemini models with tools expecting an OpenAI interface.

  ## Features

  - OpenAI-compatible API endpoints (`/v1/chat/completions`, `/v1/models`).
- - Supports Google Cloud credentials via `GOOGLE_CREDENTIALS_JSON` secret (recommended for Spaces) or local file methods.
- - Supports credential rotation when using local files.
  - Handles streaming and non-streaming responses.
  - Configured for easy deployment on Hugging Face Spaces using Docker (port 7860) or locally via Docker Compose (port 8050).

  ## Hugging Face Spaces Deployment (Recommended)

  This application is ready for deployment on Hugging Face Spaces using Docker.

  1. **Create a new Space:** Go to Hugging Face Spaces and create a new Space, choosing "Docker" as the Space SDK.
- 2. **Upload Files:** Upload the `app/` directory, `Dockerfile`, and `app/requirements.txt` to your Space repository. You can do this via the web interface or using Git.
- 3. **Configure Secrets:** In your Space settings, navigate to the **Secrets** section and add the following secrets:
-    * `API_KEY`: Your desired API key for authenticating requests to this adapter service. If not set, it defaults to `123456`.
-    * `GOOGLE_CREDENTIALS_JSON`: The **entire content** of your Google Cloud service account JSON key file. Copy and paste the JSON content directly into the secret value field. **This is the required method for providing credentials on Hugging Face.**
- 4. **Deployment:** Hugging Face will automatically build and deploy the Docker container. The application will run on port 7860 as defined in the `Dockerfile` and this README's metadata.

- Your adapter service will be available at the URL provided by your Hugging Face Space (e.g., `https://your-user-name-your-space-name.hf.space`).

  ## Local Docker Setup (for Development/Testing)

  ### Prerequisites

  - Docker and Docker Compose
- - Google Cloud service account credentials with Vertex AI access

  ### Credential Setup (Local Docker)

- 1. Create a `credentials` directory in the project root:
- ```bash
- mkdir -p credentials
- ```
- 2. Add your service account JSON files to the `credentials` directory:
- ```bash
- # Example with multiple credential files
- cp /path/to/your/service-account1.json credentials/service-account1.json
- cp /path/to/your/service-account2.json credentials/service-account2.json
- ```
- The service will automatically detect and rotate through all `.json` files in this directory if the `GOOGLE_CREDENTIALS_JSON` environment variable is *not* set.
- 3. Alternatively, set the `GOOGLE_APPLICATION_CREDENTIALS` environment variable *in your local environment or `docker-compose.yml`* to the *path* of a single credential file (used as a fallback if the other methods fail).

  ### Running Locally

@@ -61,7 +79,6 @@ Start the service using Docker Compose:
  ```bash
  docker-compose up -d
  ```
-
  The service will be available at `http://localhost:8050` (as defined in `docker-compose.yml`).

  ## API Usage
@@ -70,34 +87,27 @@ The service implements OpenAI-compatible endpoints:

  - `GET /v1/models` - List available models
  - `POST /v1/chat/completions` - Create a chat completion
- - `GET /health` - Health check endpoint (includes credential status)

- All endpoints require authentication using an API key in the Authorization header.

  ### Authentication

- The service requires an API key for authentication.
-
- To authenticate, include the API key in the `Authorization` header using the `Bearer` token format:
-
- ```
- Authorization: Bearer YOUR_API_KEY
- ```
-
- Replace `YOUR_API_KEY` with the key you configured (either via the `API_KEY` secret/environment variable or the default `123456`).

  ### Example Requests

  *(Replace `YOUR_ADAPTER_URL` with your Hugging Face Space URL or `http://localhost:8050` if running locally)*

  #### Basic Request
-
  ```bash
  curl -X POST YOUR_ADAPTER_URL/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
- "model": "gemini-1.5-pro",
  "messages": [
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Hello, how are you?"}
@@ -106,97 +116,29 @@ curl -X POST YOUR_ADAPTER_URL/v1/chat/completions \
  }'
  ```

- #### Grounded Search Request
-
- ```bash
- curl -X POST YOUR_ADAPTER_URL/v1/chat/completions \
- -H "Content-Type: application/json" \
- -H "Authorization: Bearer YOUR_API_KEY" \
- -d '{
- "model": "gemini-2.5-pro-exp-03-25-search",
- "messages": [
- {"role": "system", "content": "You are a helpful assistant with access to the latest information."},
- {"role": "user", "content": "What are the latest developments in quantum computing?"}
- ],
- "temperature": 0.2
- }'
- ```
-
- ### Supported Models
-
- The API supports the following Vertex AI Gemini models:
-
- | Model ID | Description |
- | --------------------------------- | -------------------------------------------------- |
- | `gemini-2.5-pro-exp-03-25` | Gemini 2.5 Pro Experimental (March 25) |
- | `gemini-2.5-pro-exp-03-25-search` | Gemini 2.5 Pro with Google Search grounding |
- | `gemini-2.0-flash` | Gemini 2.0 Flash |
- | `gemini-2.0-flash-search` | Gemini 2.0 Flash with Google Search grounding |
- | `gemini-2.0-flash-lite` | Gemini 2.0 Flash Lite |
- | `gemini-2.0-flash-lite-search` | Gemini 2.0 Flash Lite with Google Search grounding |
- | `gemini-2.0-pro-exp-02-05` | Gemini 2.0 Pro Experimental (February 5) |
- | `gemini-1.5-flash` | Gemini 1.5 Flash |
- | `gemini-1.5-flash-8b` | Gemini 1.5 Flash 8B |
- | `gemini-1.5-pro` | Gemini 1.5 Pro |
- | `gemini-1.0-pro-002` | Gemini 1.0 Pro |
- | `gemini-1.0-pro-vision-001` | Gemini 1.0 Pro Vision |
- | `gemini-embedding-exp` | Gemini Embedding Experimental |
-
- Models with the `-search` suffix enable grounding with Google Search using dynamic retrieval.
-
- ### Supported Parameters
-
- The API supports common OpenAI-compatible parameters, mapping them to Vertex AI where possible:
-
- | OpenAI Parameter | Vertex AI Parameter | Description |
- | ------------------- | ------------------- | ------------------------------------------------------ |
- | `temperature` | `temperature` | Controls randomness (0.0 to 1.0) |
- | `max_tokens` | `max_output_tokens` | Maximum number of tokens to generate |
- | `top_p` | `top_p` | Nucleus sampling parameter (0.0 to 1.0) |
- | `top_k` | `top_k` | Top-k sampling parameter |
- | `stop` | `stop_sequences` | List of strings that stop generation when encountered |
- | `presence_penalty` | `presence_penalty` | Penalizes repeated tokens |
- | `frequency_penalty` | `frequency_penalty` | Penalizes frequent tokens |
- | `seed` | `seed` | Random seed for deterministic generation |
- | `logprobs` | `logprobs` | Number of log probabilities to return |
- | `n` | `candidate_count` | Number of completions to generate |
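The parameter translation described in the table above can be sketched as a small helper. This is an illustration only (the function and dictionary names are assumed, not taken from the adapter's code):

```python
# Sketch: translate common OpenAI-style parameters to their Vertex AI
# equivalents, following the mapping documented in the table above.
# Names are illustrative; the adapter's real implementation may differ.
OPENAI_TO_VERTEX = {
    "temperature": "temperature",
    "max_tokens": "max_output_tokens",
    "top_p": "top_p",
    "top_k": "top_k",
    "stop": "stop_sequences",
    "presence_penalty": "presence_penalty",
    "frequency_penalty": "frequency_penalty",
    "seed": "seed",
    "logprobs": "logprobs",
    "n": "candidate_count",
}

def to_vertex_params(openai_params: dict) -> dict:
    """Keep only recognized keys and rename them for Vertex AI."""
    return {
        OPENAI_TO_VERTEX[k]: v
        for k, v in openai_params.items()
        if k in OPENAI_TO_VERTEX
    }
```

Unrecognized keys are silently dropped in this sketch; a real adapter might instead reject or log them.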

  ## Credential Handling Priority

- The application loads Google Cloud credentials in the following order:
-
- 1. **`GOOGLE_CREDENTIALS_JSON` Environment Variable / Secret:** Checks for the JSON *content* directly in this variable (required for Hugging Face).
- 2. **`credentials/` Directory (Local Only):** Looks for `.json` files in the directory specified by `CREDENTIALS_DIR` (default: `/app/credentials` inside the container). Rotates through found files. Used if `GOOGLE_CREDENTIALS_JSON` is not set.
- 3. **`GOOGLE_APPLICATION_CREDENTIALS` Environment Variable (Local Only):** Checks for a *file path* specified by this variable. Used as a fallback if the above methods fail.

- ## Environment Variables / Secrets

- - `API_KEY`: API key for authentication (default: `123456`). **Required as a Secret on Hugging Face.**
- - `GOOGLE_CREDENTIALS_JSON`: **(Required Secret on Hugging Face)** The full JSON content of your service account key. Takes priority over other methods.
- - `CREDENTIALS_DIR` (Local Only): Directory containing credential files (default: `/app/credentials` in the container). Used if `GOOGLE_CREDENTIALS_JSON` is not set.
- - `GOOGLE_APPLICATION_CREDENTIALS` (Local Only): Path to a *specific* credential file. Used as a fallback if the above methods fail.
- - `PORT`: Not needed in the `CMD` configuration (the container uses 7860). Hugging Face provides this automatically; `docker-compose.yml` maps 8050 locally.

- ## Health Check

- You can check the status of the service using the health endpoint:
-
- ```bash
- curl YOUR_ADAPTER_URL/health -H "Authorization: Bearer YOUR_API_KEY"
- ```
-
- This returns information about the credential status:
-
- ```json
- {
-   "status": "ok",
-   "credentials": {
-     "available": 1,
-     "files": [],
-     "current_index": 0
-   }
- }
- ```
- (`available` is `1` when credentials come from the JSON secret, or the file count when loaded from `CREDENTIALS_DIR`; `files` is populated only for the `CREDENTIALS_DIR` method.)

  ## License

  colorFrom: blue
  colorTo: green
  sdk: docker
+ app_port: 7860 # Port exposed by Dockerfile, used by Hugging Face Spaces
  ---

  # OpenAI to Gemini Adapter

+ This service provides an OpenAI-compatible API that translates requests to Google's Vertex AI Gemini models, allowing you to use Gemini models with tools expecting an OpenAI interface. The codebase has been refactored for modularity and improved maintainability.

  ## Features

  - OpenAI-compatible API endpoints (`/v1/chat/completions`, `/v1/models`).
+ - Modular codebase located within the `app/` directory.
+ - Centralized environment variable management in `app/config.py`.
+ - Supports Google Cloud credentials via:
+   - `GOOGLE_CREDENTIALS_JSON` environment variable (containing the JSON key content).
+   - Service account JSON files placed in a specified directory (defaults to `credentials/` in the project root, mapped to `/app/credentials` in the container).
+ - Supports credential rotation when using multiple local credential files.
  - Handles streaming and non-streaming responses.
  - Configured for easy deployment on Hugging Face Spaces using Docker (port 7860) or locally via Docker Compose (port 8050).
+ - Supports Vertex AI Express Mode via the `VERTEX_EXPRESS_API_KEY` environment variable.

  ## Hugging Face Spaces Deployment (Recommended)

  This application is ready for deployment on Hugging Face Spaces using Docker.

  1. **Create a new Space:** Go to Hugging Face Spaces and create a new Space, choosing "Docker" as the Space SDK.
+ 2. **Upload Files:** Add all project files (including the `app/` directory, `.gitignore`, `Dockerfile`, `docker-compose.yml`, and `requirements.txt`) to your Space repository. You can do this via the web interface or using Git.
+ 3. **Configure Secrets:** In your Space settings, navigate to the **Secrets** section and add the following:
+    * `API_KEY`: Your desired API key for authenticating requests to this adapter service. (Default: `123456` if not set, as per `app/config.py`.)
+    * `GOOGLE_CREDENTIALS_JSON`: The **entire content** of your Google Cloud service account JSON key file. This is the primary method for providing credentials on Hugging Face.
+    * `VERTEX_EXPRESS_API_KEY` (Optional): If you have a Vertex AI Express API key and want to use eligible models in Express Mode.
+    * Other environment variables (see the "Key Environment Variables" section below) can also be set as secrets if you need to override defaults (e.g., `FAKE_STREAMING`).
+ 4. **Deployment:** Hugging Face will automatically build and deploy the Docker container. The application will run on port 7860.

+ Your adapter service will be available at the URL provided by your Hugging Face Space.

  ## Local Docker Setup (for Development/Testing)

  ### Prerequisites

  - Docker and Docker Compose
+ - Google Cloud service account credentials with Vertex AI access (unless you are using Vertex Express Mode exclusively).

  ### Credential Setup (Local Docker)

+ The application uses `app/config.py` to manage environment variables. You can set these in a `.env` file at the project root (which is ignored by git) or directly in your `docker-compose.yml` for local development.
+
+ 1. **JSON content via environment variable (recommended for consistency with Spaces):**
+    * Set the `GOOGLE_CREDENTIALS_JSON` environment variable to the full JSON content of your service account key.
+ 2. **Credential files in a directory:**
+    * If `GOOGLE_CREDENTIALS_JSON` is *not* set, the adapter looks for service account JSON files in the directory specified by the `CREDENTIALS_DIR` environment variable.
+    * The default `CREDENTIALS_DIR` is `/app/credentials` inside the container.
+    * Create a `credentials` directory in your project root: `mkdir -p credentials`
+    * Place your service account JSON key files (e.g., `my-project-creds.json`) into this `credentials/` directory. The `docker-compose.yml` mounts this local directory to `/app/credentials` in the container.
+    * The service will automatically detect and rotate through all `.json` files in this directory.
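The file-discovery and rotation behaviour in Method 2 could be sketched as follows. The function names are assumed for illustration; the adapter's real `CredentialManager` may work differently:

```python
# Sketch: discover service-account key files and rotate through them.
# Illustrative only; the adapter's actual CredentialManager may differ.
import glob
import itertools
import os

def discover_credential_files(credentials_dir: str = "/app/credentials") -> list:
    """Return all .json files in the directory, sorted for a stable rotation order."""
    return sorted(glob.glob(os.path.join(credentials_dir, "*.json")))

def rotation(files: list):
    """Yield credential files round-robin, e.g. one per request."""
    return itertools.cycle(files)
```

Sorting gives a deterministic rotation order regardless of filesystem listing order.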
+
+ ### Environment Variables for Local Docker (`.env` file or `docker-compose.yml`)
+
+ Create a `.env` file in the project root, or set these in your `docker-compose.override.yml` (if you use one) or `docker-compose.yml`:
+
+ ```env
+ API_KEY="your_secure_api_key_here" # Replace with your actual key, or leave unset to use the default
+ # GOOGLE_CREDENTIALS_JSON='{"type": "service_account", ...}' # Option 1: paste JSON content
+ # CREDENTIALS_DIR="/app/credentials" # Option 2: default path if GOOGLE_CREDENTIALS_JSON is not set
+ # VERTEX_EXPRESS_API_KEY="your_vertex_express_key" # Optional
+ # FAKE_STREAMING="false" # Optional, for debugging
+ # FAKE_STREAMING_INTERVAL="1.0" # Optional, for debugging
+ ```

  ### Running Locally

  Start the service using Docker Compose:

  ```bash
  docker-compose up -d
  ```
  The service will be available at `http://localhost:8050` (as defined in `docker-compose.yml`).

  ## API Usage

  The service implements OpenAI-compatible endpoints:

  - `GET /v1/models` - List available models
  - `POST /v1/chat/completions` - Create a chat completion
+ - `GET /` - Basic status endpoint

+ All API endpoints require authentication using an API key in the Authorization header.

  ### Authentication

+ Include the API key in the `Authorization` header using the `Bearer` token format:
+
+ `Authorization: Bearer YOUR_API_KEY`
+
+ Replace `YOUR_API_KEY` with the key configured via the `API_KEY` environment variable (or the default).
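A minimal sketch of the Bearer check described above. The helper name `is_authorized` is hypothetical; the adapter's real validation lives in the app code:

```python
# Sketch: validate an incoming Authorization header against the configured
# API key. Hypothetical helper, shown for illustration only.
import os
from typing import Optional

def is_authorized(authorization: Optional[str]) -> bool:
    expected = os.environ.get("API_KEY", "123456")  # default noted in this README
    if not authorization or not authorization.startswith("Bearer "):
        return False
    return authorization[len("Bearer "):].strip() == expected
```

A missing header, a missing `Bearer` prefix, or a mismatched key all fail the check.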

  ### Example Requests

  *(Replace `YOUR_ADAPTER_URL` with your Hugging Face Space URL or `http://localhost:8050` if running locally)*

  #### Basic Request
  ```bash
  curl -X POST YOUR_ADAPTER_URL/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -d '{
+ "model": "gemini-1.5-pro",
  "messages": [
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "Hello, how are you?"}
  }'
  ```

+ ### Supported Models & Parameters
+
+ Refer to the `GET /v1/models` endpoint output and the original documentation for the most up-to-date list of supported models and parameters. The adapter aims to map common OpenAI parameters to their Vertex AI equivalents.
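The `-search` model-naming convention used by this adapter (a `-search` suffix enables Google Search grounding) can be sketched as a tiny helper. The function name is hypothetical:

```python
# Sketch: interpret a model ID under the adapter's naming convention,
# where a "-search" suffix means Google Search grounding is enabled.
# Hypothetical helper for illustration only.
def parse_model_id(model_id: str):
    """Return (base_model, grounding_enabled)."""
    if model_id.endswith("-search"):
        return model_id[: -len("-search")], True
    return model_id, False
```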

  ## Credential Handling Priority

+ The application (via `app/config.py` and helper modules) prioritizes credentials as follows:
+
+ 1. **Vertex AI Express Mode (`VERTEX_EXPRESS_API_KEY` env var):** If this key is set and the requested model is eligible for Express Mode, it is used.
+ 2. **Service Account Credentials (Rotated):** If Express Mode is not used or not applicable:
+    * **`GOOGLE_CREDENTIALS_JSON` environment variable:** If set, its JSON content is parsed. A single JSON object or multiple comma-separated JSON objects are supported. These are loaded into the `CredentialManager`.
+    * **Files in `CREDENTIALS_DIR`:** The `CredentialManager` scans the directory specified by `CREDENTIALS_DIR` (default: `credentials/`, mapped to `/app/credentials` in Docker) for `.json` key files.
+    * The `CredentialManager` then rotates through all successfully loaded service account credentials (from `GOOGLE_CREDENTIALS_JSON` and files in `CREDENTIALS_DIR`) on each request.
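The `GOOGLE_CREDENTIALS_JSON` parsing rule described above (one JSON object, or several comma-separated objects) could be sketched like this. The function name is assumed; the adapter's actual parsing may differ:

```python
# Sketch: parse GOOGLE_CREDENTIALS_JSON, which may hold a single JSON
# object or several comma-separated objects. Illustrative only.
import json

def parse_credentials_json(content: str) -> list:
    content = content.strip()
    try:
        parsed = json.loads(content)          # single object (or a JSON array)
    except json.JSONDecodeError:
        parsed = json.loads(f"[{content}]")   # comma-separated objects
    return parsed if isinstance(parsed, list) else [parsed]
```

Wrapping the raw value in `[...]` turns comma-separated objects into a valid JSON array, so both forms normalize to a list of credential dicts.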

+ ## Key Environment Variables
+
+ These are sourced by `app/config.py`:
+
+ - `API_KEY`: API key for authenticating to this adapter service. (Default: `123456`)
+ - `GOOGLE_CREDENTIALS_JSON`: Full JSON content of your service account key(s); takes priority over the other service account methods.
+ - `CREDENTIALS_DIR`: Directory for service account JSON files if `GOOGLE_CREDENTIALS_JSON` is not set. (Default: `/app/credentials` inside the container)
+ - `VERTEX_EXPRESS_API_KEY`: Optional API key for using Vertex AI Express Mode with compatible models.
+ - `FAKE_STREAMING`: Set to `"true"` to simulate streaming for non-streaming models (for testing). (Default: `"false"`)
+ - `FAKE_STREAMING_INTERVAL`: Interval in seconds between keep-alive messages during fake streaming. (Default: `1.0`)
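A sketch of how `app/config.py` might source these variables, using only the names and defaults documented above (the actual module may differ):

```python
# Sketch: environment-variable sourcing with the defaults documented in
# this README. Illustrative; app/config.py is the authoritative source.
import os

API_KEY = os.environ.get("API_KEY", "123456")
GOOGLE_CREDENTIALS_JSON = os.environ.get("GOOGLE_CREDENTIALS_JSON")      # optional
CREDENTIALS_DIR = os.environ.get("CREDENTIALS_DIR", "/app/credentials")
VERTEX_EXPRESS_API_KEY = os.environ.get("VERTEX_EXPRESS_API_KEY")        # optional
FAKE_STREAMING = os.environ.get("FAKE_STREAMING", "false").lower() == "true"
FAKE_STREAMING_INTERVAL = float(os.environ.get("FAKE_STREAMING_INTERVAL", "1.0"))
```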

  ## License