# Backend - Open LLM Leaderboard
FastAPI backend for the Open LLM Leaderboard. This service is part of a larger architecture that includes a React frontend. For complete project installation, see the main README.
## Features

- REST API for managing the LLM leaderboard
- Voting and ranking system
- HuggingFace Hub integration
- Caching and performance optimizations
## Architecture
```mermaid
flowchart TD
    Client(["**Frontend**<br><br>React Application"]) --> API["**API Server**<br><br>FastAPI REST Endpoints"]

    subgraph Backend
        API --> Core["**Core Layer**<br><br>• Middleware<br>• Cache<br>• Rate Limiting"]
        Core --> Services["**Services Layer**<br><br>• Business Logic<br>• Data Processing"]

        subgraph Services Layer
            Services --> Models["**Model Service**<br><br>• Model Submission<br>• Evaluation Pipeline"]
            Services --> Votes["**Vote Service**<br><br>• Vote Management<br>• Data Synchronization"]
            Services --> Board["**Leaderboard Service**<br><br>• Rankings<br>• Performance Metrics"]
        end

        Models --> Cache["**Cache Layer**<br><br>• In-Memory Store<br>• Auto Invalidation"]
        Votes --> Cache
        Board --> Cache

        Models --> HF["**HuggingFace Hub**<br><br>• Models Repository<br>• Datasets Access"]
        Votes --> HF
        Board --> HF
    end

    style Client fill:#f9f,stroke:#333,stroke-width:2px
    style Models fill:#bbf,stroke:#333,stroke-width:2px
    style Votes fill:#bbf,stroke:#333,stroke-width:2px
    style Board fill:#bbf,stroke:#333,stroke-width:2px
    style HF fill:#bfb,stroke:#333,stroke-width:2px
```
## HuggingFace Datasets
The application uses several datasets on the HuggingFace Hub:
1. **Requests Dataset** (`{HF_ORGANIZATION}/requests`)
   - **Operations**:
     - `POST /api/models/submit`: Adds a JSON file for each new model submission
     - `GET /api/models/status`: Reads files to get model statuses
   - **Format**: One JSON file per model with submission details
   - **Updates**: On each new model submission
2. **Votes Dataset** (`{HF_ORGANIZATION}/votes`)
   - **Operations**:
     - `POST /api/votes/{model_id}`: Adds a new vote
     - `GET /api/votes/model/{provider}/{model}`: Reads a model's votes
     - `GET /api/votes/user/{user_id}`: Reads a user's votes
   - **Format**: JSONL with one vote per line
   - **Sync**: Bidirectional between local cache and Hub
3. **Contents Dataset** (`{HF_ORGANIZATION}/contents`)
   - **Operations**:
     - `GET /api/leaderboard`: Reads raw data
     - `GET /api/leaderboard/formatted`: Reads and formats data
   - **Format**: Main dataset containing all scores and metrics
   - **Updates**: Automatic after model evaluations
4. **Maintainers Highlight Dataset** (`{HF_ORGANIZATION}/maintainers-highlight`)
   - **Operations**: Read-only access for highlighted models
   - **Format**: List of models selected by maintainers
   - **Updates**: Manual, by maintainers
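For orientation, here is a minimal sketch of reading the votes dataset directly from the Hub with `huggingface_hub`. The organization name, the `votes_data.jsonl` filename, and the record layout are assumptions for illustration; check the actual dataset repository for the real file names.

```python
import json

from huggingface_hub import hf_hub_download

# Hypothetical values: substitute your organization and the actual
# filename used inside the votes dataset repository.
HF_ORGANIZATION = "open-llm-leaderboard"
VOTES_FILE = "votes_data.jsonl"

# Download the file from the Hub; repo_type="dataset" is required
# because this is a dataset repository, not a model repository.
path = hf_hub_download(
    repo_id=f"{HF_ORGANIZATION}/votes",
    filename=VOTES_FILE,
    repo_type="dataset",
)

# JSONL format: one vote object per line.
with open(path, encoding="utf-8") as f:
    votes = [json.loads(line) for line in f if line.strip()]

print(f"Loaded {len(votes)} votes")
```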
## Local Development
### Prerequisites
- Python 3.9+
- Poetry
### Standalone Installation (without Docker)
```bash
# Install dependencies
poetry install

# Setup configuration
cp .env.example .env

# Start development server
poetry run uvicorn app.asgi:app --host 0.0.0.0 --port 7860 --reload
```
The server will be available at http://localhost:7860.
## Configuration
| Variable | Description | Default |
|----------|-------------|---------|
| `ENVIRONMENT` | Environment (`development`/`production`) | `development` |
| `HF_TOKEN` | HuggingFace authentication token | - |
| `PORT` | Server port | `7860` |
| `LOG_LEVEL` | Logging level (`INFO`/`DEBUG`/`WARNING`) | `INFO` |
| `CORS_ORIGINS` | Allowed CORS origins | `["*"]` |
| `CACHE_TTL` | Cache time-to-live in seconds | `300` |
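As a minimal sketch, these variables could be read at startup with the standard library, with defaults mirroring the table above. The real backend may use a different settings mechanism (for example pydantic), so treat this as illustrative only.

```python
import json
import os

# Minimal settings loader mirroring the configuration table;
# the actual backend may load these differently.
ENVIRONMENT = os.getenv("ENVIRONMENT", "development")
HF_TOKEN = os.getenv("HF_TOKEN")  # no default: required for Hub access
PORT = int(os.getenv("PORT", "7860"))
LOG_LEVEL = os.getenv("LOG_LEVEL", "INFO")
CORS_ORIGINS = json.loads(os.getenv("CORS_ORIGINS", '["*"]'))
CACHE_TTL = int(os.getenv("CACHE_TTL", "300"))
```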
## Middleware
The backend uses several middleware layers for optimal performance and security:
- CORS Middleware: Handles Cross-Origin Resource Sharing
- GZIP Middleware: Compresses responses > 500 bytes
- Rate Limiting: Prevents API abuse
- Caching: In-memory caching with automatic invalidation
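The CORS and GZIP layers correspond to stock FastAPI/Starlette middleware; the sketch below shows how they might be registered. The rate-limiting and caching layers are custom to this backend and are not shown here.

```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.middleware.gzip import GZipMiddleware

app = FastAPI(title="Open LLM Leaderboard Backend")

# CORS: allow the origins configured via CORS_ORIGINS (["*"] by default).
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_methods=["*"],
    allow_headers=["*"],
)

# GZIP: only compress responses larger than 500 bytes.
app.add_middleware(GZipMiddleware, minimum_size=500)
```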
## Logging
The application uses a structured logging system with:
- Formatted console output
- Different log levels per component
- Request/Response logging
- Performance metrics
- Error tracking
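A minimal stdlib sketch of formatted console output with per-component log levels; the logger names below are hypothetical (derived from the file structure in the next section), and the backend's actual logging setup may differ.

```python
import logging

# Formatted console output shared by all components.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s | %(levelname)-8s | %(name)s | %(message)s",
)

# Different log levels per component (hypothetical logger names).
logging.getLogger("app.services.votes").setLevel(logging.DEBUG)
logging.getLogger("app.core.cache").setLevel(logging.WARNING)
```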
## File Structure
```
backend/
├── app/                  # Source code
│   ├── api/              # Routes and endpoints
│   │   └── endpoints/    # Endpoint handlers
│   ├── core/             # Configurations
│   ├── services/         # Business logic
│   └── utils/            # Utilities
└── tests/                # Tests
```
## API
Swagger documentation is available at http://localhost:7860/docs.
### Main Endpoints & Data Structures
#### Leaderboard
`GET /api/leaderboard/formatted` - Formatted data with computed fields and metadata

Response:

```typescript
{
  models: [{
    id: string,              // eval_name
    model: {
      name: string,          // fullname
      sha: string,           // Model sha
      precision: string,     // e.g. "fp16", "int8"
      type: string,          // e.g. "fined-tuned-on-domain-specific-dataset"
      weight_type: string,
      architecture: string,
      average_score: number,
      has_chat_template: boolean
    },
    evaluations: {
      ifeval: { name: "IFEval", value: number, normalized_score: number },  // value = raw score
      bbh: { name: "BBH", value: number, normalized_score: number },
      math: { name: "MATH Level 5", value: number, normalized_score: number },
      gpqa: { name: "GPQA", value: number, normalized_score: number },
      musr: { name: "MUSR", value: number, normalized_score: number },
      mmlu_pro: { name: "MMLU-PRO", value: number, normalized_score: number }
    },
    features: {
      is_not_available_on_hub: boolean,
      is_merged: boolean,
      is_moe: boolean,
      is_flagged: boolean,
      is_highlighted_by_maintainer: boolean
    },
    metadata: {
      upload_date: string,
      submission_date: string,
      generation: string,
      base_model: string,
      hub_license: string,
      hub_hearts: number,
      params_billions: number,
      co2_cost: number       // CO₂ cost in kg
    }
  }]
}
```
`GET /api/leaderboard` - Raw data from the HuggingFace dataset

Response:

```typescript
{
  models: [{
    eval_name: string,
    Precision: string,
    Type: string,
    "Weight type": string,
    Architecture: string,
    Model: string,
    fullname: string,
    "Model sha": string,
    "Average ⬆️": number,
    "Hub License": string,
    "Hub ❤️": number,
    "#Params (B)": number,
    "Available on the hub": boolean,
    Merged: boolean,
    MoE: boolean,
    Flagged: boolean,
    "Chat Template": boolean,
    "CO₂ cost (kg)": number,
    "IFEval Raw": number,
    IFEval: number,
    "BBH Raw": number,
    BBH: number,
    "MATH Lvl 5 Raw": number,
    "MATH Lvl 5": number,
    "GPQA Raw": number,
    GPQA: number,
    "MUSR Raw": number,
    MUSR: number,
    "MMLU-PRO Raw": number,
    "MMLU-PRO": number,
    "Maintainer's Highlight": boolean,
    "Upload To Hub Date": string,
    "Submission Date": string,
    Generation: string,
    "Base Model": string
  }]
}
```
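For example, a client could pull the formatted view and print a quick ranking. Field names follow the formatted schema above; the base URL assumes the local development server.

```python
import requests

# Fetch the formatted leaderboard from a local dev server.
resp = requests.get("http://localhost:7860/api/leaderboard/formatted", timeout=30)
resp.raise_for_status()

# Rank by the average_score field from the formatted schema.
models = resp.json()["models"]
top = sorted(models, key=lambda m: m["model"]["average_score"], reverse=True)[:5]
for rank, entry in enumerate(top, start=1):
    print(f"{rank}. {entry['model']['name']}: {entry['model']['average_score']:.2f}")
```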
#### Models
`GET /api/models/status` - Get all models grouped by status

Response:

```typescript
{
  pending: [{
    name: string,
    submitter: string,
    revision: string,
    wait_time: string,
    submission_time: string,
    status: "PENDING" | "EVALUATING" | "FINISHED",
    precision: string
  }],
  evaluating: Array<Model>,
  finished: Array<Model>
}
```
`GET /api/models/pending` - Get pending models only

`POST /api/models/submit` - Submit a model

Request:

```typescript
{
  user_id: string,
  model_id: string,
  base_model?: string,
  precision?: string,
  model_type: string
}
```

Response:

```typescript
{
  status: string,
  message: string
}
```
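A hedged submission example against the request schema above; the user and model IDs are placeholders, the `model_type` value is hypothetical, and the base URL assumes the local development server.

```python
import requests

# Submit a model for evaluation (placeholder IDs, local dev server).
payload = {
    "user_id": "my-hf-username",
    "model_id": "my-org/my-model",
    "precision": "fp16",
    "model_type": "fine-tuned",  # hypothetical value; see the schema above
}
resp = requests.post("http://localhost:7860/api/models/submit", json=payload, timeout=30)
resp.raise_for_status()
print(resp.json()["message"])
```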
`GET /api/models/{model_id}/status` - Get model status
#### Votes
`POST /api/votes/{model_id}` - Vote for a model

Request:

```typescript
{
  vote_type: "up" | "down",
  user_id: string  // HuggingFace username
}
```

Response:

```typescript
{
  success: boolean,
  message: string
}
```
`GET /api/votes/model/{provider}/{model}` - Get model votes

Response:

```typescript
{
  total_votes: number,
  up_votes: number,
  down_votes: number
}
```
`GET /api/votes/user/{user_id}` - Get user votes

Response:

```typescript
Array<{
  model_id: string,
  vote_type: string,
  timestamp: string
}>
```
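Putting the two vote endpoints together: the sketch below assumes `{model_id}` takes the `provider/model` form and uses placeholder IDs throughout.

```python
import requests

BASE = "http://localhost:7860"

# Cast an upvote for a model (placeholder IDs; assumes model_id is
# the provider/model pair).
requests.post(
    f"{BASE}/api/votes/my-org/my-model",
    json={"vote_type": "up", "user_id": "my-hf-username"},
    timeout=30,
).raise_for_status()

# Read back the aggregated votes for the same model.
votes = requests.get(f"{BASE}/api/votes/model/my-org/my-model", timeout=30).json()
print(f"{votes['up_votes']} up / {votes['down_votes']} down")
```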
## Authentication
The backend uses HuggingFace token-based authentication for secure API access. Make sure to:
- Set your `HF_TOKEN` in the `.env` file
- Include the token in API requests via Bearer authentication
- Keep your token secure and never commit it to version control
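In practice that means attaching an `Authorization: Bearer` header to each call. A minimal sketch, reading the token from the environment rather than hardcoding it:

```python
import os

import requests

# Never hardcode the token; read it from the environment (.env).
token = os.environ["HF_TOKEN"]

resp = requests.get(
    "http://localhost:7860/api/models/status",
    headers={"Authorization": f"Bearer {token}"},
    timeout=30,
)
resp.raise_for_status()
```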
## Performance
The backend implements several optimizations:
- In-memory caching with configurable TTL (Time To Live)
- Batch processing for model evaluations
- Rate limiting for API endpoints
- Efficient database queries with proper indexing
- Automatic cache invalidation for votes
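To illustrate the first and last points, here is a minimal in-memory TTL cache with explicit invalidation, in the spirit of (but not identical to) the backend's cache layer:

```python
import time
from typing import Any, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache with per-entry expiry; a sketch of the
    idea behind the cache layer, not the actual implementation."""

    def __init__(self, ttl_seconds: float = 300.0) -> None:
        self._ttl = ttl_seconds
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() > expires_at:  # expired: drop and report a miss
            del self._store[key]
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._store[key] = (time.monotonic() + self._ttl, value)

    def invalidate(self, key: str) -> None:
        # Explicit invalidation, e.g. after a new vote is recorded.
        self._store.pop(key, None)
```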