Benchmarks - Python SDK

Benchmarks method reference

The Python SDK and docs are currently in beta. Report issues on GitHub.

Overview

Benchmarks endpoints

Available Operations

get_benchmarks

Unified benchmark endpoint that aggregates scores from multiple benchmark sources (Artificial Analysis, Design Arena). Filter by source to reproduce the exact shapes from the legacy per-source endpoints, or use task_type to find models suited for specific workloads. Authenticate with any valid OpenRouter API key. Rate-limited to 30 requests/minute per key and 500 requests/day per account.

Example Usage

```python
from openrouter import OpenRouter
import os

with OpenRouter(
    http_referer="<value>",
    x_open_router_title="<value>",
    x_open_router_categories="<value>",
    api_key=os.getenv("OPENROUTER_API_KEY", ""),
) as open_router:

    res = open_router.benchmarks.get_benchmarks(source="artificial-analysis", max_results=20)

    # Handle response
    print(res)
```
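
The description above states limits of 30 requests/minute per key and 500 requests/day per account. A client that polls this endpoint may want to stay under those limits on its own side. The following is a minimal sketch of one way to do that with a sliding-window counter. `SlidingWindowLimiter` and `throttled` are hypothetical helpers, not part of the SDK, and how the server actually measures its windows is not documented here, so treat this as an approximation. The daily limit applies per account, so a limiter inside a single process cannot see requests made elsewhere.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical helper: allow at most `limit` calls per `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self):
        """Return seconds to wait before the next call (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window:
            self.calls.popleft()
        if len(self.calls) < self.limit:
            return 0.0
        # The oldest call must age out before another one fits.
        return self.window - (now - self.calls[0])

    def record(self):
        """Note that a call is being made now."""
        self.calls.append(self.clock())


# Limits taken from the endpoint description.
per_minute = SlidingWindowLimiter(limit=30, window=60)
per_day = SlidingWindowLimiter(limit=500, window=24 * 60 * 60)


def throttled(call):
    """Run `call()` once both limiters allow it, sleeping first if needed."""
    delay = max(per_minute.wait_time(), per_day.wait_time())
    if delay > 0:
        time.sleep(delay)
    per_minute.record()
    per_day.record()
    return call()
```

With a client open as in the example above, a request could then be wrapped as `throttled(lambda: open_router.benchmarks.get_benchmarks(source="artificial-analysis", max_results=20))`.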

Parameters

| Parameter | Type | Required | Description | Example |
| --- | --- | --- | --- | --- |
| source | operations.Source | ✔️ | Benchmark source to query. Determines the shape of the returned items. | artificial-analysis |
| http_referer | Optional[str] | | The app identifier should be your app’s URL and is used as the primary identifier for rankings. This is used to track API usage per application. | |
| x_open_router_title | Optional[str] | | The app display name allows you to customize how your app appears in OpenRouter’s dashboard. | |
| x_open_router_categories | Optional[str] | | Comma-separated list of app categories (e.g. “cli-agent,cloud-agent”). Used for marketplace rankings. | |
| task_type | Optional[operations.TaskType] | | Filter results by task type. For Artificial Analysis, maps to the corresponding index. For Design Arena, maps to the matching category. | coding |
| arena | Optional[operations.Arena] | | Design Arena only: arena to query. Defaults to models when source is design-arena. | models |
| category | Optional[str] | | Design Arena only: category within the arena (e.g. codecategories, uicomponent, gamedev, 3d, dataviz, image, video, svg). When omitted, returns all categories. | codecategories |
| max_results | Optional[int] | | Max results to return (1–100, default 50). | 20 |
| retries | Optional[utils.RetryConfig] | | Configuration to override the default retry behavior of the client. | |
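
Two constraints in the parameter list are easy to get wrong: `max_results` must fall between 1 and 100, and `arena` and `category` are described as Design Arena only. The sketch below checks both before a request is sent. `build_benchmark_params` is a hypothetical helper, not an SDK function, and rejecting `arena` or `category` for other sources is a client-side choice made here; the source does not say how the server treats them.

```python
DESIGN_ARENA = "design-arena"


def build_benchmark_params(source, task_type=None, arena=None,
                           category=None, max_results=None):
    """Hypothetical helper: build keyword arguments for get_benchmarks."""
    # Documented range for max_results is 1 to 100 (default 50 when omitted).
    if max_results is not None and not 1 <= max_results <= 100:
        raise ValueError("max_results must be between 1 and 100")

    # arena and category are documented as Design Arena only.
    if source != DESIGN_ARENA and (arena is not None or category is not None):
        raise ValueError("arena and category apply only to source='design-arena'")

    params = {
        "source": source,
        "task_type": task_type,
        "arena": arena,
        "category": category,
        "max_results": max_results,
    }
    # Leave out unset options so the documented defaults apply.
    return {name: value for name, value in params.items() if value is not None}


# Example values are the ones shown in the parameter list.
params = build_benchmark_params("design-arena", arena="models", category="svg")
print(params)
```

The resulting dictionary can be unpacked into the call, for example `open_router.benchmarks.get_benchmarks(**params)`.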

Response

components.UnifiedBenchmarksResponse

Errors

| Error Type | Status Code | Content Type |
| --- | --- | --- |
| errors.BadRequestResponseError | 400 | application/json |
| errors.UnauthorizedResponseError | 401 | application/json |
| errors.TooManyRequestsResponseError | 429 | application/json |
| errors.InternalServerResponseError | 500 | application/json |
| errors.OpenRouterDefaultError | 4XX, 5XX | */* |
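
Of the errors listed, 429 and 500 are the ones where repeating the same request may succeed later, while 400 and 401 point to a problem with the request or the key. The SDK's own `retries` parameter is likely the simpler route. For callers who prefer explicit control, here is a minimal sketch of exponential backoff. `call_with_backoff` is a hypothetical helper, and the delay values are arbitrary choices, not SDK defaults.

```python
import time


def call_with_backoff(call, retry_on, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Hypothetical helper: run `call()`, retrying when it raises `retry_on`.

    `retry_on` is an exception class or tuple of classes. Delays double
    after each failed attempt: base_delay, 2 * base_delay, 4 * base_delay, ...
    """
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Assumed usage with the SDK (import path not confirmed by this page):
#
#   from openrouter import errors
#
#   res = call_with_backoff(
#       lambda: open_router.benchmarks.get_benchmarks(source="artificial-analysis"),
#       retry_on=(errors.TooManyRequestsResponseError,
#                 errors.InternalServerResponseError),
#   )
```

Retrying on 429 only helps if the delay is long enough for the limit window to clear, so a larger `base_delay` may be appropriate for that case.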