API Reference
OpenRouter's request and response schemas are very similar to the OpenAI Chat API, with only a few small differences. At a high level, OpenRouter normalizes the schema across models and providers, so you only need to learn one.
Base URL
All API requests should be sent to the following base URL:
https://openrouter.co/v1
Authentication
Every API request must include your API key in the request headers:
Authorization: Bearer sk-...
You can find or create API tokens/keys on the OpenRouter tokens page.
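A minimal sketch of attaching the key. Reading it from an environment variable named OPENROUTER_API_KEY is our convention here, not something mandated by the API:

```typescript
// Build the auth headers once and reuse them for every request.
// The key comes from the environment; never hard-code it in source.
const authHeaders = (apiKey: string): Record<string, string> => ({
  Authorization: `Bearer ${apiKey}`,
  'Content-Type': 'application/json',
});

// Example: these headers can then be passed to fetch (network call not shown).
const headers = authHeaders(process.env.OPENROUTER_API_KEY ?? 'sk-...');
```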
Main endpoints
The OpenRouter API includes the following main endpoints:
Text generation
Model information
- List available models - returns information about all available models
Parameters
OpenRouter supports a variety of parameters to control model behavior and output. See the Parameters section for details.
Error handling
The API uses standard HTTP status codes to indicate the result of a request:
- 200 - Request succeeded
- 400 - Invalid or missing request parameters
- 401 - Authentication failed (invalid or expired API key)
- 402 - Insufficient account credits
- 404 - Requested resource not found
- 429 - Rate limit exceeded
- 500 - Internal server error
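One way to act on these codes client-side; a sketch, where the mapping of codes to follow-up actions is ours, not prescribed by the API:

```typescript
// Map an HTTP status code from the API to a coarse client action.
type ApiAction = 'ok' | 'fix_request' | 'fix_key' | 'add_credits' | 'retry_later' | 'fail';

function actionForStatus(status: number): ApiAction {
  switch (status) {
    case 200: return 'ok';
    case 400: return 'fix_request';  // invalid or missing parameters
    case 401: return 'fix_key';      // invalid or expired API key
    case 402: return 'add_credits';  // insufficient credits
    case 429: return 'retry_later';  // rate limited; back off and retry
    default:  return 'fail';         // 404, 500, anything unexpected
  }
}
```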
Requests
Completions request format
Here is the request schema as a TypeScript type. This will be the body of your POST request to the /v1/chat/completions endpoint (see the quick start above for an example).
For a complete list of parameters, see the Parameters section.
// Definitions of subtypes are below
type Request = {
  // Either "messages" or "prompt" is required
  messages?: Message[];
  prompt?: string;

  // If "model" is unspecified, uses the user's default
  model?: string; // See "Supported Models" section

  // Allows forcing the model to produce a specific output format.
  // See models page and note on this docs page for which models support it.
  response_format?: { type: 'json_object' };

  stop?: string | string[];
  stream?: boolean; // Enable streaming

  // See LLM Parameters (docs.openrouter.co/api-reference/parameters)
  max_tokens?: number; // Range: [1, context_length)
  temperature?: number; // Range: [0, 2]

  // Tool calling
  // Will be passed down as-is for providers implementing OpenAI's interface.
  // For providers with custom interfaces, we transform and map the properties.
  // Otherwise, we transform the tools into a YAML template. The model responds with an assistant message.
  tools?: Tool[];
  tool_choice?: ToolChoice;

  // Advanced optional parameters
  seed?: number; // Integer only
  top_p?: number; // Range: (0, 1]
  top_k?: number; // Range: [1, Infinity) Not available for OpenAI models
  frequency_penalty?: number; // Range: [-2, 2]
  presence_penalty?: number; // Range: [-2, 2]
  repetition_penalty?: number; // Range: (0, 2]
  logit_bias?: { [key: number]: number };
  top_logprobs?: number; // Integer only
  min_p?: number; // Range: [0, 1]
  top_a?: number; // Range: [0, 1]

  // Reduce latency by providing the model with a predicted output
  // https://platform.openai.com/docs/guides/latency-optimization#use-predicted-outputs
  prediction?: { type: 'content'; content: string };

  // OpenRouter-only parameters
  // See "Prompt Transforms" section: docs.openrouter.co/message-transforms
  transforms?: string[];
  // See "Model Routing" section: docs.openrouter.co/model-routing
  models?: string[];
  route?: 'fallback';
  // See "Provider Routing" section: docs.openrouter.co/provider-routing
  provider?: ProviderPreferences;
};
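For instance, a minimal body that conforms to the Request type above; the model name and sampling values are illustrative:

```typescript
// A minimal chat request body conforming to the Request schema.
const body = {
  model: 'openai/gpt-4o',
  messages: [{ role: 'user' as const, content: 'Say hello.' }],
  max_tokens: 64,   // must stay within [1, context_length)
  temperature: 0.7, // [0, 2]
  stream: false,
};

// Serialized, this is what you POST to /v1/chat/completions.
const payload = JSON.stringify(body);
```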
// Subtypes:

type TextContent = {
  type: 'text';
  text: string;
};

type ImageContentPart = {
  type: 'image_url';
  image_url: {
    url: string; // URL or base64 encoded image data
    detail?: string; // Optional, defaults to "auto"
  };
};

type ContentPart = TextContent | ImageContentPart;

type Message =
  | {
      role: 'user' | 'assistant' | 'system';
      // ContentParts are only for the "user" role:
      content: string | ContentPart[];
      // If "name" is included, it will be prepended like this
      // for non-OpenAI models: `{name}: {content}`
      name?: string;
    }
  | {
      role: 'tool';
      content: string;
      tool_call_id: string;
      name?: string;
    };

type FunctionDescription = {
  description?: string;
  name: string;
  parameters: object; // JSON Schema object
};

type Tool = {
  type: 'function';
  function: FunctionDescription;
};

type ToolChoice =
  | 'none'
  | 'auto'
  | {
      type: 'function';
      function: {
        name: string;
      };
    };
The response_format parameter ensures you receive a structured response from the LLM. It is only supported by OpenAI models, Nitro models, and some others.
If the chosen model doesn't support a request parameter (such as logit_bias for non-OpenAI models, or top_k for OpenAI), the parameter is ignored.
The rest are forwarded to the underlying model API.
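As a sketch, a request asking for JSON output looks like this; whether the format is honored depends on the model, per the note above:

```typescript
// Request body asking the model to emit a JSON object.
const jsonRequest = {
  model: 'openai/gpt-4o',
  messages: [
    { role: 'user', content: 'List three primary colors as a JSON array under the key "colors".' },
  ],
  // Only honored by models that support structured output; ignored otherwise.
  response_format: { type: 'json_object' as const },
};
```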
Assistant prefill
OpenRouter supports asking models to complete a partial response. This can be useful for guiding the model to answer in a particular way.
To use this feature, simply include a message with role: "assistant" at the end of your messages array.
fetch('https://openrouter.co/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <OPENROUTER_API_KEY>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: 'openai/gpt-4o',
    messages: [
      { role: 'user', content: 'What is the meaning of life?' },
      { role: 'assistant', content: "I'm not sure, but my best guess is" },
    ],
  }),
});
Images and multimodal
Multimodal requests are only available via the /v1/chat/completions API, using the multi-part messages parameter. The image_url can be either a URL or base64-encoded image data.
"messages": [
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "What's in this image?"
      },
      {
        "type": "image_url",
        "image_url": {
          "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"
        }
      }
    ]
  }
]
Sample LLM response:
{
  "choices": [
    {
      "role": "assistant",
      "content": "This image depicts a scenic natural landscape featuring a long wooden boardwalk that stretches out through an expansive field of green grass. The boardwalk provides a clear path and invites exploration through the lush environment. The scene is surrounded by a variety of shrubbery and trees in the background, indicating a diverse plant life in the area."
    }
  ]
}
Image generation
Some models support native image generation. To generate images, include modalities: ["image", "text"] in your request. The model will return images in the OpenAI ContentPartImage format, where the image_url contains a base64 data URL.
{
  "model": "openai/dall-e-3",
  "messages": [
    {
      "role": "user",
      "content": "Create a beautiful sunset over mountains"
    }
  ],
  "modalities": ["image", "text"]
}
Sample response with a generated image:
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": [
          {
            "type": "text",
            "text": "Here's your requested sunset over mountains."
          },
          {
            "type": "image_url",
            "image_url": {
              "url": "data:image/png;base64,..."
            }
          }
        ]
      }
    }
  ]
}
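To save a returned image, you can split the base64 data URL into its MIME type and payload; a sketch, with the write-to-disk step omitted:

```typescript
// Split a data URL ("data:image/png;base64,....") into its parts.
// Throws if the string is not a base64 data URL.
function parseDataUrl(url: string): { mime: string; data: Buffer } {
  const match = /^data:([^;]+);base64,(.+)$/.exec(url);
  if (!match) throw new Error('not a base64 data URL');
  return { mime: match[1], data: Buffer.from(match[2], 'base64') };
}
```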
Uploading base64-encoded images
For locally stored images, you can send them to the model using base64 encoding. Here's an example:
import { readFile } from "fs/promises";

const getFlowerImage = async (): Promise<string> => {
  const imagePath = new URL("flower.jpg", import.meta.url);
  const imageBuffer = await readFile(imagePath);
  const base64Image = imageBuffer.toString("base64");
  return `data:image/jpeg;base64,${base64Image}`;
};

...

"messages": [
  {
    role: "user",
    content: [
      {
        type: "text",
        text: "What's in this image?",
      },
      {
        type: "image_url",
        image_url: {
          url: `${await getFlowerImage()}`,
        },
      },
    ],
  },
];
When sending a base64-encoded data string, make sure it includes the image's content-type. Example:
data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAgAAAAIAQMAAAD+wSzIAAAABlBMVEX///+/v7+jQ3Y5AAAADklEQVQI12P4AIX8EAgALgAD/aNpbtEAAAAASUVORK5CYII
Supported image types:
- image/png
- image/jpeg
- image/webp
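A sketch of building a data URL with the correct content-type from a file extension, covering only the three supported types above:

```typescript
// Map a file extension to one of the supported image MIME types.
const MIME_BY_EXT: Record<string, string> = {
  png: 'image/png',
  jpg: 'image/jpeg',
  jpeg: 'image/jpeg',
  webp: 'image/webp',
};

// Build a base64 data URL for an image file, rejecting unsupported types.
function toDataUrl(filename: string, bytes: Buffer): string {
  const ext = filename.split('.').pop()?.toLowerCase() ?? '';
  const mime = MIME_BY_EXT[ext];
  if (!mime) throw new Error(`unsupported image type: ${ext}`);
  return `data:${mime};base64,${bytes.toString('base64')}`;
}
```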
Responses
CompletionsResponse format
OpenRouter normalizes the schema across models and providers to comply with the OpenAI Chat API.
This means that choices is always an array, even if the model only returns one completion. Each choice will contain a delta property if a stream was requested and a message property otherwise. This makes it easier to use the same code for all models.
Here's the response schema as a TypeScript type:
// Definitions of subtypes are below
type Response = {
  id: string;
  // Depending on whether you set "stream" to "true" and
  // whether you passed in "messages" or a "prompt", you
  // will get a different output shape
  choices: (NonStreamingChoice | StreamingChoice | NonChatChoice)[];
  created: number; // Unix timestamp
  model: string;
  object: 'chat.completion' | 'chat.completion.chunk';

  system_fingerprint?: string; // Only present if the provider supports it

  // Usage data is always returned for non-streaming.
  // When streaming, you will get one usage object at
  // the end accompanied by an empty choices array.
  usage?: ResponseUsage;
};

// If the provider returns usage, we pass it down
// as-is. Otherwise, we count using the GPT-4 tokenizer.
type ResponseUsage = {
  /** Including images and tools if any */
  prompt_tokens: number;
  /** The tokens generated */
  completion_tokens: number;
  /** Sum of the above two fields */
  total_tokens: number;
};

// Subtypes:
type NonChatChoice = {
  finish_reason: string | null;
  text: string;
  error?: ErrorResponse;
};

type NonStreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  message: {
    content: string | null;
    role: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type StreamingChoice = {
  finish_reason: string | null;
  native_finish_reason: string | null;
  delta: {
    content: string | null;
    role?: string;
    tool_calls?: ToolCall[];
  };
  error?: ErrorResponse;
};

type ErrorResponse = {
  code: number; // See "Error Handling" section
  message: string;
  metadata?: Record<string, unknown>; // Contains additional error information such as provider details, the raw error message, etc.
};

type ToolCall = {
  id: string;
  type: 'function';
  function: FunctionCall;
};

// OpenAI's standard function-call shape (referenced above).
type FunctionCall = {
  name: string;
  arguments: string; // JSON-encoded arguments
};
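Because a choice carries either a message or a delta, a single helper can read the text out of both shapes; a sketch:

```typescript
// Pull the text out of a choice regardless of streaming mode.
// Returns '' when content is null (e.g. a tool-call-only chunk).
function choiceText(choice: {
  message?: { content: string | null };
  delta?: { content: string | null };
}): string {
  return choice.message?.content ?? choice.delta?.content ?? '';
}
```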
Example:
{
  "id": "gen-xxxxxxxxxxxxxx",
  "choices": [
    {
      "finish_reason": "stop", // Normalized finish_reason
      "native_finish_reason": "stop", // The raw finish_reason from the provider
      "message": {
        // will be "delta" if streaming
        "role": "assistant",
        "content": "Hello there!"
      }
    }
  ],
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 4,
    "total_tokens": 4
  },
  "model": "openai/gpt-3.5-turbo" // Could also be "anthropic/claude-2.1", etc, depending on the "model" that ends up being used
}
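When a response contains tool_calls, each call's function.arguments field is a JSON-encoded string. A sketch of dispatching it to local implementations; the handler registry here is our own construction, not part of the API:

```typescript
// Hypothetical local registry of tool implementations, keyed by tool name.
const handlers: Record<string, (args: Record<string, unknown>) => unknown> = {
  get_time: () => new Date().toISOString(),
};

// Run one tool call from the response and return its result.
function runToolCall(call: { function: { name: string; arguments: string } }): unknown {
  const handler = handlers[call.function.name];
  if (!handler) throw new Error(`unknown tool: ${call.function.name}`);
  return handler(JSON.parse(call.function.arguments));
}
```

The result would then be sent back as a role: "tool" message carrying the matching tool_call_id.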
Finish reasons
OpenRouter normalizes each model's finish_reason to one of the following values: tool_calls, stop, length, content_filter, error.
Some models and providers may have additional finish reasons. The raw finish_reason string returned by the model is available via the native_finish_reason property.
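A sketch of branching on the normalized value; the suggested follow-up actions are ours:

```typescript
// Decide what to do next from the normalized finish_reason.
function nextStep(finishReason: string | null): string {
  switch (finishReason) {
    case 'tool_calls':     return 'execute the requested tools, then continue';
    case 'stop':           return 'done';
    case 'length':         return 'raise max_tokens or continue the generation';
    case 'content_filter': return 'content was filtered; revise the prompt';
    case 'error':          return 'inspect the error field on the choice';
    default:               return 'unknown; check native_finish_reason';
  }
}
```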
Querying cost and stats
The token counts returned in the completions API response are not counted with the model's native tokenizer. Instead, they use a normalized, model-agnostic count (accomplished via the GPT-4o tokenizer), because some providers cannot reliably return native token counts. This behavior is becoming less common, however, and we may add native token counts to the response object in the future.
Credit usage and model pricing are based on native token counts (not the "normalized" token counts returned in the API response).