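Create a Response

POST /v1/responses

Generates a response from a large language model for the given input. The Responses API provides a unified interface that supports text generation, multi-turn continuation, tool calling, reasoning configuration, and optional streaming output.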

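Authorization

Every REST API request must carry your access token in the `Authorization` header, together with a `Content-Type` header. Use the following format:

--header 'Authorization: Bearer <your_token_here>'
--header 'Content-Type: application/json'

Note: Replace <your_token_here> with your actual access token. The token carries the information the server uses to verify your identity and permissions. You can create an API key here.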
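
The two headers above can be assembled in code as follows. This is a minimal sketch rather than an official client: the helper name `build_headers` and the environment variable `LUCHEN_API_TOKEN` are assumptions made for illustration.

```python
import os


def build_headers(token):
    """Return the two headers the Authorization section requires."""
    if not token:
        raise ValueError("an access token is required")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }


# LUCHEN_API_TOKEN is a hypothetical variable name chosen for this sketch;
# the placeholder is used when the variable is not set.
headers = build_headers(os.environ.get("LUCHEN_API_TOKEN", "<your_token_here>"))
print(sorted(headers))
```

Reading the token from the environment keeps it out of source control; any other secret store works equally well.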

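Request body

Field	Type	Required	Description
`background`	boolean, null	No	Whether to run the model response in the background.
`include`	string[]	No	Additional output data to include in the response, for example `file_search_call.results`, `message.output_text.logprobs`, `web_search_call.action.sources`, or `reasoning.encrypted_content`.
`model`	string	Yes	Name of the model used to generate the response.
`input`	string, string[]	Yes	Input used to generate the response: a single string or an array of strings.
`instructions`	string	No	Optional instructions that guide response generation, such as specific directions or output constraints.
`stream`	boolean, null	No	Whether to stream the response as it is generated (default: false).
`max_output_tokens`	integer, null	No	Maximum number of tokens to generate in the response.
`max_tool_calls`	integer, null	No	Maximum number of tool calls allowed while generating the response; limits how much the model uses external tools.
`temperature`	number, null	No	Sampling temperature used when generating the response.
`top_p`	number, null	No	Nucleus sampling probability; balances randomness and coherence in the generated response.
`text`	object	No	Configuration for text output, covering plain text and structured JSON output through `text.format`.
`parallel_tool_calls`	boolean, null	No	Whether to allow parallel tool calls during generation, so that several tools can be invoked at once.
`previous_response_id`	string, null	No	ID of a previous response to use as context for the new response; enables continuing a conversation or following up on an earlier interaction.
`metadata`	object	No	Additional metadata to attach to the request, for purposes such as tracking, logging, or supplying extra context.
`reasoning`	object	No	Configuration for reasoning models; enables or customizes the model's reasoning during generation.
`store`	boolean, null	No	Whether to store the generated response for later reference or analysis (default: false).
`tool_choice`	string, null	No	Strategy the model uses to choose among tools during generation, for example `auto`, `none`, or `required`.
`tools`	array, null	No	Tool definitions the model may call during generation. Each definition includes the tool's name, description, parameters, and other relevant information.
`truncation`	string, null	No	Truncation strategy for the model response. Supported values are `auto` and `disabled` (default).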
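
As an illustration of how the fields in this table combine, the sketch below builds a request body, enforces the two required fields, and omits options left as `None`. The helper `build_request_body` is hypothetical. The follow-up request reuses the response ID from this page's sample response; whether the earlier response must have been created with `store` set to true is not stated here, so verify that before relying on continuation.

```python
import json


def build_request_body(model, input_value, **options):
    """Build a body with the required fields, dropping options that are None."""
    if not model:
        raise ValueError("model is required")
    if not input_value:
        raise ValueError("input is required")
    body = {"model": model, "input": input_value}
    body.update({key: value for key, value in options.items() if value is not None})
    return body


# First turn: only the options that are actually set end up in the body.
first = build_request_body(
    "minimax/minimax-m2.5",
    "What is a polymer?",
    max_output_tokens=100,
    temperature=None,  # left unset, so it is omitted
)

# Follow-up turn: previous_response_id points at an earlier response.
follow_up = build_request_body(
    "minimax/minimax-m2.5",
    "Give one everyday example.",
    previous_response_id="resp_e63095aef9bc4d7292b769edb2cb6583",
)
print(json.dumps(follow_up, indent=2))
```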

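Tool configuration

Currently the API supports only function tools. Each object in the `tools` array should have the following structure:

Field	Type	Required	Description
`type`	string	Yes	Tool type. Currently only `function` is supported.
`name`	string	Yes	Name of the function the model can call.
`description`	string, null	No	Description of what the function does, which the model uses to decide when and how to call it.
`parameters`	object	Yes	JSON Schema object that defines the function's parameters.
`strict`	boolean, null	No	Whether the model should strictly follow the provided parameter schema when calling the function.

Example JSON Schema for `parameters`: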

{
  "type": "object",
  "properties": {
    "location": {
      "type": "string",
      "description": "The city and country, for example: Singapore, Singapore"
    }
  },
  "required": ["location"],
  "additionalProperties": false
}
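
To show where such a schema sits in a full request, here is a sketch that wraps it in a function tool and attaches the tool to a payload. The function name `get_weather`, its description, and the helper `function_tool` are hypothetical and used only for illustration.

```python
import json


def function_tool(name, description, parameters, strict=None):
    """Build one entry of the `tools` array from the fields in the tool table."""
    tool = {
        "type": "function",  # the only tool type this page lists
        "name": name,
        "parameters": parameters,
    }
    if description is not None:
        tool["description"] = description
    if strict is not None:
        tool["strict"] = strict
    return tool


# Same JSON Schema as the example above, written as a Python dict.
location_schema = {
    "type": "object",
    "properties": {
        "location": {
            "type": "string",
            "description": "The city and country, for example: Singapore, Singapore",
        }
    },
    "required": ["location"],
    "additionalProperties": False,
}

# `get_weather` is a hypothetical function name.
weather_tool = function_tool(
    "get_weather",
    "Look up the current weather for a location.",
    location_schema,
    strict=True,
)

payload = {
    "model": "minimax/minimax-m2.5",
    "input": "What is the weather in Singapore?",
    "tools": [weather_tool],
    "tool_choice": "auto",
    "max_tool_calls": 1,
}
print(json.dumps(payload, indent=2))
```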

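Reasoning configuration

Field	Type	Required	Description
`effort`	string	No	Reasoning effort level for models that support reasoning, for example `minimal`, `low`, `medium`, or `high`.
`summary`	string, null	No	Controls whether a reasoning summary is generated, when the selected model supports it.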
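
A request opts into reasoning by adding a `reasoning` object to its body. The sketch below sets only `effort`; `summary` is left out because this page does not list its accepted values. The helper `with_reasoning` is hypothetical.

```python
def with_reasoning(payload, effort):
    """Return a copy of `payload` with a `reasoning` object attached."""
    updated = dict(payload)  # shallow copy so the caller's dict is untouched
    updated["reasoning"] = {"effort": effort}
    return updated


base = {
    "model": "minimax/minimax-m2.5",
    "input": "Explain the concept of a polymer in simple terms.",
}
request_body = with_reasoning(base, "low")
print(request_body["reasoning"])
```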

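Text configuration

The `text` object configures the format and verbosity of text output.

Field	Type	Required	Description
`format`	object	No	Response format. Defaults to `{ "type": "text" }`. Use `{ "type": "json_schema" }` for structured output, or `{ "type": "json_object" }` for JSON mode.
`verbosity`	string, null	No	Controls how verbose the output is. Supported values are `low`, `medium`, and `high`.

The `format` object should have the following structure:

Field	Type	Required	Description
`type`	string	Yes	Format type. Supported values are `text`, `json_object`, and `json_schema`.
`name`	string	No	Name of the JSON Schema response format. Required when `type` is `json_schema`.
`description`	string	No	Description of the response format, which the model uses to decide how to respond.
`schema`	object	No	JSON Schema object the model output must follow. Required when `type` is `json_schema`.
`strict`	boolean, null	No	Whether to enforce strict schema adherence when generating structured output.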
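
For structured output, `name` and `schema` become mandatory. The sketch below builds a `json_schema` format object and places it under `text.format`. The schema, the format name `polymer_summary`, and the helper `json_schema_format` are hypothetical examples, not values defined by the API.

```python
import json


def json_schema_format(name, schema, description=None, strict=None):
    """Build a `text.format` object of type `json_schema`."""
    fmt = {"type": "json_schema", "name": name, "schema": schema}
    if description is not None:
        fmt["description"] = description
    if strict is not None:
        fmt["strict"] = strict
    return fmt


# Hypothetical schema: a short definition plus a list of examples.
polymer_schema = {
    "type": "object",
    "properties": {
        "definition": {"type": "string"},
        "examples": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["definition", "examples"],
    "additionalProperties": False,
}

payload = {
    "model": "minimax/minimax-m2.5",
    "input": "Explain the concept of a polymer in simple terms.",
    "text": {
        "format": json_schema_format("polymer_summary", polymer_schema, strict=True),
        "verbosity": "low",
    },
}
print(json.dumps(payload, indent=2))
```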

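Include configuration

The `include` array controls which optional fields are returned in the response. Commonly used values include:

Value	Description
`file_search_call.results`	Includes file search results.
`message.output_text.logprobs`	Includes log probabilities for the output text.
`web_search_call.action.sources`	Includes the sources of web search tool calls.
`reasoning.encrypted_content`	Includes encrypted reasoning content, when supported.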

Request example

{
  "model": "minimax/minimax-m2.5",
  "input": "Explain the concept of a polymer in simple terms.",
  "instructions": "Answer clearly and concisely.",
  "stream": false,
  "max_output_tokens": 100,
  "temperature": 0.7,
  "top_p": 0.9,
  "text": {
    "format": {
      "type": "text"
    }
  },
  "parallel_tool_calls": true,
  "store": false,
  "truncation": "disabled"
}
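
The same request can be sent from Python with only the standard library. This is a sketch under assumptions: the URL is the one used in this page's curl example, the helper names `build_request` and `create_response` are hypothetical, and error handling is left out. Only `create_response` performs network I/O, and it is not called here.

```python
import json
import urllib.request

# URL taken from the curl example on this page.
API_URL = "https://api.luchentech.com/inference/v1/responses"


def build_request(payload, token, url=API_URL):
    """Prepare a POST request carrying the JSON payload and auth headers."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_response(payload, token, timeout=60):
    """Send the request and return the decoded JSON body (network I/O)."""
    with urllib.request.urlopen(build_request(payload, token), timeout=timeout) as resp:
        return json.load(resp)


payload = {
    "model": "minimax/minimax-m2.5",
    "input": "Explain the concept of a polymer in simple terms.",
    "instructions": "Answer clearly and concisely.",
    "stream": False,
    "max_output_tokens": 100,
}
request = build_request(payload, "<your_token_here>")
print(request.get_method(), request.full_url)
```

With a real token, `create_response(payload, token)` returns the response object as a Python dict.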

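Response

Successful response

Field	Type	Description
`id`	string	Unique identifier of the response.
`object`	string	Type of the returned object; `response` for this endpoint.
`created_at`	integer	Timestamp of when the response was created (seconds since the Unix epoch).
`status`	string	Status of the response, for example `completed`, `in_progress`, `failed`, or `incomplete`.
`completed_at`	integer, null	Timestamp of when the response completed (seconds since the Unix epoch), if available.
`model`	string	Name of the model used to generate the response.
`output`	array	Output items generated by the model, such as messages, reasoning items, and tool calls.
`output_text`	string	SDK convenience field containing the aggregated generated text when available. Not included in the raw REST response body.
`error`	object, null	Error details if the response failed.
`incomplete_details`	object, null	Details explaining why the response is incomplete, if applicable.
`instructions`	string, null	Instructions used for this response.
`max_output_tokens`	integer, null	Maximum number of output tokens configured for the response.
`max_tool_calls`	integer, null	Maximum number of tool calls configured for the response.
`parallel_tool_calls`	boolean	Whether parallel tool calls were enabled for the response.
`previous_response_id`	string, null	ID of the previous response used as context, if one was provided.
`reasoning`	object	Reasoning configuration used for the response.
`store`	boolean	Whether the response was stored.
`temperature`	number, null	Sampling temperature used during generation.
`text`	object	Text configuration used during generation.
`tool_choice`	string, object	Tool selection strategy used during generation.
`tools`	array	Tool definitions available to the model.
`top_p`	number, null	Nucleus sampling probability used during generation.
`truncation`	string, null	Truncation strategy used for the response.
`usage`	object	Token usage for the request and response.
`metadata`	object	Additional metadata associated with the response.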
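
Because `output_text` is absent from the raw REST body, a client that calls the endpoint directly has to collect the text from `output` itself. The sketch below does that, assuming the item layout shown in this page's sample response (`message` items whose `content` holds `output_text` parts). The helper `extract_output_text` is hypothetical, and the sample dict is trimmed to the fields the helper reads.

```python
def extract_output_text(response):
    """Concatenate every `output_text` part found in `message` output items."""
    parts = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip other item types such as reasoning items or tool calls
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)


# Trimmed copy of the sample response on this page.
sample = {
    "id": "resp_e63095aef9bc4d7292b769edb2cb6583",
    "status": "completed",
    "output": [
        {
            "type": "message",
            "role": "assistant",
            "content": [
                {
                    "type": "output_text",
                    "text": "A polymer is a large molecule made by linking many smaller repeating units together, like beads on a string.",
                    "annotations": [],
                }
            ],
        }
    ],
    "usage": {"input_tokens": 15, "output_tokens": 34, "total_tokens": 49},
}

if sample["status"] == "completed":
    print(extract_output_text(sample))
```

Checking `status` before reading the text avoids treating a `failed` or `incomplete` response as a finished answer.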

Create a Response

curl 'https://api.luchentech.com/inference/v1/responses' \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer <your_token_here>' \
  -d '{
    "model": "minimax/minimax-m2.5",
    "input": "Explain the concept of a polymer in simple terms.",
    "instructions": "Answer clearly and concisely.",
    "max_output_tokens": 100,
    "temperature": 0.7,
    "top_p": 0.9,
    "text": {
      "format": { "type": "text" }
    },
    "stream": false,
    "store": false,
    "truncation": "disabled"
  }'

{
  "id": "resp_e63095aef9bc4d7292b769edb2cb6583",
  "object": "response",
  "created_at": 1773651537,
  "status": "completed",
  "completed_at": 1773651538,
  "model": "minimax/minimax-m2.5",
  "output": [
    {
      "type": "message",
      "id": "msg_001",
      "status": "completed",
      "role": "assistant",
      "content": [
        {
          "type": "output_text",
          "text": "A polymer is a large molecule made by linking many smaller repeating units together, like beads on a string. Plastics, rubber, and DNA are all examples of polymers.",
          "annotations": []
        }
      ]
    }
  ],
  "error": null,
  "incomplete_details": null,
  "instructions": "Answer clearly and concisely.",
  "max_output_tokens": 100,
  "max_tool_calls": null,
  "parallel_tool_calls": true,
  "previous_response_id": null,
  "reasoning": {
    "effort": null,
    "summary": null
  },
  "store": false,
  "temperature": 0.7,
  "text": {
    "format": {
      "type": "text"
    }
  },
  "tool_choice": "auto",
  "tools": [],
  "top_p": 0.9,
  "truncation": "disabled",
  "usage": {
    "input_tokens": 15,
    "input_tokens_details": {
      "cached_tokens": 0
    },
    "output_tokens": 34,
    "output_tokens_details": {
      "reasoning_tokens": 0
    },
    "total_tokens": 49
  },
  "user": null,
  "metadata": {}
}
