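Tool-Call Design

This page provides: a complete JSON Schema example for the search_web tool, code for generating and checking idempotency keys, three error response formats (retryable, non-retryable, human intervention required), an implementation of tool timeout + retry + fallback, and a comparison of bad vs. good tool definitions.

Tools are the contract surface between the model and real systems: the schema is the type system, and the response body is the observation signal. Good design means the model guesses less, the server stays in control, and failures are explainable. When aligning with MCP or OpenAPI, keep a single source of truth so that hand-written copies do not drift apart.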

Anti-pattern: bad vs. good tool definitions

The two versions below show common mistakes and how to fix them:

// ❌ Bad tool definition
{
  "type": "function",
  "function": {
    "name": "doStuff",               // meaningless name; says nothing about what it does
    "description": "Process data",   // description gives the model nothing to work with
    "parameters": {
      "type": "object",
      "properties": {
        "data": { "type": "string" }, // unconstrained; the model cannot tell what to pass
        "mode": { "type": "string" }  // no enum; the model will guess
      }
      // missing required, missing additionalProperties: false
    }
  }
}

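// ✅ Good tool definition
{
  "type": "function",
  "function": {
    "name": "search_web",            // starts with a verb; clear meaning
    "description": "Retrieve live web information through a search engine. Read-only; modifies no data. Use for: getting news, looking up documentation, verifying facts. Do not use for: searching private internal documents.",
    "parameters": {
      "type": "object",
      "additionalProperties": false,  // reject extra fields
      "required": ["query"],          // state required fields explicitly
      "properties": {
        "query": {
          "type": "string",
          "minLength": 2,
          "maxLength": 300,
          "description": "Search keywords, as natural language or a specific phrase; do not concatenate URLs or SQL"
        },
        "num_results": {
          "type": "integer",
          "minimum": 1,
          "maximum": 10,
          "default": 5,
          "description": "Number of results to return; default 5"
        },
        "date_restrict": {
          "type": ["string", "null"],
          "enum": ["d1", "w1", "m1", "m3", "m6", "y1", null],
          "default": null,
          "description": "Time-range filter: d1 = past day, w1 = past week, m1/m3/m6 = past 1/3/6 months, y1 = past year, null = no limit"
        },
        "safe_search": {
          "type": "boolean",
          "default": true,
          "description": "Whether to enable safe-search filtering; enabled by default"
        }
      }
    }
  }
}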

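Keep the response shape the same for success and failure: always include status, plus either data or error.
Large results must be paginated, and field names must be fixed across the SKILL and the implementation (use next_cursor; do not mix in nextPage or page_token).
Never return internal stack traces or unredacted data to the model. Log a trace_id on the server, and put only a user-understandable explanation in the message returned to the model.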
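
The three rules above can be sketched as a small set of envelope helpers. This is a minimal sketch rather than a prescribed implementation: the helper names (ok_page, fail, fail_from_exception), the INTERNAL_ERROR code, the placement of next_cursor inside data, and the plain list used as a stand-in for a server log are all assumptions made for illustration.

```python
import uuid
from typing import Any, Optional


def ok_page(items: list, next_cursor: Optional[str] = None) -> dict:
    """Success envelope for one page; next_cursor is None on the last page."""
    return {"status": "success",
            "data": {"items": items, "next_cursor": next_cursor}}


def fail(code: str, message: str, retryable: bool = False, **extra: Any) -> dict:
    """Error envelope; shares the top-level `status` key with the success shape."""
    return {"status": "error",
            "error": {"code": code, "message": message,
                      "retryable": retryable, **extra}}


def fail_from_exception(exc: Exception, server_log: list) -> dict:
    """Keep exception detail on the server under a trace_id and return
    only a generic, user-readable message to the model."""
    trace_id = uuid.uuid4().hex
    # Stand-in for a real logger: detail stays server-side.
    server_log.append({"trace_id": trace_id, "detail": repr(exc)})
    return fail("INTERNAL_ERROR",
                "The tool failed unexpectedly. Please try again later.")
```

Because both shapes share the status key, a caller can branch on one field and never has to probe for the presence of data or error.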

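Tool contract and schema

The example below is close to a typical function-calling payload: name and description help the model choose a tool, and parameters is a JSON Schema (Draft 7 style). The implementation should validate incoming arguments against the same schema.

{
  "type": "function",
  "function": {
    "name": "ticket_search",
    "description": "Search tickets by criteria. Read-only; does not change ticket status.",
    "parameters": {
      "type": "object",
      "additionalProperties": false,
      "required": ["query"],
      "properties": {
        "query": {
          "type": "string",
          "minLength": 1,
          "maxLength": 200,
          "description": "Keywords or a fragment of a ticket number; do not concatenate SQL."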
        },
        "status": {
          "type": "string",
          "enum": ["open", "closed", "any"],
          "default": "open"
        },
        "page": {
          "type": "object",
          "additionalProperties": false,
          "properties": {
            "limit": { "type": "integer", "minimum": 1, "maximum": 50, "default": 20 },
            "cursor": { "type": ["string", "null"], "default": null }
          }
        }
      }
    }
  }
}
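
To illustrate validating arguments against the same schema on the server, here is a hand-rolled sketch that covers only the keywords used in the ticket_search schema above (type, enum, minLength/maxLength, minimum/maximum, required, properties, additionalProperties). It is a deliberately small subset, not a Draft 7 implementation; a production service would more likely use a full validator library. The names validate and TICKET_SEARCH_PARAMS are hypothetical.

```python
from typing import Any

# JSON Schema type names mapped to Python types (only the ones used on this page).
_TYPES = {"string": str, "integer": int, "boolean": bool,
          "object": dict, "null": type(None)}


def _type_ok(value: Any, expected) -> bool:
    names = expected if isinstance(expected, list) else [expected]
    for name in names:
        if name == "integer" and isinstance(value, bool):
            continue  # bool is an int subclass in Python, but not a JSON integer
        if isinstance(value, _TYPES[name]):
            return True
    return False


def validate(value: Any, schema: dict, path: str = "") -> list:
    """Return 'field: problem' strings; an empty list means the value is valid."""
    here = path or "<root>"
    if "type" in schema and not _type_ok(value, schema["type"]):
        return [f"{here}: expected type {schema['type']}"]
    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{here}: must be one of {schema['enum']}")
    if isinstance(value, str):
        if len(value) < schema.get("minLength", 0):
            errors.append(f"{here}: shorter than minLength")
        if len(value) > schema.get("maxLength", len(value)):
            errors.append(f"{here}: longer than maxLength")
    if isinstance(value, int) and not isinstance(value, bool):
        if value < schema.get("minimum", value):
            errors.append(f"{here}: below minimum")
        if value > schema.get("maximum", value):
            errors.append(f"{here}: above maximum")
    if isinstance(value, dict):
        props = schema.get("properties", {})
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path + '.' if path else ''}{name}: required")
        for name, item in value.items():
            child = f"{path}.{name}" if path else name
            if name in props:
                errors.extend(validate(item, props[name], child))
            elif schema.get("additionalProperties") is False:
                errors.append(f"{child}: unknown field")
    return errors


TICKET_SEARCH_PARAMS = {
    "type": "object", "additionalProperties": False, "required": ["query"],
    "properties": {
        "query": {"type": "string", "minLength": 1, "maxLength": 200},
        "status": {"type": "string", "enum": ["open", "closed", "any"]},
        "page": {"type": "object", "additionalProperties": False,
                 "properties": {
                     "limit": {"type": "integer", "minimum": 1, "maximum": 50},
                     "cursor": {"type": ["string", "null"]}}},
    },
}

bad_args = {"status": "urgent", "page": {"limit": 500}, "sql": "DROP TABLE"}
fields = [e.split(":")[0] for e in validate(bad_args, TICKET_SEARCH_PARAMS)]
```

The field paths collected in fields could populate the fields array of a validation error response. Defaults declared in the schema are not applied by this sketch; that would be a separate step.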

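Parameter style comparison

Flat parameters suit a small number of scalars; nested objects suit configuration with clear domain boundaries that needs room to grow. The first example below is the flat form and the second is the nested form of the same request.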

{
  "query": "login timeout",
  "status": "open",
  "limit": 20,
  "cursor": null
}

{
  "filter": {
    "text": "login timeout",
    "status": "open"
  },
  "page": {
    "limit": 20,
    "cursor": null
  }
}
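
The two request bodies above carry the same information. As a small illustration, the sketch below regroups the flat shape into the nested one. The function name flat_to_nested is hypothetical, and the fallback values ("open", 20, no cursor) are taken from the defaults in the ticket_search schema.

```python
def flat_to_nested(flat: dict) -> dict:
    """Regroup flat arguments into the nested filter/page shape."""
    return {
        "filter": {
            "text": flat["query"],                 # query is the only required field
            "status": flat.get("status", "open"),
        },
        "page": {
            "limit": flat.get("limit", 20),
            "cursor": flat.get("cursor"),          # None when absent
        },
    }
```

Grouping by domain (filter vs. page) means a later filter option is added in one place without crowding the top level.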

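Idempotency key design: generation and checking

Write tools must support idempotency keys so that network retries do not execute the operation twice: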

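import hashlib, json, uuid
from functools import wraps

# Client side (the caller acting for the model): generate the idempotency key
def generate_idempotency_key(tool_name: str, args: dict) -> str:
    """Derive a deterministic idempotency key from the tool name and arguments.
    The same operation yields the same key, so the server can deduplicate
    network retries."""
    # Canonical JSON (keys sorted at every level) keeps the key stable for
    # nested arguments and independent of dict insertion order.
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    content = f"{tool_name}:{canonical}"
    return "idem_" + hashlib.sha256(content.encode("utf-8")).hexdigest()[:32]

# Usage example: attach the key when calling create_ticket
key = generate_idempotency_key("create_ticket", {
    "title": "Fix login timeout",
    "priority": "high"
})
# key has the form "idem_" followed by 32 hex characters


# Server side: check and store idempotency keys
import redis.asyncio as redis  # async client, so lookups do not block the event loop

r = redis.Redis(host="localhost", port=6379, db=0)
IDEM_KEY_TTL = 86400  # 24 hours

def with_idempotency(handler):
    """Decorator: look up the idempotency key and return the cached result
    if the request was already handled."""
    @wraps(handler)
    async def wrapper(args: dict, **kwargs):
        idem_key = args.get("idempotency_key")
        if idem_key:
            # Already handled?
            cached = await r.get(f"idem:{idem_key}")
            if cached:
                return {"status": "success", "data": json.loads(cached),
                        "_idempotent": True}  # mark as a replayed result

        result = await handler(args, **kwargs)

        # Store the result only after success.
        # Note: get-then-set is not atomic, so two concurrent requests with
        # the same key can still both run the handler.
        if idem_key and result.get("status") == "success":
            await r.setex(f"idem:{idem_key}", IDEM_KEY_TTL,
                          json.dumps(result["data"]))
        return result
    return wrapper

@with_idempotency
async def create_ticket(args: dict) -> dict:
    # Actual ticket-creation logic goes here
    ticket_id = f"tkt_{uuid.uuid4().hex[:8]}"
    return {"status": "success", "data": {"ticket_id": ticket_id}}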
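
A decorator that checks for a cached result and stores it only after the handler finishes leaves a window in which two concurrent requests with the same key both execute. The sketch below shows one way to close that window inside a single process by holding a per-key lock around the check, the handler, and the store. It is an in-memory test double under stated assumptions: InMemoryIdempotencyStore, run_once, and the demo handler are hypothetical names, there is no TTL, and a multi-process deployment would need an atomic reservation in the shared store instead.

```python
import asyncio


class InMemoryIdempotencyStore:
    """Single-process stand-in for a shared store: one result per key."""

    def __init__(self):
        self._results = {}
        self._locks = {}

    async def run_once(self, key: str, handler, args: dict) -> dict:
        lock = self._locks.setdefault(key, asyncio.Lock())
        async with lock:  # concurrent duplicates wait here instead of running twice
            if key in self._results:
                return {"status": "success", "data": self._results[key],
                        "_idempotent": True}
            result = await handler(args)
            if result.get("status") == "success":
                self._results[key] = result["data"]  # failures are not cached
            return result


async def demo():
    calls = 0

    async def create_ticket(args: dict) -> dict:
        nonlocal calls
        calls += 1
        await asyncio.sleep(0.01)  # simulate I/O so the two requests overlap
        return {"status": "success", "data": {"ticket_id": f"tkt_{calls}"}}

    store = InMemoryIdempotencyStore()
    first, second = await asyncio.gather(
        store.run_once("idem_demo", create_ticket, {"title": "x"}),
        store.run_once("idem_demo", create_ticket, {"title": "x"}),
    )
    return calls, first, second
```

Running asyncio.run(demo()) executes the handler once; the second caller receives the stored data marked with _idempotent.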

Three error formats + timeout, retry, and fallback

Three standard error response formats. The agent decides its next step from retryable and human_review:

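# Type 1: not retryable (arguments must be corrected)
# The agent fixes the arguments and calls again; it must not retry indefinitely
{
  "status": "error",
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid priority value 'urgent'; allowed values: low/medium/high/critical",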
    "fields": ["priority"],
    "retryable": false,
    "human_review": false
  }
}

# Type 2: retryable (transient failure)
# The agent waits retry_after_ms and then retries, up to 3 times
{
  "status": "error",
  "error": {
    "code": "UPSTREAM_TIMEOUT",
    "message": "The database dependency timed out (5000 ms); please retry later",
    "retryable": true,
    "retry_after_ms": 3000,
    "max_retries": 3,
    "human_review": false
  }
}

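# Type 3: human intervention required (high-risk operation)
# The agent stops automatic execution and asks the user to confirm
{
  "status": "error",
  "error": {
    "code": "REQUIRES_HUMAN_APPROVAL",
    "message": "This operation will delete 2847 records from the production database and requires human confirmation",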
    "retryable": false,
    "human_review": true,
    "approval_url": "https://app.example.com/approvals/apr_xyz123"
  }
}
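
On the agent side, the three formats above reduce to a small decision function. The following is a sketch under assumptions: the function name next_action and the action labels (continue, retry, give_up, fix_arguments, ask_human, abort) are hypothetical, and attempt is taken to mean the number of retries already made.

```python
def next_action(response: dict, attempt: int = 0) -> dict:
    """Map a tool response to the agent's next step."""
    if response.get("status") == "success":
        return {"action": "continue"}

    error = response.get("error", {})
    if error.get("human_review"):
        # Stop automatic execution and hand over to a person.
        return {"action": "ask_human", "approval_url": error.get("approval_url")}

    if error.get("retryable"):
        if attempt < error.get("max_retries", 3):
            return {"action": "retry", "wait_ms": error.get("retry_after_ms", 0)}
        return {"action": "give_up"}  # retry budget exhausted

    if error.get("fields"):
        # Not retryable as-is: correct the listed arguments, then call again.
        return {"action": "fix_arguments", "fields": error["fields"]}

    return {"action": "abort"}
```

Checking human_review before retryable keeps a high-risk operation from being retried automatically even if a response sets both flags by mistake.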

Tool timeout + exponential-backoff retry + fallback implementation:

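import asyncio, logging

logger = logging.getLogger(__name__)

async def call_tool_with_retry(
    tool_fn,
    args: dict,
    max_retries: int = 3,
    base_delay_ms: int = 1000,
    timeout_ms: int = 5000,
    fallback_fn=None,
) -> dict:
    """
    Tool-call wrapper: timeout control + exponential-backoff retry + fallback.
    - timeout_ms: timeout for a single call
    - base_delay_ms: wait before the first retry (doubles on each retry)
    - fallback_fn: degraded-mode function used after all retries fail
    Retrying a write tool is only safe if the call carries an idempotency key.
    """
    last_error = None
    for attempt in range(max_retries + 1):
        retry_after_ms = 0
        try:
            result = await asyncio.wait_for(
                tool_fn(args),
                timeout=timeout_ms / 1000
            )
            if result.get("status") == "success":
                return result

            error = result.get("error", {})
            if not error.get("retryable", False):
                return result  # non-retryable error: return immediately

            last_error = result
            retry_after_ms = error.get("retry_after_ms") or 0  # server hint
        except asyncio.TimeoutError:
            last_error = {
                "status": "error",
                "error": {
                    "code": "CLIENT_TIMEOUT",
                    "message": f"Call timed out ({timeout_ms} ms)",
                    "retryable": True,
                    "attempt": attempt + 1,
                }
            }

        if attempt < max_retries:
            backoff_ms = base_delay_ms * (2 ** attempt)  # exponential backoff
            # Wait at least as long as the server asked for.
            await asyncio.sleep(max(backoff_ms, retry_after_ms) / 1000)

    # All retries failed: try the fallback
    if fallback_fn:
        try:
            return await fallback_fn(args)
        except Exception:
            # Log instead of silently swallowing, then return the last tool error.
            logger.exception("fallback failed")

    return last_error or {"status": "error",
                          "error": {"code": "MAX_RETRIES_EXCEEDED",
                                    "message": "Tool call failed after all retries",
                                    "retryable": False}}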

 Model issues tool_calls
           │
           ▼
 ┌───────────────────┐
 │ Routing / auth    │─── unauthorized ───► 401 + retryable:false ───► no retry, abort
 └─────────┬─────────┘
           │ pass
           ▼
 ┌───────────────────┐
 │ Schema validation │─── invalid args ───► 400 + fields[] ───► model fixes args and calls again
 └─────────┬─────────┘
           │ valid
           ▼
 ┌───────────────────┐
 │ Run dependencies  │─── timeout/5xx ────► retryable:true + retry_after_ms ───► exponential-backoff retry
 └─────────┬─────────┘─── high-risk op ───► human_review:true ───► pause for human approval
           │ success
           ▼
  {status:"success", data:{...}} ──► next round of model reasoning
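
The flow in the diagram can be sketched as one server-side handler with the stages passed in as callables. This is an illustration under assumptions, not the page's implementation: handle_tool_call, the stage parameter names, the UNAUTHORIZED code, and the shape of call (a dict with name and arguments) are hypothetical; the mapping to HTTP status codes is left out; and the approval gate is placed before execution so that nothing runs until a person confirms.

```python
import asyncio


def _error(code: str, message: str, **extra) -> dict:
    return {"status": "error", "error": {"code": code, "message": message, **extra}}


async def handle_tool_call(call: dict, *, is_authorized, validate,
                           needs_approval, execute) -> dict:
    """Run one tool call through auth, validation, approval gate, and execution."""
    if not is_authorized(call):
        return _error("UNAUTHORIZED", "Caller may not use this tool.",
                      retryable=False, human_review=False)

    bad_fields = validate(call["arguments"])  # list of invalid field names
    if bad_fields:
        return _error("VALIDATION_ERROR", "Some arguments are invalid.",
                      fields=bad_fields, retryable=False, human_review=False)

    if needs_approval(call):
        return _error("REQUIRES_HUMAN_APPROVAL",
                      "This operation needs human confirmation.",
                      retryable=False, human_review=True)

    try:
        data = await execute(call["arguments"])
    except (TimeoutError, asyncio.TimeoutError):
        # Transient failure: tell the caller it may retry after a delay.
        return _error("UPSTREAM_TIMEOUT", "A dependency timed out; please retry later.",
                      retryable=True, retry_after_ms=3000, human_review=False)

    return {"status": "success", "data": data}
```

Passing the stages in as parameters keeps the ordering testable with stubs, without a real auth service or database.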

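Governance and evaluation

Use golden call sequences (snapshots of the expected tool names and arguments) for regression testing, and fuzz boundary values and adversarial strings. When a schema changes, update the constraint notes in the SKILL at the same time and record breaking changes.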
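
A golden-sequence regression check can be as small as comparing the recorded calls step by step. The sketch below assumes each call is stored as a dict with name and arguments keys; that representation and the function name diff_calls are illustrative choices, not something this page prescribes.

```python
def diff_calls(golden: list, actual: list) -> list:
    """Compare an actual tool-call sequence with the golden snapshot.
    Returns human-readable differences; an empty list means a match."""
    problems = []
    for step, (want, got) in enumerate(zip(golden, actual)):
        if want["name"] != got["name"]:
            problems.append(
                f"step {step}: expected tool {want['name']}, got {got['name']}")
        elif want["arguments"] != got["arguments"]:
            problems.append(f"step {step}: arguments differ for {want['name']}")
    if len(golden) != len(actual):
        problems.append(f"expected {len(golden)} calls, got {len(actual)}")
    return problems
```

In a test suite, an assertion that diff_calls(golden, actual) is empty fails with a message that names the first diverging step, which is easier to act on than a raw dict comparison.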

SKILL review snippet

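---
name: tool-use-schema-review
description: "Review or design model-facing tool JSON Schemas. Input: a draft tool definition. Output: the corrected definition plus idempotency-key code. Forbidden: letting the model pass unvalidated SQL or shell commands."
version: "1.0.0"
triggers:
  - "design.*tool.*schema|tool.*JSON Schema"
  - "agent.*(call|invoke).*tool|function calling"
steps:
  1. Check the tool name: starts with a verb (search_/get_/create_/delete_) and has a single meaning
  2. The description must state: read-only or write, when to use it, and when not to use it
  3. Every string parameter must have minLength/maxLength constraints
  4. Express enumerated values with the enum keyword, not in the description text
  5. Set additionalProperties: false so the model cannot pass unknown fields
  6. Write tools must have an idempotency_key parameter (minLength: 16)
  7. High-risk tools (delete_/bulk_) must have an environment parameter (staging/production)
  8. Validate again on the server (pydantic or explicit checks; avoid bare assert, which python -O strips); do not rely on the model to constrain itself
  9. Implement call_tool_with_retry (5 s timeout, exponential backoff, at most 3 retries)
  10. Error responses must include the three fields code/message/retryable
  11. Add retry_after_ms to retryable errors; add human_review: true to errors that need human intervention
  12. Never expose internal stack traces in the error message; log a trace_id for server-side debugging
  13. For large results (over 10 KB), return a resource_url instead of inlining them in the response body
  14. Require plan-before-invoke in the SKILL; forbid chained guessing of entity ids
  15. In CI, validate tool schemas as JSON Schema Draft 7
constraints:
  - Tools must never directly execute strings passed in by the model (prevents SQL/command injection)
  - Never expose internal paths, stack traces, or secret-like strings in the response body
  - Any change to the tool registry must be propagated to every SKILL that references the tool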
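
Several of the review steps above are mechanical enough to automate. The sketch below covers steps 1, 3, 5, and 6 plus the missing-required check from the anti-pattern example; it is a partial linter under assumptions, not a complete review. The name lint_tool, the issue wording, and the choice to treat create_/delete_ prefixes as write tools are hypothetical.

```python
import re

# Verb prefixes taken from step 1 of the review list; extend as needed.
VERB_NAME = re.compile(r"^(search|get|create|delete)_[a-z0-9_]+$")
WRITE_PREFIXES = ("create_", "delete_")


def lint_tool(tool: dict) -> list:
    """Return a list of review findings for one function-style tool definition."""
    fn = tool.get("function", {})
    name = fn.get("name", "")
    params = fn.get("parameters", {})
    props = params.get("properties", {})
    issues = []

    if not VERB_NAME.match(name):
        issues.append("name: should be snake_case and start with a verb")
    if params.get("additionalProperties") is not False:
        issues.append("parameters: set additionalProperties to false")
    if "required" not in params:
        issues.append("parameters: declare required fields")

    for key, spec in props.items():
        types = spec.get("type")
        types = types if isinstance(types, list) else [types]
        bounded = all(k in spec for k in ("minLength", "maxLength"))
        # An enum already pins the allowed strings, so length bounds are moot.
        if "string" in types and "enum" not in spec and not bounded:
            issues.append(f"{key}: string needs minLength/maxLength or enum")

    if name.startswith(WRITE_PREFIXES) and "idempotency_key" not in props:
        issues.append("idempotency_key: required for write tools")
    return issues
```

Run against the bad definition near the top of this page, a check like this would flag the name, the missing additionalProperties and required keywords, and both unconstrained strings; the search_web definition passes.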
