Tool-use design

This page provides: a complete JSON Schema example for the search_web tool, idempotency key generation and checking code, three error return formats (retryable / non-retryable / requires human review), tool timeout + retry + fallback implementation, and a bad vs good tool definition comparison.

Tools are the contract surface between models and real systems: schema is the type system; return bodies are observation signals. Good design reduces guessing, keeps servers in control, and explains failures; when aligning with MCP / OpenAPI, keep a single source of truth to avoid handwritten drift.

Anti-pattern: bad vs good tool definitions

The two versions below contrast common mistakes with their fixes:

// ❌ Bad tool definition
{
  "type": "function",
  "function": {
    "name": "doStuff",               // meaningless name; unclear what it does
    "description": "Process data",   // description provides no value to the model
    "parameters": {
      "type": "object",
      "properties": {
        "data": { "type": "string" }, // no constraints; model doesn't know what to pass
        "mode": { "type": "string" }  // no enum; model will guess
      }
      // missing required; missing additionalProperties: false
    }
  }
}

// ✅ Good tool definition
{
  "type": "function",
  "function": {
    "name": "search_web",            // verb-first, semantically clear
    "description": "Search the internet for real-time information using a search engine. Read-only; does not modify any data. Use for: news, docs, fact verification. Do NOT use for: private internal document retrieval.",
    "parameters": {
      "type": "object",
      "additionalProperties": false,  // forbid extra fields
      "required": ["query"],          // explicitly declare required fields
      "properties": {
        "query": {
          "type": "string",
          "minLength": 2,
          "maxLength": 300,
          "description": "Search keywords, natural language or specific phrases; do not concatenate URLs or SQL"
        },
        "num_results": {
          "type": "integer",
          "minimum": 1,
          "maximum": 10,
          "default": 5,
          "description": "Number of results to return; default 5"
        },
        "date_restrict": {
          "type": ["string", "null"],
          "enum": ["d1", "w1", "m1", "m3", "m6", "y1", null],
          "default": null,
          "description": "Time range filter: d1=last 1 day, w1=last 1 week, m1/m3/m6=last 1/3/6 months, y1=last 1 year, null=no limit"
        },
        "safe_search": {
          "type": "boolean",
          "default": true,
          "description": "Whether to enable safe search filtering; enabled by default"
        }
      }
    }
  }
}

Keep success and failure response shapes consistent: always include status, plus data on success or error on failure.
Large results must paginate; field names stay fixed across SKILL and implementation (use next_cursor, never mix nextPage or page_token).
Do not return raw stacks or unsanitized internals to the model; log trace ids server-side; the message returned to the model should only contain user-understandable descriptions.
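
The three rules above can be sketched as a small set of envelope helpers. This is a minimal illustration, not a reference implementation: the names ok, fail and paginate are hypothetical, placing next_cursor at the top level of the envelope is an assumption, and the cursor is assumed to be a plain offset encoded as a string.

```python
import logging
import uuid
from typing import Any, Optional

logger = logging.getLogger("tools")

def ok(data: Any, next_cursor: Optional[str] = None) -> dict:
    """Success envelope: status + data; next_cursor is None on the last page."""
    return {"status": "success", "data": data, "next_cursor": next_cursor}

def fail(code: str, message: str, retryable: bool,
         exc: Optional[BaseException] = None) -> dict:
    """Failure envelope: the stack trace goes to the server log, not to the model."""
    trace_id = uuid.uuid4().hex  # kept server-side for correlation
    logger.error("tool failure trace_id=%s code=%s", trace_id, code, exc_info=exc)
    return {"status": "error",
            "error": {"code": code, "message": message, "retryable": retryable}}

def paginate(items: list, limit: int, cursor: Optional[str]) -> dict:
    """Return one page; the cursor is an offset string (an assumption of this sketch)."""
    start = int(cursor) if cursor else 0
    page = items[start:start + limit]
    end = start + len(page)
    return ok(page, next_cursor=str(end) if end < len(items) else None)
```

Because both helpers always emit status, the caller can branch on one field and never has to guess which shape it received.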

Tool contract and schema

The sample below mirrors common "function call" payloads: top-level name / description for model selection, parameters as JSON Schema (Draft-7 style). Validate on the server with the same schema.

{
  "type": "function",
  "function": {
    "name": "ticket_search",
    "description": "Search tickets by criteria; read-only; does not change state.",
    "parameters": {
      "type": "object",
      "additionalProperties": false,
      "required": ["query"],
      "properties": {
        "query": {
          "type": "string",
          "minLength": 1,
          "maxLength": 200,
          "description": "Keyword or ticket-number fragment; do not concatenate SQL."
        },
        "status": {
          "type": "string",
          "enum": ["open", "closed", "any"],
          "default": "open"
        },
        "page": {
          "type": "object",
          "additionalProperties": false,
          "properties": {
            "limit": { "type": "integer", "minimum": 1, "maximum": 50, "default": 20 },
            "cursor": { "type": ["string", "null"], "default": null }
          }
        }
      }
    }
  }
}
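
To illustrate "validate on the server with the same schema", here is a rough standard-library sketch that checks arguments against the parameters object. It covers only the keywords used on this page (type, enum, minLength, maxLength, minimum, maximum, required, properties, additionalProperties) and raises KeyError for types outside that subset; a real deployment would more likely use a full JSON Schema validator. The function name validate and the error message wording are assumptions.

```python
from typing import Any, List

_TYPES = {"string": str, "integer": int, "boolean": bool,
          "object": dict, "null": type(None)}

def _type_ok(value: Any, type_spec) -> bool:
    names = type_spec if isinstance(type_spec, list) else [type_spec]
    for name in names:
        if name == "integer" and isinstance(value, bool):
            continue  # bool is a subclass of int in Python; JSON Schema keeps them apart
        if isinstance(value, _TYPES[name]):
            return True
    return False

def _join(path: str, key: str) -> str:
    return f"{path}.{key}" if path else key

def validate(value: Any, schema: dict, path: str = "") -> List[str]:
    """Return a list of human-readable errors; an empty list means valid."""
    where = path or "<root>"
    if "type" in schema and not _type_ok(value, schema["type"]):
        return [f"{where}: expected type {schema['type']}"]
    errors = []
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{where}: must be one of {schema['enum']}")
    if isinstance(value, str):
        if len(value) < schema.get("minLength", 0):
            errors.append(f"{where}: shorter than minLength {schema['minLength']}")
        if "maxLength" in schema and len(value) > schema["maxLength"]:
            errors.append(f"{where}: longer than maxLength {schema['maxLength']}")
    if isinstance(value, int) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{where}: must be >= {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{where}: must be <= {schema['maximum']}")
    if isinstance(value, dict):
        props = schema.get("properties", {})
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{_join(path, name)}: required field missing")
        for key, item in value.items():
            if key in props:
                errors.extend(validate(item, props[key], _join(path, key)))
            elif schema.get("additionalProperties") is False:
                errors.append(f"{_join(path, key)}: unknown field")
    return errors

# The ticket_search parameters from the sample above, abbreviated.
TICKET_SEARCH_PARAMS = {
    "type": "object", "additionalProperties": False, "required": ["query"],
    "properties": {
        "query": {"type": "string", "minLength": 1, "maxLength": 200},
        "status": {"type": "string", "enum": ["open", "closed", "any"]},
        "page": {"type": "object", "additionalProperties": False, "properties": {
            "limit": {"type": "integer", "minimum": 1, "maximum": 50},
            "cursor": {"type": ["string", "null"]}}},
    },
}
```

The returned paths (for example page.limit) map directly onto the fields array of a validation error, so the model can see which argument to correct.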

Parameter style comparison

Flat parameters suit a few scalars; nested objects suit clear domain boundaries and extensible config. The first example below is flat; the second groups the same fields into nested filter and page objects.

{
  "query": "login timeout",
  "status": "open",
  "limit": 20,
  "cursor": null
}

{
  "filter": {
    "text": "login timeout",
    "status": "open"
  },
  "page": {
    "limit": 20,
    "cursor": null
  }
}
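
If both styles have to coexist, for instance during a migration, a thin adapter can translate between them. The sketch below is only illustrative: to_nested and to_flat are hypothetical helpers, and the defaults (status "open", limit 20) are borrowed from the ticket_search schema earlier on this page.

```python
def to_nested(flat: dict) -> dict:
    """Group flat arguments into the nested filter/page layout."""
    return {
        "filter": {"text": flat["query"], "status": flat.get("status", "open")},
        "page": {"limit": flat.get("limit", 20), "cursor": flat.get("cursor")},
    }

def to_flat(nested: dict) -> dict:
    """Inverse mapping: flatten the nested layout back into scalars."""
    page = nested.get("page", {})
    return {
        "query": nested["filter"]["text"],
        "status": nested["filter"].get("status", "open"),
        "limit": page.get("limit", 20),
        "cursor": page.get("cursor"),
    }

flat_example = {"query": "login timeout", "status": "open", "limit": 20, "cursor": None}
nested_example = to_nested(flat_example)
```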

Idempotency key design: generation and checking

Write-operation tools must support idempotency keys to prevent duplicate execution from network retries:

import hashlib, json, uuid
from functools import wraps

# Client side (model): generate idempotency key
def generate_idempotency_key(tool_name: str, args: dict) -> str:
    """Generate a deterministic idempotency key from tool name and args.
    Same operation produces the same key; used for server-side deduplication on retries."""
    # Canonical JSON: sort_keys also orders nested objects, which sorted(args.items()) does not
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    content = f"{tool_name}:{canonical}"
    return "idem_" + hashlib.sha256(content.encode("utf-8")).hexdigest()[:32]

# Usage: attach idempotency key when calling create_ticket
key = generate_idempotency_key("create_ticket", {
    "title": "Fix login timeout",
    "priority": "high"
})
# key is "idem_" followed by 32 hex characters


# Server side: check and store idempotency key
import redis.asyncio as redis  # async client, so Redis calls do not block the event loop

r = redis.Redis(host="localhost", port=6379, db=0)
IDEM_KEY_TTL = 86400  # 24 hours

def with_idempotency(handler):
    """Decorator: check idempotency key; return cached result if already processed."""
    @wraps(handler)
    async def wrapper(args: dict, **kwargs):
        idem_key = args.get("idempotency_key")
        if idem_key:
            # Check if already processed
            cached = await r.get(f"idem:{idem_key}")
            if cached is not None:
                return {"status": "success", "data": json.loads(cached),
                        "_idempotent": True}  # flag as idempotent response

        result = await handler(args, **kwargs)

        # Store result on success only, so failed attempts can be retried
        if idem_key and result.get("status") == "success":
            await r.setex(f"idem:{idem_key}", IDEM_KEY_TTL,
                          json.dumps(result["data"]))
        return result
    return wrapper

@with_idempotency
async def create_ticket(args: dict) -> dict:
    # Actual ticket creation logic
    ticket_id = f"tkt_{uuid.uuid4().hex[:8]}"
    return {"status": "success", "data": {"ticket_id": ticket_id}}
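
One gap worth noting in a check-then-store decorator like the one above: two concurrent requests carrying the same key can both miss the cache and both run the handler. A rough way to close that gap is to serialize work per key. The sketch below uses an in-memory store and asyncio locks purely for illustration; InMemoryIdempotencyStore and run_once are hypothetical names, and a multi-process deployment would need an atomic reservation in the shared store instead (with Redis, commonly SET with the NX option).

```python
import asyncio
from collections import defaultdict

class InMemoryIdempotencyStore:
    """Single-process sketch: one lock per idempotency key."""

    def __init__(self):
        self._results = {}                       # key -> cached success result
        self._locks = defaultdict(asyncio.Lock)  # locks are created lazily per key

    async def run_once(self, key: str, handler, args: dict) -> dict:
        async with self._locks[key]:
            if key in self._results:
                # A previous call already succeeded; replay its result.
                return {**self._results[key], "_idempotent": True}
            result = await handler(args)
            if result.get("status") == "success":
                self._results[key] = result      # failures stay retryable
            return result
```

The second caller waits on the lock, then finds the cached result and returns it without touching the handler.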

Three error formats + timeout retry fallback

Three standard error return formats; the agent uses retryable and human_review to decide the next action:

# Type 1: Non-retryable (requires parameter correction)
# Agent must correct parameters before retrying; do NOT retry indefinitely
{
  "status": "error",
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid 'priority' value 'urgent'; allowed values: low/medium/high/critical",
    "fields": ["priority"],
    "retryable": false,
    "human_review": false
  }
}

# Type 2: Retryable (transient failure)
# Agent waits retry_after_ms then retries; max 3 retries
{
  "status": "error",
  "error": {
    "code": "UPSTREAM_TIMEOUT",
    "message": "Downstream database timed out (5000ms); please retry shortly",
    "retryable": true,
    "retry_after_ms": 3000,
    "max_retries": 3,
    "human_review": false
  }
}

# Type 3: Requires human review (high-risk operation)
# Agent stops auto-execution and notifies user for confirmation
{
  "status": "error",
  "error": {
    "code": "REQUIRES_HUMAN_APPROVAL",
    "message": "This operation will delete 2847 production database records and requires human confirmation",
    "retryable": false,
    "human_review": true,
    "approval_url": "https://app.example.com/approvals/apr_xyz123"
  }
}
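
On the agent side, the decision driven by retryable and human_review can be sketched as a small dispatch function. The action names ("continue", "retry", "fix_params", "await_human", "abort") and the function next_action are assumptions made for this illustration; only the error fields come from the formats above.

```python
def next_action(response: dict, attempts_so_far: int = 0) -> dict:
    """Map a tool response to the agent's next step."""
    if response.get("status") == "success":
        return {"action": "continue"}
    error = response.get("error", {})
    if error.get("human_review"):
        # Type 3: stop auto-execution and hand off to a person
        return {"action": "await_human", "approval_url": error.get("approval_url")}
    if error.get("retryable"):
        # Type 2: retry after the suggested wait, up to the advertised limit
        if attempts_so_far >= error.get("max_retries", 3):
            return {"action": "abort"}
        return {"action": "retry", "wait_ms": error.get("retry_after_ms", 0)}
    if error.get("fields"):
        # Type 1: the listed parameters must be corrected before calling again
        return {"action": "fix_params", "fields": error["fields"]}
    return {"action": "abort"}
```

Checking human_review first means a high-risk operation is never retried automatically, even if a response mistakenly sets both flags.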

Tool timeout + exponential backoff retry + fallback implementation:

import asyncio

async def call_tool_with_retry(
    tool_fn,
    args: dict,
    max_retries: int = 3,
    base_delay_ms: int = 1000,
    timeout_ms: int = 5000,
    fallback_fn=None,
) -> dict:
    """
    Tool call wrapper: timeout control + exponential backoff retry + fallback.
    - timeout_ms: per-call timeout
    - base_delay_ms: initial retry wait (doubles each attempt)
    - fallback_fn: degradation function called after all retries fail
    """
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            result = await asyncio.wait_for(
                tool_fn(args),
                timeout=timeout_ms / 1000
            )
            if result.get("status") == "success":
                return result

            error = result.get("error", {})
            if not error.get("retryable", False):
                return result  # non-retryable error, return immediately

            last_error = result
        except asyncio.TimeoutError:
            last_error = {
                "status": "error",
                "error": {
                    "code": "CLIENT_TIMEOUT",
                    "message": f"Call timed out ({timeout_ms}ms)",
                    "retryable": True,
                    "attempt": attempt + 1,
                }
            }

        if attempt < max_retries:
            delay = (base_delay_ms * (2 ** attempt)) / 1000  # exponential backoff
            await asyncio.sleep(delay)

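    # All retries failed; try fallback
    if fallback_fn:
        try:
            return await fallback_fn(args)
        except Exception:
            # Fallback failed too; fall through and report the last tool error
            pass

    return last_error or {"status": "error",
                          "error": {"code": "MAX_RETRIES_EXCEEDED",
                                    "message": "Tool call failed after all retries",
                                    "retryable": False}}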

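The wrapper above doubles a fixed base delay and does not look at the server's retry_after_ms hint. One possible refinement, sketched here under stated assumptions, is a helper that computes the wait for each attempt: next_delay_ms, the cap_ms ceiling and the optional jitter are all additions for illustration and are not part of the original wrapper.

```python
import random
from typing import Callable, Optional

def next_delay_ms(
    attempt: int,
    base_delay_ms: int = 1000,
    retry_after_ms: Optional[int] = None,
    cap_ms: int = 30000,
    jitter: float = 0.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Delay before retry number `attempt` (0-based), in milliseconds."""
    backoff = min(base_delay_ms * (2 ** attempt), cap_ms)  # exponential, capped
    delay = max(backoff, retry_after_ms or 0)              # never undercut the server hint
    return delay + delay * jitter * rng()                  # optional random spread

# Schedule for the default settings: one value per retry
schedule = [next_delay_ms(a) for a in range(3)]
```

With jitter left at 0.0 the schedule is deterministic, which keeps tests stable; a small non-zero jitter spreads out clients that fail at the same moment.
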
 Model issues tool_calls
        │
        ▼
 ┌──────────────┐
 │ Route / auth │─── Unauthorized ───► 401 + retryable:false ───► no retry, abort
 └──────┬───────┘
        │ OK
        ▼
 ┌──────────────┐
 │ Schema check │─── Bad args ───► 400 + fields[] ───► model corrects params and retries
 └──────┬───────┘
        │ valid
        ▼
 ┌──────────────┐
 │ Downstream   │─── Timeout/5xx ─► retryable:true + retry_after ───► exponential backoff
 └──────┬───────┘─── High-risk op ► human_review:true ───► pause, await human confirmation
        │ success
        ▼
  {status:"success", data:{...}} ──► next model turn
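
The stages in the diagram can be sketched as a single server-side dispatcher. Everything named here is hypothetical: the registry layout (scope, validate, handler, high_risk), the http_status field, and the approved_by_human flag, which this sketch assumes is set by the server after a person confirms and is never taken from model-supplied arguments. Unknown tools are treated the same as unauthorized ones, and the high-risk gate runs before the downstream call.

```python
from typing import Dict

def _error(code: str, message: str, retryable: bool, **extra) -> dict:
    return {"status": "error",
            "error": {"code": code, "message": message, "retryable": retryable, **extra}}

def handle_tool_call(name: str, args: dict, *, registry: Dict[str, dict],
                     caller_scopes: set, approved_by_human: bool = False) -> dict:
    tool = registry.get(name)
    # Stage 1: route / auth
    if tool is None or tool["scope"] not in caller_scopes:
        return _error("UNAUTHORIZED", "Caller may not use this tool", False,
                      http_status=401)
    # Stage 2: schema check; validate returns the list of offending fields
    bad_fields = tool["validate"](args)
    if bad_fields:
        return _error("VALIDATION_ERROR", "Invalid arguments", False,
                      http_status=400, fields=bad_fields)
    # High-risk gate: pause until a human has confirmed
    if tool.get("high_risk") and not approved_by_human:
        return _error("REQUIRES_HUMAN_APPROVAL",
                      "This operation requires human confirmation", False,
                      human_review=True)
    # Stage 3: downstream call
    try:
        data = tool["handler"](args)
    except TimeoutError:
        return _error("UPSTREAM_TIMEOUT", "Downstream timed out; please retry shortly",
                      True, retry_after_ms=3000)
    return {"status": "success", "data": data}
```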

Governance and evaluation

Use golden call sequences (expected tool names and parameter snapshots) for regression; fuzz edge cases and adversarial strings. When schema changes, update SKILL constraints and record breaking changes.
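
A golden-sequence regression check can be as simple as comparing the recorded calls against a stored snapshot. The sketch below assumes each call is recorded as a dict with name and args keys; that record shape and the function name diff_call_sequence are assumptions for illustration.

```python
from typing import List

def diff_call_sequence(golden: List[dict], actual: List[dict]) -> List[str]:
    """Return a list of mismatches; an empty list means the run matches the snapshot."""
    problems = []
    for i, (want, got) in enumerate(zip(golden, actual)):
        if want["name"] != got["name"]:
            problems.append(f"step {i}: expected tool {want['name']}, got {got['name']}")
        elif want["args"] != got["args"]:
            problems.append(f"step {i}: args differ for {want['name']}")
    if len(golden) != len(actual):
        problems.append(f"expected {len(golden)} calls, got {len(actual)}")
    return problems

GOLDEN = [
    {"name": "ticket_search", "args": {"query": "login timeout", "status": "open"}},
    {"name": "create_ticket", "args": {"title": "Fix login timeout", "priority": "high"}},
]
```

Exact argument equality is strict on purpose; for free-text parameters a looser comparison may be more practical, at the cost of missing some regressions.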

SKILL review snippet

---
name: tool-use-schema-review
description: Review or design model-facing tool JSON Schema; input: tool definition draft; output: corrected definition + idempotency key code; prohibit: allowing models to pass unvalidated SQL or shell commands
version: "1.0.0"
triggers:
  - "design.*tool.*schema|tool.*JSON Schema"
  - "agent.*tool.*call|function calling"
steps:
  1. Check tool name: verb-first (search_/get_/create_/delete_), single semantic purpose
  2. description must state: read/write operation, when to use, when NOT to use
  3. All string params must have minLength/maxLength constraints
  4. Use enum field for enumerated values, not description text
  5. Set additionalProperties: false to prevent model from passing unknown fields
  6. Write-operation tools must have idempotency_key param (minLength: 16)
  7. High-risk tools (delete_/bulk_) must have environment param (staging/production)
  8. Implement server-side re-validation (assert or pydantic); do not rely on model self-constraint
  9. Implement call_tool_with_retry (timeout 5s, exponential backoff, max 3 retries)
  10. Error return must include three fields: code, message, retryable
  11. Retryable errors add retry_after_ms; human review errors add human_review: true
  12. Do NOT expose internal stack traces in error message; log trace_id server-side
  13. Large results (>10KB) return resource_url; do not inline in response body
  14. SKILL must mandate "plan before invoke"; forbid chained entity-id guessing
  15. Validate tool schema against JSON Schema Draft 7 format in CI
constraints:
  - Do NOT execute model-provided strings directly (prevent SQL/command injection)
  - Do NOT expose internal paths, stack traces, or key-pattern strings in responses
  - Tool registry changes must update all SKILLs that reference the changed tool
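
Several of the review steps above are mechanical enough to run as a lint in CI. The sketch below covers only steps 1, 3, 5 and 6 and inspects top-level properties only. lint_tool is a hypothetical helper, treating create_ and delete_ as the write prefixes is an assumption, and so is exempting enum-constrained strings from the length check.

```python
from typing import List

VERB_PREFIXES = ("search_", "get_", "create_", "delete_")
WRITE_PREFIXES = ("create_", "delete_")  # assumption: which prefixes imply a write

def lint_tool(tool: dict) -> List[str]:
    """Check a function-call tool definition against a subset of the review steps."""
    fn = tool["function"]
    params = fn.get("parameters", {})
    props = params.get("properties", {})
    findings = []
    if not fn["name"].startswith(VERB_PREFIXES):
        findings.append(f"step 1: name '{fn['name']}' is not verb-first")
    for pname, spec in props.items():
        types = spec.get("type")
        types = types if isinstance(types, list) else [types]
        bounded = "minLength" in spec and "maxLength" in spec
        if "string" in types and "enum" not in spec and not bounded:
            findings.append(f"step 3: '{pname}' lacks minLength/maxLength")
    if params.get("additionalProperties") is not False:
        findings.append("step 5: additionalProperties must be false")
    if fn["name"].startswith(WRITE_PREFIXES):
        key_spec = props.get("idempotency_key", {})
        if key_spec.get("minLength", 0) < 16:
            findings.append("step 6: write tool needs idempotency_key (minLength: 16)")
    return findings
```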
