Tool-use design
This page provides: a complete JSON Schema example for the search_web tool, idempotency key generation and checking code, three error return formats (retryable / non-retryable / requires human review), tool timeout + retry + fallback implementation, and a bad vs good tool definition comparison.
Tools are the contract surface between models and real systems: schema is the type system; return bodies are observation signals. Good design reduces guessing, keeps servers in control, and explains failures; when aligning with MCP / OpenAPI, keep a single source of truth to avoid handwritten drift.
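To make "schema is the type system" concrete, here is a minimal sketch of a server-side argument check. It covers only a small subset of JSON Schema keywords (type, enum, minLength/maxLength, minimum/maximum, required, additionalProperties) and is not a full Draft-7 validator. The names validate_args and SEARCH_WEB_PARAMS are hypothetical, introduced only for this illustration.

```python
from typing import Any, List

# JSON Schema type names mapped to Python types (only what this sketch supports).
_PY_TYPES = {
    "string": str, "integer": int, "number": (int, float),
    "boolean": bool, "object": dict, "array": list, "null": type(None),
}


def _type_ok(value: Any, type_name: str) -> bool:
    # bool is a subclass of int in Python, so reject it for numeric types.
    if type_name in ("integer", "number") and isinstance(value, bool):
        return False
    return isinstance(value, _PY_TYPES[type_name])


def validate_args(schema: dict, value: Any, path: str = "args") -> List[str]:
    """Return a list of violations; an empty list means the value is valid."""
    errors: List[str] = []
    declared = schema.get("type")
    if declared is not None:
        names = declared if isinstance(declared, list) else [declared]
        if not any(_type_ok(value, n) for n in names):
            return [f"{path}: expected type {' or '.join(names)}"]
    if "enum" in schema and value not in schema["enum"]:
        errors.append(f"{path}: must be one of {schema['enum']}")
    if isinstance(value, str):
        if "minLength" in schema and len(value) < schema["minLength"]:
            errors.append(f"{path}: shorter than minLength {schema['minLength']}")
        if "maxLength" in schema and len(value) > schema["maxLength"]:
            errors.append(f"{path}: longer than maxLength {schema['maxLength']}")
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            errors.append(f"{path}: below minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            errors.append(f"{path}: above maximum {schema['maximum']}")
    if isinstance(value, dict):
        props = schema.get("properties", {})
        for name in schema.get("required", []):
            if name not in value:
                errors.append(f"{path}.{name}: required field is missing")
        for name, item in value.items():
            if name in props:
                errors.extend(validate_args(props[name], item, f"{path}.{name}"))
            elif schema.get("additionalProperties") is False:
                errors.append(f"{path}.{name}: unknown field")
    return errors


# Hypothetical, trimmed parameter schema used only to exercise the sketch.
SEARCH_WEB_PARAMS = {
    "type": "object",
    "additionalProperties": False,
    "required": ["query"],
    "properties": {
        "query": {"type": "string", "minLength": 2, "maxLength": 300},
        "num_results": {"type": "integer", "minimum": 1, "maximum": 10},
        "date_restrict": {"type": ["string", "null"],
                          "enum": ["d1", "w1", "m1", "m3", "m6", "y1", None]},
    },
}

for problem in validate_args(SEARCH_WEB_PARAMS, {"query": "x", "mode": "fast"}):
    print(problem)
```

The violation strings map naturally onto a fields list in a validation error, so the model can see which argument to correct. In production a maintained validator library would normally replace this hand-rolled subset.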
Anti-pattern: bad vs good tool definitions
Comparison of two versions, highlighting common mistakes and their fixes:
// ❌ Bad tool definition
{
  "type": "function",
  "function": {
    "name": "doStuff", // meaningless name; unclear what it does
    "description": "Process data", // description provides no value to the model
    "parameters": {
      "type": "object",
      "properties": {
        "data": { "type": "string" }, // no constraints; model doesn't know what to pass
        "mode": { "type": "string" } // no enum; model will guess
      }
      // missing required; missing additionalProperties: false
    }
  }
}
// ✅ Good tool definition
{
  "type": "function",
  "function": {
    "name": "search_web", // verb-first, semantically clear
    "description": "Search the internet for real-time information using a search engine. Read-only; does not modify any data. Use for: news, docs, fact verification. Do NOT use for: private internal document retrieval.",
    "parameters": {
      "type": "object",
      "additionalProperties": false, // forbid extra fields
      "required": ["query"], // explicitly declare required fields
      "properties": {
        "query": {
          "type": "string",
          "minLength": 2,
          "maxLength": 300,
          "description": "Search keywords, natural language or specific phrases; do not concatenate URLs or SQL"
        },
        "num_results": {
          "type": "integer",
          "minimum": 1,
          "maximum": 10,
          "default": 5,
          "description": "Number of results to return; default 5"
        },
        "date_restrict": {
          "type": ["string", "null"],
          "enum": ["d1", "w1", "m1", "m3", "m6", "y1", null],
          "default": null,
          "description": "Time range filter: d1=last 1 day, w1=last 1 week, m1=last 1 month, m3=last 3 months, m6=last 6 months, y1=last 1 year, null=no limit"
        },
        "safe_search": {
          "type": "boolean",
          "default": true,
          "description": "Whether to enable safe search filtering; enabled by default"
        }
      }
    }
  }
}
- Keep success and failure response shapes consistent: always include status, plus data on success or error on failure.
- Large results must paginate; field names stay fixed across SKILL and implementation (use next_cursor; never mix in nextPage or page_token).
- Do not return raw stack traces or unsanitized internals to the model; log trace ids server-side. The message returned to the model should contain only user-understandable descriptions.
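The three response rules above can be sketched as a pair of envelope helpers. This is an illustration under assumptions, not a prescribed implementation: the helper names ok, fail, and paginate are hypothetical, next_cursor is placed at the top level of the body, and the cursor is a plain offset string (real cursors are usually opaque).

```python
import logging
import uuid
from typing import Any, Optional

logger = logging.getLogger("tools")


def ok(data: Any, next_cursor: Optional[str] = None) -> dict:
    """Success envelope: always status + data; next_cursor only when more pages exist."""
    body = {"status": "success", "data": data}
    if next_cursor is not None:
        body["next_cursor"] = next_cursor
    return body


def fail(code: str, message: str, retryable: bool,
         exc: Optional[BaseException] = None) -> dict:
    """Error envelope: always status + error; internals stay in the server log."""
    trace_id = uuid.uuid4().hex
    # The stack trace (if any) is logged under a trace id and never returned.
    logger.error("tool error trace_id=%s code=%s", trace_id, code, exc_info=exc)
    return {"status": "error",
            "error": {"code": code, "message": message, "retryable": retryable}}


def paginate(items: list, cursor: Optional[str], limit: int) -> dict:
    """Return one page of items; the cursor is an offset encoded as a string."""
    start = int(cursor) if cursor else 0
    page = items[start:start + limit]
    end = start + len(page)
    return ok(page, next_cursor=str(end) if end < len(items) else None)


first = paginate(list(range(5)), None, 2)
second = paginate(list(range(5)), first["next_cursor"], 2)
```

Routing every handler through the same two constructors is one way to keep the shape from drifting between tools.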
Tool contract and schema
The sample below mirrors common "function call" payloads: top-level name / description for model selection, parameters as JSON Schema (Draft-7 style). Validate on the server with the same schema.
{
  "type": "function",
  "function": {
    "name": "ticket_search",
    "description": "Search tickets by criteria; read-only; does not change state.",
    "parameters": {
      "type": "object",
      "additionalProperties": false,
      "required": ["query"],
      "properties": {
        "query": {
          "type": "string",
          "minLength": 1,
          "maxLength": 200,
          "description": "Keyword or ticket-number fragment; do not concatenate SQL."
        },
        "status": {
          "type": "string",
          "enum": ["open", "closed", "any"],
          "default": "open"
        },
        "page": {
          "type": "object",
          "additionalProperties": false,
          "properties": {
            "limit": { "type": "integer", "minimum": 1, "maximum": 50, "default": 20 },
            "cursor": { "type": ["string", "null"], "default": null }
          }
        }
      }
    }
  }
}

Parameter style comparison
Flat parameters suit a few scalars; nested objects suit clear domain boundaries and extensible config.
Flat:
{
  "query": "login timeout",
  "status": "open",
  "limit": 20,
  "cursor": null
}

Nested:
{
  "filter": {
    "text": "login timeout",
    "status": "open"
  },
  "page": {
    "limit": 20,
    "cursor": null
  }
}

Idempotency key design: generation and checking
Write-operation tools must support idempotency keys to prevent duplicate execution from network retries:
import hashlib
import json
import uuid
from functools import wraps

import redis.asyncio as redis  # async client, so lookups do not block the event loop


# Client side (model): generate idempotency key
def generate_idempotency_key(tool_name: str, args: dict) -> str:
    """Generate a deterministic idempotency key from tool name and args.

    Same operation produces the same key; used for server-side deduplication on retries.
    """
    # Canonical JSON (sorted keys) keeps the key stable for nested args as well
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    content = f"{tool_name}:{canonical}"
    return "idem_" + hashlib.sha256(content.encode()).hexdigest()[:32]


# Usage: attach idempotency key when calling create_ticket
key = generate_idempotency_key("create_ticket", {
    "title": "Fix login timeout",
    "priority": "high"
})
# key has the form "idem_" + 32 hex chars, e.g. "idem_a3f1b2c4d5e6f7a8b9c0d1e2f3a4b5c6" (illustrative)


# Server side: check and store idempotency key
r = redis.Redis(host="localhost", port=6379, db=0)
IDEM_KEY_TTL = 86400  # 24 hours


def with_idempotency(handler):
    """Decorator: check idempotency key; return cached result if already processed."""
    @wraps(handler)
    async def wrapper(args: dict, **kwargs):
        idem_key = args.get("idempotency_key")
        if idem_key:
            # Check if already processed
            cached = await r.get(f"idem:{idem_key}")
            if cached is not None:
                return {"status": "success", "data": json.loads(cached),
                        "_idempotent": True}  # flag as idempotent response
        result = await handler(args, **kwargs)
        # Store result on success.
        # Note: check-then-store is not atomic, so two concurrent duplicates can both run.
        if idem_key and result.get("status") == "success":
            await r.setex(f"idem:{idem_key}", IDEM_KEY_TTL,
                          json.dumps(result["data"]))
        return result
    return wrapper


@with_idempotency
async def create_ticket(args: dict) -> dict:
    # Actual ticket creation logic
    ticket_id = f"tkt_{uuid.uuid4().hex[:8]}"
    return {"status": "success", "data": {"ticket_id": ticket_id}}

Three error formats + timeout retry fallback
Three standard error return formats; the agent uses retryable and human_review to decide the next action:
# Type 1: Non-retryable (requires parameter correction)
# Agent must correct parameters before retrying; do NOT retry indefinitely
{
  "status": "error",
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Invalid 'priority' value 'urgent'; allowed values: low/medium/high/critical",
    "fields": ["priority"],
    "retryable": false,
    "human_review": false
  }
}

# Type 2: Retryable (transient failure)
# Agent waits retry_after_ms then retries; max 3 retries
{
  "status": "error",
  "error": {
    "code": "UPSTREAM_TIMEOUT",
    "message": "Downstream database timed out (5000ms); please retry shortly",
    "retryable": true,
    "retry_after_ms": 3000,
    "max_retries": 3,
    "human_review": false
  }
}

# Type 3: Requires human review (high-risk operation)
# Agent stops auto-execution and notifies user for confirmation
{
  "status": "error",
  "error": {
    "code": "REQUIRES_HUMAN_APPROVAL",
    "message": "This operation will delete 2847 production database records and requires human confirmation",
    "retryable": false,
    "human_review": true,
    "approval_url": "https://app.example.com/approvals/apr_xyz123"
  }
}

Tool timeout + exponential backoff retry + fallback implementation:
import asyncio
import logging

logger = logging.getLogger(__name__)


async def call_tool_with_retry(
    tool_fn,
    args: dict,
    max_retries: int = 3,
    base_delay_ms: int = 1000,
    timeout_ms: int = 5000,
    fallback_fn=None,
) -> dict:
    """
    Tool call wrapper: timeout control + exponential backoff retry + fallback.
    - timeout_ms: per-call timeout
    - base_delay_ms: initial retry wait (doubles each attempt)
    - fallback_fn: degradation function called after all retries fail
    """
    last_error = None
    for attempt in range(max_retries + 1):
        retry_after_ms = 0
        try:
            result = await asyncio.wait_for(
                tool_fn(args),
                timeout=timeout_ms / 1000
            )
            if result.get("status") == "success":
                return result
            error = result.get("error", {})
            if not error.get("retryable", False):
                return result  # non-retryable error, return immediately
            last_error = result
            retry_after_ms = error.get("retry_after_ms", 0)
        except asyncio.TimeoutError:
            last_error = {
                "status": "error",
                "error": {
                    "code": "CLIENT_TIMEOUT",
                    "message": f"Call timed out ({timeout_ms}ms)",
                    "retryable": True,
                    "attempt": attempt + 1,
                }
            }
        if attempt < max_retries:
            # Exponential backoff, but never sooner than the server's retry_after_ms
            delay_ms = max(base_delay_ms * (2 ** attempt), retry_after_ms)
            await asyncio.sleep(delay_ms / 1000)
    # All retries failed; try fallback
    if fallback_fn:
        try:
            return await fallback_fn(args)
        except Exception:
            # Fallback failed too: log it, then fall through to the last error
            logger.exception("fallback failed")
    return last_error or {"status": "error",
                          "error": {"code": "MAX_RETRIES_EXCEEDED",
                                    "message": "All retries failed",
                                    "retryable": False}}

Model issues tool_calls
       │
       ▼
┌──────────────┐
│ Route / auth │─── Unauthorized ───► 401 + retryable:false ───► no retry, abort
└──────┬───────┘
       │ OK
       ▼
┌──────────────┐
│ Schema check │─── Bad args ───► 400 + fields[] ───► model corrects params and retries
└──────┬───────┘
       │ valid
       ▼
┌──────────────┐
│ Downstream   │─── Timeout/5xx ─► retryable:true + retry_after ───► exponential backoff
└──────┬───────┘─── High-risk op ► human_review:true ───► pause, await human confirmation
       │ success
       ▼
{status:"success", data:{...}} ──► next model turn
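The branches of the flow above can be sketched as a single agent-side decision function that reads only the retryable, human_review, fields, and retry_after_ms fields of the response. The function name decide_next_action and the action labels (continue, retry, fix_params, await_human, abort) are hypothetical and chosen for illustration; attempt is assumed to count the retries already made.

```python
def decide_next_action(result: dict, attempt: int, max_retries: int = 3) -> dict:
    """Map a tool response to the agent's next step."""
    if result.get("status") == "success":
        return {"action": "continue"}

    error = result.get("error", {})

    # High-risk operation: stop auto-execution and hand over to a human.
    if error.get("human_review"):
        return {"action": "await_human", "approval_url": error.get("approval_url")}

    # Transient failure: retry within the limit, honoring the server's hint.
    if error.get("retryable"):
        limit = error.get("max_retries", max_retries)
        if attempt < limit:
            delay_ms = error.get("retry_after_ms", 1000 * 2 ** attempt)
            return {"action": "retry", "delay_ms": delay_ms}
        return {"action": "abort", "reason": "retries exhausted"}

    # Validation failure: the model should correct the listed fields first.
    if error.get("fields"):
        return {"action": "fix_params", "fields": error["fields"]}

    # Anything else (for example an authorization failure) is final.
    return {"action": "abort", "reason": error.get("code", "UNKNOWN")}


timeout = {"status": "error", "error": {"code": "UPSTREAM_TIMEOUT", "retryable": True,
                                        "retry_after_ms": 3000, "max_retries": 3}}
print(decide_next_action(timeout, attempt=0))
```

Checking human_review before retryable is a deliberate ordering in this sketch: a response that somehow sets both should pause rather than retry.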
Governance and evaluation
Use golden call sequences (expected tool names and parameter snapshots) for regression; fuzz edge cases and adversarial strings. When schema changes, update SKILL constraints and record breaking changes.
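A golden call sequence check might look like the sketch below: it compares the recorded tool calls of a run against a stored snapshot and reports mismatches in tool name or arguments. The function name diff_call_sequence and the {"name", "args"} record shape are assumptions made for this example.

```python
from typing import List

_MISSING = object()  # sentinel so a missing key differs from an explicit None


def diff_call_sequence(golden: List[dict], actual: List[dict]) -> List[str]:
    """Return human-readable differences; an empty list means the run matches."""
    problems: List[str] = []
    if len(golden) != len(actual):
        problems.append(f"expected {len(golden)} calls, got {len(actual)}")
    for i, (want, got) in enumerate(zip(golden, actual)):
        if want["name"] != got["name"]:
            problems.append(f"call {i}: expected tool {want['name']}, got {got['name']}")
            continue
        keys = set(want["args"]) | set(got["args"])
        changed = sorted(
            k for k in keys
            if want["args"].get(k, _MISSING) != got["args"].get(k, _MISSING)
        )
        if changed:
            problems.append(f"call {i} ({want['name']}): args differ in {changed}")
    return problems


GOLDEN = [
    {"name": "ticket_search", "args": {"query": "login timeout", "status": "open"}},
    {"name": "create_ticket", "args": {"title": "Fix login timeout", "priority": "high"}},
]

# A run where the model widened the status filter on the first call.
run = [
    {"name": "ticket_search", "args": {"query": "login timeout", "status": "any"}},
    {"name": "create_ticket", "args": {"title": "Fix login timeout", "priority": "high"}},
]
for problem in diff_call_sequence(GOLDEN, run):
    print(problem)
```

Storing the golden sequences next to the schema makes a breaking schema change show up as a failing snapshot in the same change set.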
SKILL review snippet
---
name: tool-use-schema-review
description: Review or design model-facing tool JSON Schema; input: tool definition draft; output: corrected definition + idempotency key code; prohibit: allowing models to pass unvalidated SQL or shell commands
version: "1.0.0"
triggers:
- "design.*tool.*schema|tool.*JSON Schema"
- "agent.*tool.*call|function calling"
steps:
1. Check tool name: verb-first (search_/get_/create_/delete_), single semantic purpose
2. description must state: read/write operation, when to use, when NOT to use
3. All string params must have minLength/maxLength constraints
4. Use enum field for enumerated values, not description text
5. Set additionalProperties: false to prevent model from passing unknown fields
6. Write-operation tools must have idempotency_key param (minLength: 16)
7. High-risk tools (delete_/bulk_) must have environment param (staging/production)
8. Implement server-side re-validation (assert or pydantic); do not rely on model self-constraint
9. Implement call_tool_with_retry (timeout 5s, exponential backoff, max 3 retries)
10. Error returns must include three fields: code, message, retryable
11. Retryable errors add retry_after_ms; human review errors add human_review: true
12. Do NOT expose internal stack traces in error message; log trace_id server-side
13. Large results (>10KB) return resource_url; do not inline in response body
14. SKILL must mandate "plan before invoke"; forbid chained entity-id guessing
15. Validate tool schema against JSON Schema Draft 7 format in CI
constraints:
- Do NOT execute model-provided strings directly (prevent SQL/command injection)
- Do NOT expose internal paths, stack traces, or key-pattern strings in responses
- Tool registry changes must update all SKILLs that reference the changed tool
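Several of the review steps above are mechanical enough to lint automatically. The sketch below checks steps 1, 3, 5, and 6 plus the presence of a required list. It rests on assumptions: the name lint_tool is hypothetical, write operations are detected by the create_/delete_ prefixes from step 1, and string parameters that carry an enum are exempt from the minLength/maxLength check.

```python
from typing import List

READ_PREFIXES = ("search_", "get_")
WRITE_PREFIXES = ("create_", "delete_")  # assumed to mark write operations


def lint_tool(tool: dict) -> List[str]:
    """Return findings for one function-style tool definition; empty means clean."""
    fn = tool["function"]
    name = fn["name"]
    params = fn.get("parameters", {})
    props = params.get("properties", {})
    findings: List[str] = []

    if not name.startswith(READ_PREFIXES + WRITE_PREFIXES):
        findings.append(f"step 1: name '{name}' is not verb-first")
    if params.get("additionalProperties") is not False:
        findings.append("step 5: additionalProperties must be false")
    if "required" not in params:
        findings.append("required list is missing")

    for pname, spec in props.items():
        types = spec.get("type")
        types = types if isinstance(types, list) else [types]
        bounded = "minLength" in spec and "maxLength" in spec
        if "string" in types and "enum" not in spec and not bounded:
            findings.append(f"step 3: string param '{pname}' lacks minLength/maxLength")

    if name.startswith(WRITE_PREFIXES):
        key_spec = props.get("idempotency_key", {})
        if key_spec.get("minLength", 0) < 16:
            findings.append("step 6: write tool needs idempotency_key with minLength >= 16")
    return findings


BAD_TOOL = {
    "type": "function",
    "function": {
        "name": "doStuff",
        "description": "Process data",
        "parameters": {
            "type": "object",
            "properties": {"data": {"type": "string"}, "mode": {"type": "string"}},
        },
    },
}
for finding in lint_tool(BAD_TOOL):
    print(finding)
```

Steps that depend on judgment, such as whether a description states when not to use the tool, still need a human or model reviewer.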