Prompt engineering

The system prompt sets the ceiling for output quality. This page provides a ready-to-copy four-block template, correct few-shot formatting, fixes for 5 common anti-patterns, and a commit convention for prompt versioning.

The four-block system prompt template below covers most common scenarios:

## ROLE
You are a {specific function}, serving {business context}. Tone: {professional/concise/friendly}. Do not use first-person honorifics.

## TASK
Given the "{input name}" provided by the user, perform the following:
1. {Step 1 — measurable, specific action}
2. {Step 2}
Success criteria: {verifiable output description, e.g.: JSON parseable by JSON.parse and containing an id field}

## CONSTRAINTS
Prohibited: do not invent {fields absent from the data source}; do not output {secrets/internal addresses/PII};
            on tool failure, output TOOL_ERROR: {reason} instead of guessing the result.
If required fields are missing, output MISSING_FIELDS: [{field list}] and do not continue inferring.

## OUTPUT FORMAT
Output only the following JSON structure, with no text outside the Markdown fence:
{
  "result": "string",
  "confidence": "high|medium|low",
  "citations": ["source_id_1", "source_id_2"]
}

Testable: every prompt change should be paired with updated golden output files (covering at least the happy path + 2 edge cases).
Maintainable: record the version number in the prompt repo commit message or in the SKILL front matter's version field.
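
One way to make the "Testable" rule concrete is to compare each model response with a golden file after parsing both as JSON. The sketch below assumes a hypothetical layout of one `<case_name>.json` file per case; the function names and directory layout are illustrative, not part of any existing tool.

```python
import json
import tempfile
from pathlib import Path


def load_golden(golden_dir: Path, case_name: str) -> dict:
    """Read the expected output for one case from <golden_dir>/<case_name>.json."""
    return json.loads((golden_dir / f"{case_name}.json").read_text(encoding="utf-8"))


def matches_golden(actual_raw: str, expected: dict) -> bool:
    """Compare parsed JSON so that whitespace and key order do not matter."""
    try:
        actual = json.loads(actual_raw)
    except json.JSONDecodeError:
        return False  # extra prose or a broken structure counts as a failure
    return actual == expected


# Demo: a temporary directory stands in for the prompt repo's golden folder.
with tempfile.TemporaryDirectory() as tmp:
    golden_dir = Path(tmp)
    (golden_dir / "missing_fields.json").write_text(
        '{"error": "MISSING_FIELDS", "fields": ["order_id"]}', encoding="utf-8"
    )
    expected = load_golden(golden_dir, "missing_fields")
    same = matches_golden('{"fields": ["order_id"], "error": "MISSING_FIELDS"}', expected)
    drifted = matches_golden("Sure! Here is the JSON: {}", expected)
```

Exact equality suits fixed signals such as error codes; free-text fields would likely need a looser comparison.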

Hard constraints and boundaries — 5 anti-pattern fixes

The most common prompt errors in production, each with a fix:

❌ Anti-pattern 1: constraints use soft words like "try to" / "should"
Try not to output sensitive information.

✅ Fix: use "prohibited" / "must" + machine-detectable boundary signals
Prohibited: output any string matching /[A-Za-z0-9+/]{40,}={0,2}/ (potential secret key).
If detected, output REDACTED and stop.

❌ Anti-pattern 2: vague role description
You are a helpful AI assistant.

✅ Fix: role includes domain, tone, and prohibited implicit behaviors
You are a B2B pre-sales engineer at Acme Corp; answer only technical questions about Acme products;
do not promise features absent from official documentation; do not disparage competitors.

❌ Anti-pattern 3: output format just says "output JSON"
Please respond in JSON format.

✅ Fix: provide a complete schema + prohibit extra wrapping
Output only the following JSON, no text outside the fence, no comments, no trailing commas:
{"items": [{"id": "string", "score": 0.0}], "total": 0}

❌ Anti-pattern 4: few-shot examples inconsistent with the formal schema
(Examples output loose Markdown; body requires strict JSON)

✅ Fix: example output format must exactly match the body schema
Example → output: {"items": [...], "total": 2}  ← same schema as body

❌ Anti-pattern 5: no handling path for empty input or tool failure
(Silently returns empty or hallucinates)

✅ Fix: explicitly define degraded output formats
On empty input, output: {"error": "EMPTY_INPUT"}
On tool call failure, output: {"error": "TOOL_ERROR", "detail": "{reason}"}

Test: hand the constraints to a colleague—can they write test cases without reading the conversation history? If not, the constraints are not specific enough.
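
Machine-detectable signals pay off when downstream code can route on them. The following is a minimal sketch of such a guard, assuming the secret pattern from fix 1 and the degraded formats from fix 5; `check_response` and its return labels are hypothetical names chosen for illustration.

```python
import json
import re

# Pattern copied from fix 1: 40+ base64 characters with optional padding.
SECRET_PATTERN = re.compile(r"[A-Za-z0-9+/]{40,}={0,2}")
# Error codes copied from fix 5.
KNOWN_ERRORS = {"EMPTY_INPUT", "TOOL_ERROR"}


def check_response(raw: str) -> str:
    """Classify one model response so the caller can pick a handling path."""
    if raw.strip() == "REDACTED":
        return "redacted"
    if SECRET_PATTERN.search(raw):
        return "secret_leak"  # the model ignored the constraint; block the output
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return "unparseable"
    if not isinstance(data, dict):
        return "unparseable"
    if "error" not in data:
        return "ok"
    code = data["error"]
    if isinstance(code, str) and code in KNOWN_ERRORS:
        return code
    return "unknown_error"
```

The pattern also matches long runs of path-like text, so a production guard would probably need an allowlist for known false positives.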

Few-shot correct format

Few-shot examples must cover: ① happy path, ② edge case / empty input, ③ tool failure or ambiguity. Each example is a complete "input → output" pair whose output format exactly matches the body schema.

## EXAMPLES (place in the EXAMPLES block of the system prompt)

Example 1 — happy path
User input: {"order_id": "ORD-9921", "question": "Has it shipped?"}
Expected output: {"answer": "ORD-9921 shipped on 2024-03-12; estimated arrival within 3 days.", "citations": ["order-db/ORD-9921"]}

Example 2 — insufficient info (triggers MISSING_FIELDS)
User input: {"question": "Has my order arrived?"}
Expected output: {"error": "MISSING_FIELDS", "fields": ["order_id"]}
// rationale: no order ID in the question; cannot query; must not guess

Example 3 — tool call failure
User input: {"order_id": "ORD-0001", "question": "What is the shipping status?"}
Tool result: {"status": 503, "message": "Database timeout"}
Expected output: {"error": "TOOL_ERROR", "detail": "Shipping query service temporarily unavailable; please try again later."}

Example 4 — out-of-scope question (triggers refusal)
User input: {"question": "What is the price of your competitor XYZ?"}
Expected output: {"error": "OUT_OF_SCOPE", "suggestion": "Please visit XYZ's official website for pricing."}

Keep 2 core examples in the system prompt; inject the rest via RAG on demand to avoid crowding the instruction region.
Always use placeholders instead of real order IDs, customer names, or secrets in examples.
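
Schema drift between examples and the body is easier to catch if the EXAMPLES block is generated from data and checked as it is rendered. This sketch assumes the success keys and error codes used in the examples above; `render_examples` and the constants are hypothetical helpers, and the values are placeholders.

```python
import json

# (name, user input, expected output) with placeholder values only.
EXAMPLES = [
    ("happy path",
     {"order_id": "ORD-0001", "question": "Has it shipped?"},
     {"answer": "ORD-0001 shipped on <date>.", "citations": ["order-db/ORD-0001"]}),
    ("insufficient info",
     {"question": "Has my order arrived?"},
     {"error": "MISSING_FIELDS", "fields": ["order_id"]}),
]

SUCCESS_KEYS = {"answer", "citations"}
ERROR_CODES = {"MISSING_FIELDS", "TOOL_ERROR", "OUT_OF_SCOPE"}


def output_matches_schema(expected: dict) -> bool:
    """Accept either a known error object or an object with exactly the success keys."""
    if "error" in expected:
        code = expected["error"]
        return isinstance(code, str) and code in ERROR_CODES
    return set(expected) == SUCCESS_KEYS


def render_examples(examples) -> str:
    """Build the EXAMPLES block, refusing any example whose output breaks the schema."""
    lines = ["## EXAMPLES"]
    for number, (name, user_input, expected) in enumerate(examples, start=1):
        if not output_matches_schema(expected):
            raise ValueError(f"Example {number} does not match the output schema")
        lines.append(f"Example {number}: {name}")
        lines.append(f"User input: {json.dumps(user_input)}")
        lines.append(f"Expected output: {json.dumps(expected)}")
    return "\n".join(lines)


block = render_examples(EXAMPLES)
```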

Structured output — complete schema example

The prompt must provide a complete output schema including field names, types, enums, required fields, and null handling. Never just say "output JSON".

// Define the output schema in the system prompt like this:
Output only the following JSON structure, no text outside the Markdown fence, no comments, no trailing commas:

{
  "status": "success" | "partial" | "failed",   // required; enum
  "items": [
    {
      "id": "string",             // required; original id from retrieval results
      "title": "string",          // required
      "score": number,            // 0.0-1.0; two decimal places
      "snippet": "string | null"  // null when no summary; do NOT omit this field
    }
  ],
  "total": number,                // length of the items array; required
  "error": "string | null"        // required when status is "failed"; otherwise null
}

// Valid example (status=success):
{"status":"success","items":[{"id":"doc-42","title":"Installation Guide","score":0.91,"snippet":"Run npm install..."}],"total":1,"error":null}

// Valid example (status=failed):
{"status":"failed","items":[],"total":0,"error":"Retrieval service timed out; please retry"}

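Prohibit trailing commas and // comments in the output; neither is valid JSON. The // annotations in the schema above explain the fields to the model and must not be echoed.
If downstream passes the raw response to JSON.parse, replace the Markdown-fence wording with: "Output a JSON object only; the first character must be {". A fenced response begins with backticks and fails to parse.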
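
The schema rules above can be enforced on the consuming side as well. Below is a minimal validator sketch in Python, where `json.loads` plays the role of JSON.parse; `validate_output` is a hypothetical helper, and it checks only the rules stated in the schema comments.

```python
import json
from typing import List

VALID_STATUS = {"success", "partial", "failed"}
ITEM_KEYS = {"id", "title", "score", "snippet"}


def validate_output(raw: str) -> List[str]:
    """Return a list of schema violations; an empty list means the output is valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top level must be a JSON object"]

    problems = [f"missing field: {key}"
                for key in ("status", "items", "total", "error") if key not in data]
    if problems:
        return problems

    status = data["status"]
    if not isinstance(status, str) or status not in VALID_STATUS:
        problems.append("status must be success, partial, or failed")
    if not isinstance(data["items"], list):
        return problems + ["items must be an array"]

    for index, item in enumerate(data["items"]):
        if not isinstance(item, dict) or set(item) != ITEM_KEYS:
            problems.append(f"items[{index}] must have exactly id, title, score, snippet")
            continue
        score = item["score"]
        is_number = isinstance(score, (int, float)) and not isinstance(score, bool)
        if not is_number or not 0.0 <= score <= 1.0:
            problems.append(f"items[{index}].score must be a number in 0.0-1.0")
        if item["snippet"] is not None and not isinstance(item["snippet"], str):
            problems.append(f"items[{index}].snippet must be a string or null")

    if data["total"] != len(data["items"]):
        problems.append("total must equal the length of items")
    if status == "failed" and not isinstance(data["error"], str):
        problems.append("error must be a string when status is failed")
    if status != "failed" and data["error"] is not None:
        problems.append("error must be null unless status is failed")
    return problems
```

Returning every violation at once, rather than stopping at the first, tends to make failure samples more useful when iterating on the prompt.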

Before and after — full rewrite example

The same email-reply task written two ways, a vague version and a complete four-block version, to show the practical difference:

────────────────── Before (3 lines, no structure) ──────────────────
You are an assistant. Help the user write an email reply.
Customer message: {customer_email}
Please give me a professional reply.

Issues: uncertain tone / may invent policies / output not parseable / no failure path

────────────────── After (four-block, testable) ──────────────────
## ROLE
You are Acme Corp B2B support; tone: professional and concise; address the customer as "you" (formal).

## TASK
Draft the reply body (no subject line) based on "customer message" and "knowledge base entries".
Success criteria: reply contains greeting, body, and closing; body addresses the specific issue in the message.

## CONSTRAINTS
- Do not promise refund policies or delivery dates absent from the knowledge base.
- Do not invent order IDs or product model numbers.
- If information is insufficient, set the body field to: "Please provide the following: [question list, max 3 items]"
- On tool call failure, output {"error": "TOOL_ERROR", "detail": "{reason}"}

## OUTPUT FORMAT
Output only the following JSON, no text outside the fence, no comments, no trailing commas:
{"greeting": "string", "body": "string", "closing": "string"}
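
Because the "after" version returns three named fields, downstream code can assemble the email deterministically and treat the TOOL_ERROR signal as an exception. The sketch below assumes the response arrives as a bare JSON string; `assemble_reply` is a hypothetical helper.

```python
import json

REPLY_FIELDS = ("greeting", "body", "closing")


def assemble_reply(raw: str) -> str:
    """Turn the model's JSON into plain email text, or raise on a degraded output."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("top level must be a JSON object")
    if "error" in data:
        # Degraded path defined in the CONSTRAINTS block.
        raise RuntimeError(f"{data['error']}: {data.get('detail', '')}")
    missing = [field for field in REPLY_FIELDS if not isinstance(data.get(field), str)]
    if missing:
        raise ValueError(f"missing or non-string fields: {missing}")
    # Blank lines between greeting, body, and closing.
    return "\n\n".join(data[field] for field in REPLY_FIELDS)
```

The "before" version offers no equivalent hook: its free-form text can only be forwarded as-is or reviewed by a person.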

Prompt iteration loop — version control conventions

Treat prompt evolution like a code release: every change must map to a reproducible failure sample and acceptance case. Suggested commit message format:

# Prompt commit message format (modeled on Conventional Commits)
prompt(scope): action — failure reason summary

# Real examples:
prompt(email-reply): fix — model generates reply when order_id is missing instead of MISSING_FIELDS
prompt(email-reply): feat — add confidence field to output schema
prompt(email-reply): refactor — rewrite constraints from prose to numbered list for better parse stability
prompt(search): fix — few-shot example schema inconsistent with body causing format drift

# CHANGELOG entry (in the prompt repo CHANGELOG.md or SKILL front matter)
## v1.3.0 — 2024-03-15
### Changed
- section-constraints: add TOOL_ERROR signal format
- section-output: score field type changed from string to number (breaking change)
### Fixed
- few-shot example #2 output format aligned with body schema
### Test
- Add golden cases: empty_input, tool_timeout, out_of_scope
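
A convention like this is easier to keep if it is checked mechanically, for example in a commit hook. Here is a sketch of a parser for the format above; the allowed actions are taken from the examples (fix, feat, refactor), and `parse_prompt_commit` is a hypothetical name.

```python
import re
from typing import Optional

# prompt(scope): action <em dash> summary; the dash is written as \u2014.
COMMIT_PATTERN = re.compile(
    r"^prompt\((?P<scope>[a-z0-9-]+)\): "
    r"(?P<action>fix|feat|refactor) \u2014 (?P<summary>\S.*)$"
)


def parse_prompt_commit(message: str) -> Optional[dict]:
    """Return scope, action, and summary for a valid first line, else None."""
    lines = message.splitlines()
    if not lines:
        return None
    match = COMMIT_PATTERN.match(lines[0])
    return match.groupdict() if match else None
```

Only the first line is checked, so a longer body with the failure sample can follow freely.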

    [ Draft / change vN ]
          │
          ▼
    [ Fixed cases + red-team cases (edge / adversarial) ]
          │
     ┌────┴────┐
     ▼         ▼
 [Parse/business OK?]  [Record failure mode → minimal change]
     │
     ▼
    [ commit: prompt(scope): fix — failure reason ]
          │
          ▼
    [ Update CHANGELOG + golden files ]
          │
          ▼
    [ Publish to repo / runtime config ]
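
The test step in the loop above can be sketched as a small runner that records one failure mode per failing case. Everything here is illustrative: `stub_model` stands in for a real model call, and the case names mirror the golden cases listed in the changelog.

```python
import json
from typing import Callable, Dict, List, Tuple

Case = Tuple[str, str, dict]  # (case name, input text, expected parsed output)


def run_cases(model: Callable[[str], str], cases: List[Case]) -> Dict[str, str]:
    """Run every case and return {case name: failure mode} for the ones that fail."""
    failures = {}
    for name, input_text, expected in cases:
        raw = model(input_text)
        try:
            actual = json.loads(raw)
        except json.JSONDecodeError:
            failures[name] = "unparseable output"
            continue
        if actual != expected:
            failures[name] = "output differs from golden"
    return failures


def stub_model(input_text: str) -> str:
    """Stand-in for a real model call; mishandles empty input on purpose."""
    if not input_text:
        return "Sorry, I need more details."
    return json.dumps({"error": "OUT_OF_SCOPE"})


cases = [
    ("empty_input", "", {"error": "EMPTY_INPUT"}),
    ("out_of_scope", "What is the price of competitor XYZ?", {"error": "OUT_OF_SCOPE"}),
]
failures = run_cases(stub_model, cases)
```

Each recorded failure mode is a candidate for one minimal prompt change and one commit.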

Draft length estimate and templates

To estimate the size of a draft, count its characters and divide by 4 for an approximate token count; for mixed CJK/Latin text this is only a rough guide. Three template styles are common starting points (four-block, goal-criteria, tool-call); edit the one you choose and copy it into your prompt repo.
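
The divide-by-4 heuristic is a one-liner in code. This sketch rounds up, which is an assumption made here so that short non-empty drafts never report zero tokens; a real tokenizer will give different numbers.

```python
import math


def estimate_size(draft: str) -> dict:
    """Return the character count and a rough token estimate (characters / 4, rounded up)."""
    characters = len(draft)
    return {"characters": characters, "tokens": math.ceil(characters / 4)}


size = estimate_size("You are Acme Corp B2B support; tone: professional and concise.")
```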

---
name: prompt-engineering-core
description: Draft, review, or iterate system prompts and constraints; output four-block template or fix report; prohibit direct modification of the production prompt repo (must go through PR).
version: 2.1.0
---

# Steps

1. Gather context
   - Confirm: business scenario, target model (GPT-4o / Claude / local), downstream consumption method (JSON.parse / regex / human)
   - Request: existing prompt draft (if any), known failure samples (at least 1)

2. Generate four-block prompt draft
   - Role: {function} + {tone} + {prohibited implicit behaviors}
   - Task: {numbered steps} + {success criteria (measurable)}
   - Constraints: {prohibited items} + {degraded output formats (MISSING_FIELDS / TOOL_ERROR / OUT_OF_SCOPE)}
   - Output format: {complete JSON schema with types/enums/null handling}

3. Review anti-patterns (check each item)
   □ Do constraints use "prohibited/must" rather than "try to/should"?
   □ Are few-shot example formats exactly consistent with the body schema?
   □ Does the output schema cover all three degraded paths: empty input, tool failure, out-of-scope?
   □ Is the schema free of trailing commas and comments (both invalid JSON)?

4. Output acceptance cases
   - Happy path × 1
   - Empty input / missing fields × 1
   - Tool failure × 1
   - Out-of-scope question × 1

5. Version record
   - commit message format: prompt({scope}): {fix|feat|refactor} — {failure reason summary}
   - Update CHANGELOG.md with version + Changed/Fixed/Test entries
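
The first item of the review checklist in step 3 lends itself to a simple lint. This sketch flags only the two soft phrasings named on this page ("try to", including "try not to", and "should"); `find_soft_constraints` is a hypothetical helper, and a match is a prompt for human review, not proof of a problem.

```python
import re
from typing import List, Tuple

# Soft wording from the checklist: "try to" / "try not to" / "should".
SOFT_WORDS = re.compile(r"\b(?:try(?: not)? to|should)\b", re.IGNORECASE)


def find_soft_constraints(prompt: str) -> List[Tuple[int, str]]:
    """Return (1-based line number, line text) for each line with soft wording."""
    return [
        (number, line.strip())
        for number, line in enumerate(prompt.splitlines(), start=1)
        if SOFT_WORDS.search(line)
    ]


draft = (
    "Try not to output sensitive information.\n"
    "Prohibited: do not invent fields absent from the data source.\n"
    "You should be concise."
)
hits = find_soft_constraints(draft)
```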
