SKILL authoring
This page provides a complete SKILL.md template covering all required fields (name, description, version, triggers, steps, constraints, examples, output_format), a good-versus-bad comparison of trigger conditions, a granularity standard for atomic steps, and a full production-ready code review SKILL of more than 25 lines.
SKILL front matter — complete template
The entry file starts with YAML front matter: metadata enclosed between two --- delimiter lines. Here is a complete template with all required fields:
---
name: skill-kebab-name # kebab-case; must match the repo path
description: |
  One sentence: when to trigger, what the input is, what the output is, what is prohibited.
  Example: trigger when user requests a PR review; input: diff text; output: structured review report; prohibit: directly modifying code.
version: "1.2.0" # semantic version; increment major for breaking changes
triggers:
  - "review.*PR|review.*pull request" # regex matching user intent
  - "code review"
  - "check.*code quality"
steps:
  - id: step-1
    action: Read diff text; extract list of changed files
  - id: step-2
    action: For each file check naming conventions, function length, comment coverage
  - id: step-3
    action: Check security issues (SQL injection, unvalidated input, hardcoded secrets)
  - id: step-4
    action: Output review results in JSON format
constraints:
  - Do not directly modify user code; output suggestions only
  - Do not include full copies of user code in the review output
  - Each review comment must include line number and specific fix suggestion
examples:
  - input: "Please review this PR: [diff content]"
    output: '{"issues": [...], "summary": "...", "score": 85}'
output_format:
  type: json
  schema:
    issues: "array of {file, line, severity, message, suggestion}"
    summary: "string"
    score: "integer 0-100"
---
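The required-field list above can be checked mechanically before a SKILL is committed. The following is a minimal sketch of such a check, assuming a purely heuristic approach (it looks for fence lines and unindented `key:` lines rather than parsing YAML); the function names `front_matter_block`, `missing_keys`, and `name_is_kebab_case` are hypothetical helpers, not part of any platform API.

```python
import re

# Required top-level keys, as listed in the template above.
REQUIRED_KEYS = [
    "name", "description", "version", "triggers",
    "steps", "constraints", "examples", "output_format",
]


def front_matter_block(text: str) -> list[str]:
    """Return the lines between the first two '---' delimiter lines."""
    lines = text.splitlines()
    fences = [i for i, line in enumerate(lines) if line.strip() == "---"]
    if len(fences) < 2:
        raise ValueError("front matter needs an opening and a closing '---' line")
    return lines[fences[0] + 1 : fences[1]]


def missing_keys(text: str) -> list[str]:
    """List required keys that never appear as an unindented 'key:' line."""
    found = set()
    for line in front_matter_block(text):
        match = re.match(r"([A-Za-z_]+):", line)  # anchored at column 0
        if match:
            found.add(match.group(1))
    return [key for key in REQUIRED_KEYS if key not in found]


def name_is_kebab_case(name: str) -> bool:
    """True for lowercase words joined by single hyphens, e.g. 'code-review-pr'."""
    return re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", name) is not None


draft = '---\nname: code-review-pr\nversion: "2.0.0"\ntriggers:\n  - "code.?review"\n---\n'
print(missing_keys(draft))
print(name_is_kebab_case("code-review-pr"))
```

A check like this only confirms that the keys are present; it says nothing about whether their values are well-formed, so it complements rather than replaces schema validation.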
Trigger conditions: good vs bad comparison
Trigger conditions must be specific enough for regex or keyword matching; vague triggers cause false positives or missed activations.
# ❌ Bad triggers: too vague; any programming question triggers this
triggers:
  - "code"
  - "help me look at"
  - "something's wrong"

# ✅ Good triggers: precisely describe task type, stack, and action
triggers:
  - "review.*PR|review.*pull.?request" # explicit action: review
  - "check.*code.?quality|code.?quality" # explicit target: quality
  - "find.*bug|debug.*code" # explicit intent: debug
  - input_contains: ["diff", "patch"] # input feature matching

# ❌ Bad description: no mention of input, output, or boundaries
description: Help users handle code-related issues

# ✅ Good description: includes trigger, input, output, prohibited items
description: |
  Trigger when user provides a code diff or PR link requesting review.
  Input: unified diff text or GitHub PR URL.
  Output: JSON review report (issue list, severity level, fix suggestions).
  Prohibited: directly modifying code; including full code copies in output.
- Use a separate regex per trigger; avoid one large regex that is hard to maintain.
- Write constraints using negative phrasing ("Prohibited: X" not "Try not to X").
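The difference between the two trigger lists can be seen by running them against sample requests. This is a minimal sketch, assuming triggers are applied as case-insensitive regex searches over the user's message; the source does not specify how the platform actually evaluates them, and `matches_any` and the sample sentences are hypothetical.

```python
import re

# Trigger lists copied from the comparison above (regex entries only).
BAD_TRIGGERS = ["code", "help me look at", "something's wrong"]
GOOD_TRIGGERS = [
    r"review.*PR|review.*pull.?request",
    r"check.*code.?quality|code.?quality",
    r"find.*bug|debug.*code",
]


def matches_any(triggers: list[str], utterance: str) -> bool:
    """True if at least one trigger regex is found anywhere in the utterance."""
    return any(re.search(t, utterance, re.IGNORECASE) for t in triggers)


unrelated = "How do I install a code editor on Linux?"
review = "Please review this pull request before we merge"

print(matches_any(BAD_TRIGGERS, unrelated))   # True: the bare word "code" fires
print(matches_any(GOOD_TRIGGERS, unrelated))  # False
print(matches_any(GOOD_TRIGGERS, review))     # True
```

Case-insensitive matching is a trade-off: it catches "pr" as well as "PR", but it also lets `review.*PR` match words that merely contain "pr", so a stricter pattern may be worth considering if false positives appear.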
Steps granularity standard: each step must be an atomic operation
Each step is an independently executable, independently verifiable atomic action. Here is an example of correct granularity for a 5-step code review SKILL:
# ❌ Too coarse: not verifiable; model doesn't know where to start
steps:
  - Review the code
  - Give feedback

# ✅ Correct granularity: each step is atomic with clear input/output/verify
steps:
  - id: step-1
    action: Parse diff text; extract changed file list and changed line ranges
    input: unified diff string
    output: [{file, added_lines, removed_lines}]
    verify: list is non-empty; each item contains a file field
  - id: step-2
    action: For each changed file check function length (flag functions > 50 lines as WARNING)
    input: step-1 file list + original code
    output: [{file, line, type: "LONG_FUNCTION", message}]
    verify: all flagged items have a line field
  - id: step-3
    action: Check security issues (regex match SQL concatenation, eval(), hardcoded secret patterns)
    input: changed line text
    output: [{file, line, type: "SECURITY", severity: "ERROR", message}]
    verify: severity can only be ERROR/WARNING/INFO
  - id: step-4
    action: Merge results from step-2 and step-3; sort descending by severity
    input: issues lists from step-2 + step-3
    output: merged, sorted issues list
    verify: no WARNING or INFO item precedes an ERROR item
  - id: step-5
    action: Output JSON report; calculate score as max(0, 100 - ERROR*10 - WARNING*3)
    input: step-4 issues list
    output: '{"issues": [...], "summary": "...", "score": 0-100}'
    verify: score is in range [0, 100]
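The merge and scoring steps are small enough to sketch directly. The following is an illustrative version, assuming issues are plain records with a `severity` string; the `Issue` dataclass and the function names are hypothetical. The `max(0, ...)` clamp keeps the score inside the [0, 100] range that step-5 verifies, since eleven ERRORs alone would otherwise produce a negative value.

```python
from dataclasses import dataclass

# Lower rank sorts first, so ERROR comes before WARNING before INFO.
SEVERITY_RANK = {"ERROR": 0, "WARNING": 1, "INFO": 2}


@dataclass
class Issue:
    file: str
    line: int
    severity: str
    message: str


def merge_and_sort(*issue_lists: list[Issue]) -> list[Issue]:
    """Step-4: concatenate the per-check lists and order them by severity."""
    merged = [issue for issues in issue_lists for issue in issues]
    for issue in merged:
        if issue.severity not in SEVERITY_RANK:
            raise ValueError(f"unknown severity: {issue.severity}")
    # sorted() is stable, so issues of equal severity keep their input order.
    return sorted(merged, key=lambda issue: SEVERITY_RANK[issue.severity])


def score(issues: list[Issue]) -> int:
    """Step-5: 100 minus 10 per ERROR and 3 per WARNING, never below 0."""
    errors = sum(1 for issue in issues if issue.severity == "ERROR")
    warnings = sum(1 for issue in issues if issue.severity == "WARNING")
    return max(0, 100 - errors * 10 - warnings * 3)


style = [Issue("a.py", 3, "WARNING", "function longer than 50 lines")]
security = [Issue("a.py", 9, "ERROR", "eval() call")]
report = merge_and_sort(style, security)
print([issue.severity for issue in report], score(report))
```

Keeping the merge and the score as separate functions mirrors the atomicity rule: each can be verified on its own, against the `verify` condition of its step.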
Complete real SKILL example: code review
A complete, production-ready code review SKILL (more than 25 lines):
---
name: code-review-pr
description: |
  Trigger when user provides code diff or PR content requesting review.
  Input: unified diff text.
  Output: JSON review report with issue list, severity levels, and fix suggestions.
  Prohibited: directly modifying code; copying full source files in output.
version: "2.0.0"
triggers:
  - "review.*PR|review.*pull.?request"
  - "code.?review"
  - input_contains: ["diff", "@@"]
constraints:
  - Do not directly modify user-submitted code
  - "Each issue must contain five fields: file, line, severity, message, suggestion"
  - "severity allows only three values: ERROR / WARNING / INFO"
  - Output must be valid JSON; do not wrap in a markdown code fence
steps:
  - id: parse-diff
    action: Parse unified diff; extract changed file list and changed line ranges per file
    output: "list[{file: str, hunks: list[{start, lines}]}]"
  - id: check-style
    action: |
      For each changed file, check:
      1. Function > 50 lines → WARNING: LONG_FUNCTION
      2. Line length > 120 chars → INFO: LONG_LINE
      3. Missing docstring (public function) → WARNING: MISSING_DOCSTRING
    output: "list[Issue]"
  - id: check-security
    action: |
      Scan changed lines with these patterns:
      - SQL concatenation: r"[\"'].*(SELECT|INSERT|UPDATE).*[\"']\s*\+" → ERROR
      - eval/exec call: r"\beval\s*\(|\bexec\s*\(" → ERROR
      - Hardcoded secret: r"(api_key|secret|password)\s*=\s*[\"'][^\"']{8,}" → ERROR
    output: "list[Issue]"
  - id: check-tests
    action: >-
      Check whether changed functions have a corresponding test file;
      if src/foo.py changed but tests/test_foo.py is unchanged,
      output WARNING: MISSING_TEST_UPDATE
    output: "list[Issue]"
  - id: aggregate
    action: Merge all issues; sort descending by severity (ERROR > WARNING > INFO)
    output: "list[Issue] sorted"
  - id: score-and-report
    action: |
      Calculate score: score = max(0, 100 - len(ERROR)*10 - len(WARNING)*3 - len(INFO)*1)
      Output complete JSON report
    output: |
      {
        "issues": [{"file": "...", "line": 42, "severity": "ERROR",
                    "message": "...", "suggestion": "..."}],
        "summary": "Found 2 ERRORs, 3 WARNINGs",
        "score": 71
      }
examples:
  - input: |
      --- a/src/user.py
      +++ b/src/user.py
      @@ -10,2 +10,3 @@
       def get_user(id):
      +    query = "SELECT * FROM users WHERE id = " + id
           return db.execute(query)
    expected_issues:
      - {file: "src/user.py", line: 11, severity: "ERROR", message: "SQL string concatenation is vulnerable to injection", suggestion: "Use a parameterized query instead of concatenating id"}
---
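The check-security step can be exercised against the example diff to confirm that it reports line 11. This is a rough sketch under stated assumptions: the three patterns are copied from the SKILL above, while `scan_diff`, the message labels, and the way line numbers are tracked (only the new-file start line of each hunk header is used) are illustrative choices, not a prescribed implementation.

```python
import re

# Patterns copied from the check-security step; labels are illustrative.
SECURITY_PATTERNS = [
    (r"[\"'].*(SELECT|INSERT|UPDATE).*[\"']\s*\+", "SQL string concatenation"),
    (r"\beval\s*\(|\bexec\s*\(", "eval/exec call"),
    (r"(api_key|secret|password)\s*=\s*[\"'][^\"']{8,}", "hardcoded secret"),
]
# Captures the start line of the new-file side of a hunk header.
HUNK_HEADER = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def scan_diff(diff: str) -> list[dict]:
    """Report security findings on added lines, numbered as in the new file."""
    issues = []
    current_file = None
    new_line = 0       # line number in the new version of the file
    in_hunk = False
    for raw in diff.splitlines():
        if raw.startswith("+++ "):
            current_file = raw[4:].removeprefix("b/")
            in_hunk = False
            continue
        header = HUNK_HEADER.match(raw)
        if header:
            new_line = int(header.group(1))
            in_hunk = True
            continue
        if not in_hunk:
            continue   # file headers and anything before the first hunk
        if raw.startswith("+"):
            for pattern, label in SECURITY_PATTERNS:
                if re.search(pattern, raw[1:]):
                    issues.append({"file": current_file, "line": new_line,
                                   "severity": "ERROR", "message": label})
            new_line += 1
        elif raw.startswith("-") or raw.startswith("\\"):
            pass       # removed lines and "\ No newline" markers add no new-file line
        else:
            new_line += 1   # context line
    return issues


example = "\n".join([
    "--- a/src/user.py",
    "+++ b/src/user.py",
    "@@ -10,2 +10,3 @@",
    " def get_user(id):",
    '+    query = "SELECT * FROM users WHERE id = " + id',
    "     return db.execute(query)",
])
print(scan_diff(example))
```

Regex scanning of this kind is deliberately shallow: it will miss concatenation spread over several lines and may flag harmless strings, which is one reason each finding still carries a human-readable message and suggestion.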
[ Align name / description with trigger phrasing ]
|
v
[ Write front matter with all required fields ]
|
v
[ Atomic steps (each with action/output/verify) ]
|
v
[ constraints using negative phrasing (Prohibited: X) ]
|
v
[ examples with input/expected that are verifiable ]
Front matter draft lab
The draft below is a sample to check against the front matter rules on this page. Any automated checklist applied to it is heuristic (keywords and fences); it does not replace platform schema validation.
---
name: skill-authoring-guide
description: "Write a spec-compliant SKILL.md file; input: skill requirements description; output: complete SKILL.md; prohibit: writing vague trigger conditions or non-atomic steps"
version: "1.0.0"
triggers:
- "write.*skill|create.*skill|author.*skill"
- "how to write SKILL.md"
steps:
  - "Extract from requirements: trigger scenario, input type, expected output, prohibited behaviors"
  - "Write name (kebab-case) and description (one sentence each: trigger/input/output/prohibited)"
  - "Write triggers (each regex covers one phrasing variant; at least 2)"
  - "Break down steps: each step is one atomic action; specify action/input/output"
  - 'Write constraints: all in negative form ("Prohibited: X", not "Try not to X")'
  - "Write at least 1 example (with input and expected_output)"
  - "Write output_format (if type: json, include schema description)"
  - "Check if triggers overlap with existing SKILLs (refine to be more specific if they do)"
  - "Check that each step is independently verifiable (split further if not)"
  - 'Set version: "1.0.0"; increment major for breaking changes'
  - "Validate front matter format with yamllint in CI"
  - "When committing, also update any upstream rule files that reference this SKILL"
constraints:
- Do NOT write "try to", "may consider", "recommend" or other vague phrasing
- Do NOT combine multiple actions into one step
- Do NOT use real secrets or production data in examples
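The rule against vague phrasing lends itself to a simple lint pass over the constraints list. The sketch below assumes a plain substring search for a small phrase list taken from the constraint above; `vague_constraints` is a hypothetical helper, and the phrase list would need extending for real use.

```python
import re

# Hedging phrases named in the constraints above, plus "try not to".
VAGUE_PHRASES = ["try to", "try not to", "may consider", "recommend"]
VAGUE_PATTERN = re.compile(
    "|".join(re.escape(phrase) for phrase in VAGUE_PHRASES), re.IGNORECASE
)


def vague_constraints(constraints: list[str]) -> list[str]:
    """Return the constraints that contain any hedging phrase."""
    return [text for text in constraints if VAGUE_PATTERN.search(text)]


sample = [
    "Prohibited: modifying user code",
    "Try not to copy full files into the output",
    "We recommend keeping steps short",
]
print(vague_constraints(sample))
```

A constraint that quotes the banned phrases in order to forbid them would also be flagged, so results from a check like this are best treated as prompts for review rather than hard failures.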