Pair programming guide

Define roles when humans pair with agents: who holds the keyboard, who checks invariants and tests; short goals and explicit handoffs reduce context loss.

A SKILL can structure sessions: align task boundaries and acceptance first; the driver implements small slices while the navigator (human or agent) watches naming, edge cases, and consistency with existing patterns. Below: a driver/navigator loop diagram.

Roles

The driver owns the edit focus: types code, runs commands, triggers tests—keeping steps small and pausing at verifiable checkpoints. The navigator does not grab the keyboard (until rotation) and continuously checks team norms: readability, error handling, coupling with neighboring modules.

Rule: one steering wheel at a time. If the driver tunnels, the navigator interrupts to shrink the goal—not to dump a wall of PR feedback later.

Driver (holds the keyboard):
  ✓ Open editor, type concrete code
  ✓ Run tests: npm test / pytest / go test ./...
  ✓ Execute git add / commit (after navigator confirms)
  ✓ Keep each step ≤ 10 lines; each step passes or fails clearly
  ✗ No architecture decisions; no skipping test verification

Navigator (does not hold the keyboard):
  ✓ Check aloud that naming matches repo conventions
  ✓ Flag missing error-handling branches and edge cases
  ✓ Verify coupling direction against existing modules
  ✓ Record disagreements (see decision rules below)
  ✓ Hand off in one sentence before rotation: "Next: X in file Y; risk is Z"
  ✗ Do not rewrite code in parallel while the driver is typing

Rotation every 25 minutes (Pomodoro variant):
  [0–25 min] Driver implements → Navigator navigates
  [25–30 min] 5-min switch: hand off context + commit current increment
  [30–32 min] 2-min retro: what was achieved this round, goal for next round
  [32–57 min] New driver takes over (former navigator becomes driver)
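
The timeline above can be generated rather than tracked by hand. The following is a minimal sketch, assuming two participants labelled "A" and "B" and the 25/5/2-minute durations from the schedule; `Phase` and `buildRotation` are hypothetical names, not part of any tool.

```typescript
// Hypothetical helper: lays out pairing rounds as labelled time windows.
interface Phase {
  label: string;
  startMin: number;
  endMin: number;
}

// Defaults mirror the schedule above: 25-min work block, 5-min switch, 2-min retro.
function buildRotation(
  rounds: number,
  workMin = 25,
  switchMin = 5,
  retroMin = 2,
): Phase[] {
  const phases: Phase[] = [];
  let clock = 0;
  for (let round = 0; round < rounds; round++) {
    // Roles alternate each round; "A" drives the even-numbered rounds.
    const driver = round % 2 === 0 ? "A" : "B";
    phases.push({ label: `${driver} drives`, startMin: clock, endMin: clock + workMin });
    clock += workMin;
    // Switch and retro only happen between rounds, not after the last one.
    if (round < rounds - 1) {
      phases.push({ label: "switch: hand off + commit", startMin: clock, endMin: clock + switchMin });
      clock += switchMin;
      phases.push({ label: "retro", startMin: clock, endMin: clock + retroMin });
      clock += retroMin;
    }
  }
  return phases;
}
```

Calling `buildRotation(2)` yields four phases whose boundaries match the schedule: 0–25, 25–30, 30–32, and 32–57 minutes.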

Rotation cadence

After each runnable increment (green tests, or a single failing test tied to this goal), consider swapping roles to reduce fatigue and bias. Hotfixes can compress to short “you type, I review” rounds, but still verbalize the change surface and rollback points.

  [ Align goal + acceptance criteria ]
                    │
                    ▼
            ┌───────────────┐
            │    DRIVER     │ → Types, runs tests, small commits
            └───────┬───────┘
                    │
                    ▼
            ┌───────────────┐
            │   NAVIGATOR   │ → Naming / edges / consistency with patterns
            └───────┬───────┘
                    │
                    ▼
            [ Runnable increment done? ]
                    │
         ┌──────────┴──────────┐
         │ no                  │ yes
         ▼                    ▼
  [ Narrow scope or add info ]   [ Rotate roles or log decisions ]

Hand off in one sentence: “Next, in file X we implement Y; risk Z.”
Long sessions benefit from timeboxing so both people don’t dive into the weeds.

Agent as navigator

Agents work well as navigators when they ask for stack traces, file excerpts, or test output instead of rewriting blindly. Human drivers should voice assumptions (“I think this hits cache first”) so the model can correct course rather than hallucinate agreement.

An example session with a human driver and an agent navigator:

Driver: "I'm adding coupon validation to the checkout API; writing validateCoupon now"

Agent (navigator): "Before you call coupon.findOne():
  1. What do you return when findOne returns null?
  2. Have you considered a user applying the same coupon twice on the same order?"

Driver: "First: return { valid: false, reason: 'NOT_FOUND' }.
        Second: skip for now."

Agent: "OK to skip, but add a TODO so it's trackable:
  // TODO(JIRA-456): prevent reuse—same user/coupon can only apply once
  Future PRs can find it."

Driver: "Done. Writing tests now—"

Agent: "Write the failing test first, then implement:
  it('should reject invalid coupon code', async () => {
    const result = await validateCoupon('INVALID', orderId);
    expect(result.valid).toBe(false);
    expect(result.reason).toBe('NOT_FOUND');
  });
  Run it red, then write the implementation. TDD order."

Driver: "Got it—test is red. Writing implementation."
(25 minutes later)
Driver: "Implementation done, tests green, ready to commit"

Agent: "Commit message: feat(checkout): add coupon validation with NOT_FOUND handling
        Then swap roles—you're navigator next round."
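
For illustration, the implementation the driver arrives at in this session might look roughly like the sketch below. The session does not show the data layer, so `CouponStore`, `inMemoryStore`, `toResult`, and the extra `store` parameter are assumptions made to keep the example self-contained; only the `{ valid: false, reason: 'NOT_FOUND' }` result and the TODO come from the dialogue.

```typescript
// Result shape taken from the session; only the NOT_FOUND path is modelled here.
type CouponResult =
  | { valid: true }
  | { valid: false; reason: "NOT_FOUND" };

interface Coupon {
  code: string;
}

// Hypothetical stand-in for the data layer behind coupon.findOne().
interface CouponStore {
  findOne(code: string): Promise<Coupon | null>;
}

// Pure mapping from lookup result to API result, easy to unit test.
function toResult(coupon: Coupon | null): CouponResult {
  if (coupon === null) {
    return { valid: false, reason: "NOT_FOUND" };
  }
  // TODO(JIRA-456): prevent reuse, same user/coupon can only apply once
  return { valid: true };
}

// _orderId is unused until the reuse check from the TODO is implemented.
async function validateCoupon(
  code: string,
  _orderId: string,
  store: CouponStore,
): Promise<CouponResult> {
  const coupon = await store.findOne(code);
  return toResult(coupon);
}

// Assumed in-memory store so the sketch runs without a database.
function inMemoryStore(codes: string[]): CouponStore {
  return {
    async findOne(code: string): Promise<Coupon | null> {
      return codes.indexOf(code) !== -1 ? { code } : null;
    },
  };
}
```

Splitting the null check into `toResult` keeps the navigator's first question ("what do you return when findOne returns null?") answerable by a test that needs no database.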

Constrain agents: output should arrive in slices that can be applied one at a time (small patches or ordered steps), not as a giant unreviewable diff; the agent should ask for missing paths or symbols before writing.

When the agent temporarily drives (e.g. scaffolding), the human should navigate and explicitly verify compile, types, and directory layout at each step.

Disagreement decision rules

When a disagreement arises:
1. Navigator raises the objection; driver pauses; each states their case in one sentence
2. Classify the disagreement:
   a. Technical fact (e.g. "this will leak memory") → write a test; let data decide
   b. Style / naming → check existing repo conventions; if none, driver decides and logs a TODO
   c. Architecture / design → out of scope for this round; log as an ADR draft;
      use the simplest implementation for now
3. If the disagreement is still open after step 2, the driver has final say for this round
4. Record format (in git commit body or PR description):
   [DECISION] Chose option A over B; reason: XXX
   [DEFERRED] Option C deferred pending confirmation in JIRA-789
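
Because the record format is line-based, the tagged lines can be pulled back out of a commit body when writing the PR description. The following is a minimal sketch under that assumption; `DecisionRecord` and `extractRecords` are hypothetical names, and the optional leading "- " is allowed only so bulleted bodies also match.

```typescript
interface DecisionRecord {
  kind: "DECISION" | "DEFERRED";
  text: string;
}

// Collects [DECISION] / [DEFERRED] lines from a commit body or PR description.
function extractRecords(body: string): DecisionRecord[] {
  const records: DecisionRecord[] = [];
  // Split on LF or CRLF so bodies written on Windows parse the same way.
  for (const line of body.split(/\r?\n/)) {
    // Optional indentation and bullet, then the tag, then the free-text reason.
    const match = /^\s*(?:-\s+)?\[(DECISION|DEFERRED)\]\s+(.+)$/.exec(line);
    if (match) {
      records.push({
        kind: match[1] as DecisionRecord["kind"],
        text: match[2].trim(),
      });
    }
  }
  return records;
}
```

Lines without a tag are ignored, so ordinary bullet points in the body pass through untouched.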

Incremental git commit format

# Commit after each Pomodoro; format:
# <type>(<scope>): <short description> (under 50 chars)
#
# [optional body: what was implemented / decision record]

# Example:
feat(auth): add JWT token refresh endpoint

- Implemented POST /api/auth/refresh
- Expired token returns 401, invalid signature returns 403
- [DECISION] Used RS256 instead of HS256 to support multi-service verification

test(auth): add unit tests for token refresh

- Covers normal refresh, expired token, and invalid signature branches
- Uses jest.useFakeTimers to simulate token expiry

# Formats to avoid:
# git commit -m "fix stuff"        ← not traceable
# git commit -m "WIP"              ← cannot revert to a meaningful state
# git commit -m "updated many files"  ← no semantic meaning
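
The subject-line format lends itself to a mechanical check before committing. The sketch below is one possible reading of the rule, assuming the 50-character limit applies to the short description rather than the whole subject line, and that types are lowercase words; `SUBJECT_PATTERN` and `checkSubject` are hypothetical names.

```typescript
// <type>(<scope>): <short description>
const SUBJECT_PATTERN = /^([a-z]+)\(([\w-]+)\): (.+)$/;

// Returns a list of problems; an empty list means the subject line passes.
function checkSubject(subject: string): string[] {
  const match = SUBJECT_PATTERN.exec(subject);
  if (!match) {
    return ["expected <type>(<scope>): <short description>"];
  }
  const description = match[3];
  // Assumption: the length limit covers only the description part.
  return description.length < 50
    ? []
    : ["description should be under 50 characters"];
}
```

Under these assumptions the two example subjects pass, while "fix stuff" and "WIP" are rejected for missing the type and scope.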

PR and review hygiene

Pairing should help async reviewers: PR descriptions capture pairing outcomes—options rejected, rationale, open follow-ups. Link ADRs or issues for deep refactors so reviewers don’t re-debate decisions already made at the keyboard.

Run baseline tests and type checks before pairing on large changes, so the PR shows only the delta.
Track follow-ups with checklists or TODOs (with owners)—not only chat history.

Reviewer-friendly: if pairing produced clean commits, keep them single-purpose so reviewers can read in commit order instead of one opaque squash.

---
name: pair-programming
description: Human–agent pair roles and rotation cadence
---
# Session start (run before each pairing session)
1. State the round goal in one sentence: "This round: implement POST /api/coupon/validate, cover NOT_FOUND and EXPIRED paths"
2. State acceptance criteria: "npm test green, types pass, ≥ 2 test cases"
3. Assign roles: Driver = human, Navigator = Agent

# Driver rules (each step)
4. Write test (RED first) → implement (GREEN) → refactor (REFACTOR)
5. ≤ 10 lines per step; run tests after each step
6. When stuck: narrate current state aloud; let Navigator intervene

# Navigator checklist (after each step)
7. Does naming match repo conventions? (check patterns in src/utils/)
8. Any missing error branches? (network timeout / DB exception / concurrency)
9. New external dependency introduced? (needs deps review)
10. Do tests assert observable behavior (not implementation details)?

# 25-min Pomodoro rotation steps
11. [25 min] Driver commits current increment: git add -p && git commit -m "feat: ..."
12. [+5 min] Hand-off narration: "I implemented X; known issue is Y; next step is Z"
13. [+2 min] Retro: what did this round achieve; where does next round start
14. Swap roles: former driver becomes navigator; former navigator becomes driver

# Disagreement decisions
15. Technical fact → write test; let data decide
16. Style disagreement → check repo conventions; driver decides if none
17. Architecture disagreement → use simplest implementation; log an ADR draft and record it as [DEFERRED]

# Session end
18. Push branch; check CI status
19. Write "pairing outcomes" in PR description: options rejected, open TODOs
