Bug reproduction & minimal cases

From vague symptoms to stable repro: capture environment, versions, inputs, and timing; bisect away unrelated code or data; end with a failing test or script you can submit.

A SKILL should ask first: is the bug 100% reproducible, which versions are affected, and does it correlate with load or data size? It should then guide the user to pin random seeds, disable caches, or record request sequences.
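
As one illustration of pinning a seed and a clock, the sketch below passes a seeded generator and a frozen clock into the code under test instead of letting it call `Math.random()` and `Date.now()` directly. `mulberry32` is a small, widely circulated PRNG; `Clock`, `fixedClock`, and `makeOrderId` are hypothetical names invented for this example.

```typescript
// Seeded PRNG (mulberry32): the same seed always yields the same sequence.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Injectable clock so a test can freeze time.
type Clock = () => Date;
const fixedClock = (iso: string): Clock => () => new Date(iso);

// Hypothetical code under test: its sources of nondeterminism are parameters.
function makeOrderId(random: () => number, now: Clock): string {
  const day = now().toISOString().slice(0, 10);
  const suffix = Math.floor(random() * 1_000_000).toString().padStart(6, "0");
  return `${day}-${suffix}`;
}

const firstRun = makeOrderId(mulberry32(42), fixedClock("2024-03-01T00:00:00Z"));
const secondRun = makeOrderId(mulberry32(42), fixedClock("2024-03-01T00:00:00Z"));
console.log(firstRun === secondRun); // true: same seed and clock, same ID
```

Once the seed and timestamp that trigger the bug are known, they can be recorded in the report and replayed exactly.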

Deliverables should be machine-runnable minimal slices: unit test, one-line curl, or docker-compose snippet, with expected vs actual. Fix PRs should include the case to prevent regression.

Reproduction flow (skill-flow-block)

  [ Collect: symptom, repro rate, env fingerprint (OS / runtime / lockfiles) ]
        │
        ▼
  ┌──────────────────┐     Pin: seed, clock, cache, concurrency, data snapshot
  │ Stable path      │──── Record: request sequence, config diff, feature flags
  └──────────────────┘
        │
        ▼
  ┌──────────────────┐     Remove modules/lines/data: mark still repros / gone
  │ Bisect scope     │──── Code: comment branches, stub downstream; data: halve sets
  └──────────────────┘
        │
        ▼
  ┌──────────────────┐     Unit assert, script, compose snippet; expected vs actual
  │ Minimal artefact │──── PR: failing test first; must stay green after fix
  └──────────────────┘
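
The "Collect" step in the flow above can be partly automated. The following is a minimal sketch, assuming a Node.js project with a `package-lock.json`; the `EnvFingerprint` shape and the function names are illustrative, not a fixed format.

```typescript
import { createHash } from "node:crypto";
import { existsSync, readFileSync } from "node:fs";
import * as os from "node:os";

interface EnvFingerprint {
  os: string;
  arch: string;
  runtime: string;
  lockfileSha256: string | null; // null when no lockfile is found
}

// Hash the lockfile so two reports can be compared without pasting it whole.
function hashFile(path: string): string | null {
  if (!existsSync(path)) return null;
  return createHash("sha256").update(readFileSync(path)).digest("hex");
}

function collectFingerprint(lockfilePath = "package-lock.json"): EnvFingerprint {
  return {
    os: `${os.platform()} ${os.release()}`,
    arch: os.arch(),
    runtime: `node ${process.version}`,
    lockfileSha256: hashFile(lockfilePath),
  };
}

console.log(JSON.stringify(collectFingerprint(), null, 2));
```

Two reports with different lockfile hashes point to a dependency difference before anyone reads the code.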

Bug report standard format (ready to submit to an issue tracker):

## Bug report

**Title**: Checkout page price formatting shows NaN in dark mode

### Environment
- OS: macOS 14.3 / Windows 11 22H2
- Browser: Chrome 122.0.6261.112
- App version: v3.2.1 (commit: abc1234)
- Node.js: 20.11.0
- Repro rate: 100% (in the environment above)

### Steps to reproduce
1. Visit http://localhost:3000 and confirm dark mode is on (system settings or UI toggle)
2. Add any item to the cart
3. Click "Checkout" and proceed to the checkout page
4. Observe the order total amount displayed

### Expected behavior
Order total shows the correct amount, e.g. $299.00

### Actual behavior
Order total shows "NaN". Console error:
```
TypeError: Cannot read properties of undefined (reading 'toLocaleString')
    at formatPrice (checkout.ts:142:23)
    at CheckoutSummary.render (CheckoutSummary.tsx:89:12)
```

### Minimal reproduction
```bash
git clone https://github.com/org/repo && cd repo
git checkout v3.2.1
npm ci && npm run dev
# Open http://localhost:3000, enable dark mode, visit /checkout
```

### Already tried
- [x] Cannot reproduce in light mode (renders correctly)
- [x] Clearing browser cache does not help
- [x] Cannot reproduce on v3.2.0 (regression introduced in v3.2.1)

### Related issues / PRs
Possibly related to #1089 (dark mode currency formatting)

If the bug cannot be reproduced, list the hypotheses already tried and what to observe next; use synthetic data when privacy matters. Move on to stack trace analysis once the repro is stable.

MRE build steps

Shrink the whole app to the smallest surface that still hits the same root cause: prefer a failing test (npm test / pytest single case), else a script or minimal HTTP request; avoid steps that exist only as a verbal description (the “works on my machine” trap).

Descend from integration/E2E to unit when possible; for multi-service use docker-compose or contract doubles to shrink externals.
Each step is executable and numbered: copy-paste reaches the bad state; document fixtures and seed data.
Expected vs actual: one line for “should be”, then actual output, log snippet, or screenshot placeholder.
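
When no test runner is available, a standalone script can still report expected vs actual in one line per case. This is a minimal sketch: `formatTotal` is a deliberately buggy stand-in for the code under test, and the expected value `$0.00` for a missing amount is an assumption made for the example.

```typescript
// Deliberately buggy stand-in for the code under test (hypothetical).
function formatTotal(amount: number | undefined): string {
  return `$${Number(amount).toFixed(2)}`; // Number(undefined) is NaN
}

interface ReproResult {
  name: string;
  expected: string;
  actual: string;
  ok: boolean;
}

function check(name: string, expected: string, actual: string): ReproResult {
  return { name, expected, actual, ok: expected === actual };
}

// 0 when every case passes, 1 otherwise, so CI can treat the script as a test.
function exitCodeFor(all: ReproResult[]): number {
  return all.every((r) => r.ok) ? 0 : 1;
}

const results: ReproResult[] = [
  check("valid amount", "$299.00", formatTotal(299)),
  check("missing amount", "$0.00", formatTotal(undefined)),
];

for (const r of results) {
  const status = r.ok ? "PASS" : "FAIL";
  console.log(`${status} ${r.name}: expected ${r.expected}, actual ${r.actual}`);
}

// In a real script, finish with: process.exit(exitCodeFor(results));
```

The second case prints a FAIL line showing `$NaN`, which is the kind of output that can be pasted directly under "Actual behavior".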

MRE build checklist (progressively narrow scope):

# MRE build checklist

## Phase 1: stabilize reproduction
# □ Record repro rate (every time / intermittent / specific conditions)
# □ Fix environment variables (random seed, timezone, locale)
export TZ=UTC
export LANG=en_US.UTF-8
node --random-seed=42 test.js   # V8 flag: makes Math.random() deterministic

# □ Disable caches (HTTP cache, app cache)
curl -H "Cache-Control: no-cache" http://localhost:3000/api/checkout

# □ Record minimal trigger conditions (which params reproduce, which do not)

## Phase 2: narrow scope (bisection)
# □ Descend from integration tests to unit tests
# Checking formatPrice behavior in dark mode:
# Original issue is in CheckoutSummary component
# → Extract formatPrice logic as an isolated test

// test/formatPrice.test.ts
import { describe, it, expect } from 'vitest'
import { formatPrice } from '../src/utils/currency'

describe('formatPrice', () => {
  it('formats valid price', () => {
    expect(formatPrice(299, 'USD')).toBe('$299.00')  // PASS
  })

  it('handles undefined currency in dark mode context', () => {
    // Reproduce bug: currency is undefined in dark mode context
    const darkModeContext = { theme: 'dark', currency: undefined }
    expect(formatPrice(299, darkModeContext.currency)).toBe('$299.00')  // FAIL: NaN
  })
})
// Run: npx vitest run test/formatPrice.test.ts

## Phase 3: anonymize production data (for test fixtures)
# Export anonymized sample from production DB (no real user data)
psql "$PROD_DB" --csv -c "
  SELECT
    id,
    ROUND(total_amount, 2) as amount,   -- preserve amount structure
    currency_code,
    'test-user-' || floor(random()*1000) as user_id,  -- anonymize
    created_at::date as date            -- date only
  FROM orders
  WHERE created_at > NOW() - INTERVAL '7 days'
  LIMIT 100;
" > test/fixtures/orders-sample.csv

Docker Compose multi-service repro environment:

# docker-compose.repro.yml
# Use to reproduce the bug in an isolated environment
# Usage: docker compose -f docker-compose.repro.yml up
version: "3.9"

services:
  # Pin the version where the bug was introduced
  app:
    image: org/payment-app:v3.2.1     # pinned version (not latest)
    environment:
      DATABASE_URL: postgresql://postgres:postgres@db:5432/app_test
      THEME: dark                       # key config that triggers the bug
      NODE_ENV: production
    ports:
      - "3001:3000"
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_DB: app_test
      POSTGRES_PASSWORD: postgres
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      timeout: 5s
      retries: 5
    volumes:
      - ./test/fixtures/seed.sql:/docker-entrypoint-initdb.d/seed.sql

# Usage:
# 1. docker compose -f docker-compose.repro.yml up -d
# 2. curl http://localhost:3001/api/checkout/summary
# 3. Check whether the response contains NaN
# 4. docker compose -f docker-compose.repro.yml down
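
Steps 2 and 3 of the usage above can be scripted so the check is repeatable. The sketch below assumes the endpoint returns JSON and that a broken total shows up either as the string "NaN" or as a numeric NaN after parsing; the real response shape is not specified here, so treat both as assumptions.

```typescript
// Recursively collect the paths of NaN-like values in a parsed response body.
function findNaNPaths(value: unknown, path = "$"): string[] {
  if (typeof value === "number" && Number.isNaN(value)) return [path];
  if (typeof value === "string" && value.includes("NaN")) return [path];
  if (Array.isArray(value)) {
    return value.flatMap((item, i) => findNaNPaths(item, `${path}[${i}]`));
  }
  if (value !== null && typeof value === "object") {
    return Object.entries(value as Record<string, unknown>).flatMap(
      ([key, item]) => findNaNPaths(item, `${path}.${key}`),
    );
  }
  return [];
}

// Fetch the summary and report where NaN appears (global fetch: Node 18+).
async function checkSummary(url: string): Promise<string[]> {
  const res = await fetch(url);
  return findNaNPaths(await res.json());
}

// Usage, with the compose stack running:
// checkSummary("http://localhost:3001/api/checkout/summary").then(console.log);
```

An empty array means the bug did not reproduce on that run; a non-empty one names the exact fields to quote in the report.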

Bisection and git bisect

Input and data bisection: for many files or parameters, remove half each time and check if it still repros to find a minimal necessary set. For time-related bugs, bisect time windows or request batches.
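
The halving described above can be sketched as a greedy reduction loop. This is an illustrative helper, not a library API: `minimizeInputs` and the `stillFails` callback are hypothetical names, and `stillFails` is assumed to be deterministic (which is why the repro must be stabilized first).

```typescript
// Greedy input reduction: try dropping chunks of the input; when no chunk of
// the current size can be dropped, halve the chunk size and try again.
function minimizeInputs<T>(inputs: T[], stillFails: (subset: T[]) => boolean): T[] {
  if (!stillFails(inputs)) {
    throw new Error("the full input set does not reproduce the bug");
  }
  let current = inputs.slice();
  let chunk = Math.ceil(current.length / 2);
  while (chunk >= 1) {
    let removedAny = false;
    for (let start = 0; start < current.length; ) {
      const candidate = [...current.slice(0, start), ...current.slice(start + chunk)];
      if (candidate.length > 0 && stillFails(candidate)) {
        current = candidate; // chunk was irrelevant: drop it, retry same offset
        removedAny = true;
      } else {
        start += chunk; // chunk is needed: keep it and move on
      }
    }
    if (!removedAny) chunk = Math.floor(chunk / 2); // 1 -> 0 ends the loop
  }
  return current;
}

// Example: the bug only appears when both c.csv and f.csv are loaded.
const files = ["a.csv", "b.csv", "c.csv", "d.csv", "e.csv", "f.csv", "g.csv", "h.csv"];
const reproduces = (subset: string[]) => subset.includes("c.csv") && subset.includes("f.csv");
const minimal = minimizeInputs(files, reproduces);
console.log(minimal); // [ 'c.csv', 'f.csv' ]
```

The result is minimal in the sense that removing any single remaining item makes the bug disappear; it is not guaranteed to be the smallest possible set.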

# git bisect automation example
# Known: v3.2.0 is good, v3.2.1 has the bug
git bisect start
git bisect bad HEAD          # mark current version as bad
git bisect good v3.2.0       # mark known good version

# Automated bisect script (exit 0 = good, 125 = skip, any other code 1-127 = bad)
cat > /tmp/bisect-test.sh << 'BEOF'
#!/bin/bash
npm ci --silent 2>/dev/null || exit 125  # skip if install/build fails
# Keep the test file untracked so it is still present at every checked-out commit
npx vitest run test/formatPrice.test.ts
BEOF
chmod +x /tmp/bisect-test.sh

# Run automatic bisect (roughly log2(commits) rounds)
git bisect run /tmp/bisect-test.sh
# Output: abc1234 is the first bad commit
git bisect reset  # restore working tree

git bisect: between known good and bad commits, binary search for the introducing change; with an automated script (exit 0 good, 125 skip, else bad) it can run unattended. Record each round; avoid bisecting on dirty trees.
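
The exit-code contract above can also be expressed in a Node script, which helps when the check involves more than one command. This is a sketch under assumptions: the `npm ci` and `npx vitest run` commands mirror the examples in this document, and `bisectExitCode`, `run`, and `main` are hypothetical names.

```typescript
import { spawnSync } from "node:child_process";

type StepResult = { status: number | null };

// Map build and test outcomes to the codes `git bisect run` understands:
// 0 = good, 125 = skip this commit, any other code from 1 to 127 = bad.
function bisectExitCode(build: StepResult, test: StepResult | null): number {
  if (build.status !== 0 || test === null) return 125; // cannot build: untestable
  return test.status === 0 ? 0 : 1;
}

function run(cmd: string, args: string[]): StepResult {
  return spawnSync(cmd, args, { stdio: "inherit" });
}

function main(): number {
  const build = run("npm", ["ci", "--silent"]);
  const test =
    build.status === 0
      ? run("npx", ["vitest", "run", "test/formatPrice.test.ts"])
      : null;
  return bisectExitCode(build, test);
}

// In the script handed to `git bisect run`, finish with: process.exit(main());
```

Keeping the mapping in a pure function makes it easy to verify the skip and bad cases without running a real bisect.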

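Ensure every commit in the range can build and run the test; where one cannot, have the script exit 125 so bisect skips it instead of stalling on compile failures.
Merge commits: start with git bisect start --first-parent (Git 2.29 or later) to follow only the mainline and treat each merge as a single step; after cherry-picked hotfixes, re-mark the good/bad boundaries.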

---
name: bug-reproduction
description: From report to minimal reproducible case
model: claude-sonnet-4-5
---

# Required info (ask if missing)
required_info:
  - repro rate (100% / intermittent / specific conditions)
  - affected version range (first bad version)
  - OS / runtime / dependency versions (lockfile)
  - whether it correlates with load / data size / concurrency

# MRE build steps
mre_steps:
  1. Stabilize reproduction (pin random seed, clock, caches)
  2. Descend from integration test to unit test
  3. Bisect scope (code / data / config)
  4. Redact production data for test fixtures
  5. Docker Compose isolation for multi-service repro

# Deliverable standards
deliverables:
  - failing test (npm test / pytest single case runs red)
  - or: minimal curl command / docker-compose snippet
  - include expected vs actual output
  - fix PR must include the case (prevent regression)

# Privacy data handling
data_handling:
  - production data: anonymize (randomize user_id, keep structure)
  - PII fields: replace with synthetic data
  - forbidden: paste real user data in public issues
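
For the "randomize user_id, keep structure" rule, a purely random replacement gives the same user a different pseudonym on every row, which can hide bugs that depend on per-user grouping. One alternative, sketched below with hypothetical field names (`OrderRow`, `userId`, `createdAt`), is a keyed hash that keeps the mapping stable within a fixture set.

```typescript
import { createHmac } from "node:crypto";

interface OrderRow {
  id: number;
  amount: number;
  currencyCode: string;
  userId: string;
  createdAt: string; // ISO 8601 timestamp
}

// Keyed hash: the same real ID always maps to the same pseudonym, and the
// mapping cannot be recomputed without the secret.
function pseudonymize(userId: string, secret: string): string {
  const digest = createHmac("sha256", secret).update(userId).digest("hex");
  return `test-user-${digest.slice(0, 12)}`;
}

function anonymizeOrder(row: OrderRow, secret: string): OrderRow {
  return {
    ...row,
    userId: pseudonymize(row.userId, secret),
    createdAt: row.createdAt.slice(0, 10), // keep the date, drop the time
  };
}

// Placeholder secret for the example; keep the real one out of the repository.
const secret = "example-secret";
const sample: OrderRow[] = [
  { id: 1, amount: 299.0, currencyCode: "USD", userId: "u-1001", createdAt: "2024-03-01T09:15:00Z" },
  { id: 2, amount: 15.5, currencyCode: "EUR", userId: "u-2002", createdAt: "2024-03-01T10:40:00Z" },
  { id: 3, amount: 42.0, currencyCode: "USD", userId: "u-1001", createdAt: "2024-03-02T08:05:00Z" },
];
const anonymized = sample.map((row) => anonymizeOrder(row, secret));
console.log(anonymized);
```

Rows 1 and 3 keep a shared pseudonym, so joins and per-user aggregates behave as they did on the original data. Free-text fields and rare amounts can still identify people, so this covers identifiers only.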
