CI/CD pipeline design
Guide agents to split pipelines into lint, test, build, security scan, artifact promotion, and deploy stages, with fail-fast, cacheable, and retryable steps.
Staged pipeline
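Default gate order: static checks → automated tests → reproducible build → release to target environments. The gates are logically serial, although lint and test can run as parallel jobs provided both pass before build. PR and main pipelines may omit deploy or keep preview-only deploys.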
PR / push
   │
   ▼
┌──────┐     ┌──────┐     ┌───────┐     ┌────────┐
│ lint │ ──► │ test │ ──► │ build │ ──► │ deploy │
└──────┘     └──────┘     └───────┘     └────────┘
   │            │             │              │
   │            │             │              └─ blue-green / canary / GitOps
   │            │             └─ immutable artifacts
   │            └─ parallel jobs OK
   └─ fail fast
Full four-stage GitHub Actions CI pipeline (lint and test run in parallel; build waits for both):
# .github/workflows/ci.yml
name: CI Pipeline
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
# One active run per workflow and ref; superseded PR runs are cancelled, runs on main are not
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}
jobs:
  lint:
    name: Lint
    runs-on: ubuntu-24.04
    timeout-minutes: 10
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci --prefer-offline
      - run: npm run lint
  test:
    name: Test
    runs-on: ubuntu-24.04
    timeout-minutes: 20
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci --prefer-offline
      - run: npm test -- --coverage --ci
      # Upload coverage even when tests fail, so the report can be inspected
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: coverage
          path: coverage/
  build:
    name: Build
    runs-on: ubuntu-24.04
    timeout-minutes: 25
    needs: [lint, test]   # starts only after both parallel jobs succeed
    outputs:
      image-tag: ${{ steps.tag.outputs.tag }}
    steps:
      - uses: actions/checkout@v4
      - id: tag
        run: |
          VERSION=$(node -p "require('./package.json').version")
          SHA=$(git rev-parse --short HEAD)
          echo "tag=${VERSION}-${SHA}" >> "$GITHUB_OUTPUT"
      - name: Build Docker image
        run: |
          docker build \
            --label "git.sha=${{ github.sha }}" \
            -t myapp:${{ steps.tag.outputs.tag }} .
      # Export the image to a file so there is a real artifact to upload
      # (the tag itself already reaches later jobs through the job output above)
      - name: Export image as a tarball
        run: docker save -o myapp-image.tar myapp:${{ steps.tag.outputs.tag }}
      - uses: actions/upload-artifact@v4
        with:
          name: image
          path: myapp-image.tar
          retention-days: 1
  deploy:
    name: Deploy to Staging
    runs-on: ubuntu-24.04
    timeout-minutes: 15
    needs: build
    if: github.ref == 'refs/heads/main'   # skipped on pull requests
    environment: staging
    steps:
      - name: Deploy ${{ needs.build.outputs.image-tag }}
        run: |
          echo "Deploying image tag: ${{ needs.build.outputs.image-tag }}"
          # kubectl set image deployment/myapp myapp=myregistry/myapp:${{ needs.build.outputs.image-tag }}
Environment promotion
In the SKILL, spell out who may promote to which environment and which gates are automatic vs manual, so the same pipeline never reaches production under unclear conditions.
| Environment | Typical trigger | Promotion gates | Data / config |
|---|---|---|---|
| Dev | Branch push, draft PR | Lint, fast-path unit tests | Synthetic data, placeholder secrets |
| Staging | Merge to main, RC tag | Full test suite, security scan, contract tests | Redacted snapshots, prod-shaped config |
| Prod | Release approval, version tag | Manual gate, canary / blue-green, rollback plan | OIDC / short-lived tokens; no long-lived PATs in repo |
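The Prod row above can be sketched as one more job in the same workflow. This is a minimal illustration, not a definitive setup: the job name `deploy-production`, the `v*` tag convention, and the existence of a `production` environment with required reviewers configured in the repository settings are all assumptions, and the workflow would also need to trigger on version tags for the `if` condition to ever be true.

```yaml
# Sketch only: assumes the workflow is also triggered by version tags (on.push.tags: ['v*'])
# and that a "production" environment with required reviewers exists in the repository settings.
deploy-production:
  name: Deploy to Production
  runs-on: ubuntu-24.04
  timeout-minutes: 15
  needs: build
  if: startsWith(github.ref, 'refs/tags/v')   # version tags only
  environment: production                      # job waits here until a required reviewer approves
  permissions:
    id-token: write                            # short-lived OIDC credentials, no stored keys
    contents: read
  steps:
    - name: Promote the already-built image
      run: echo "Promoting image tag ${{ needs.build.outputs.image-tag }}"
```

Keeping the manual gate on the environment rather than in a script step means the approval rule lives in repository settings, where it can be audited separately from the workflow file.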
Operational notes
The SKILL should list required gates versus optional recommendations, the differences between main and PR pipelines (e.g. preview deploys), and the concurrency and cancel-in-progress settings that avoid wasted queue time.
If you need immutable artifact tags plus SBOM/signing, document stage order and where artifacts are stored.
Deployment strategy: blue-green, canary, or GitOps sync (e.g. Argo/Flux); document rollback triggers and human approval gates.
- Cache keys for dependency dirs, Docker layers, Terraform providers.
- Matrix caps on language versions and OS to avoid combinatorial explosion.
- Observability: pipeline traces, step duration, failure notification channels.
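The cache-key and matrix-cap bullets above might look like the following in a test job. Treat it as a sketch under assumptions: the Node versions, the single extra macOS entry, and the `~/.npm` cache path are illustrative choices, not values taken from this guide.

```yaml
# Sketch only: versions, runner labels, and cache path are illustrative.
jobs:
  test:
    runs-on: ${{ matrix.os }}
    timeout-minutes: 20
    strategy:
      fail-fast: true        # cancel remaining matrix jobs once one fails
      max-parallel: 4        # cap concurrent matrix jobs
      matrix:
        os: [ubuntu-24.04]
        node: ['20', '22']
        include:
          # One extra OS/version pair instead of a full OS x version cross product
          - os: macos-14
            node: '22'
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      - uses: actions/cache@v4
        with:
          path: ~/.npm
          # Key on runner OS, Node version, and lock-file hash
          key: ${{ runner.os }}-node${{ matrix.node }}-npm-${{ hashFiles('**/package-lock.json') }}
          # Fall back to the newest cache for the same OS and Node version only
          restore-keys: |
            ${{ runner.os }}-node${{ matrix.node }}-npm-
      - run: npm ci --prefer-offline
      - run: npm test
```

Listing the extra combination under `include` yields three jobs here rather than the four a two-OS cross product would produce, which is the kind of cap the matrix bullet asks for.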
Artifact versioning strategy (semver + Git SHA) with timeout configuration and failure notification example:
# Versioning: semver + git SHA (used inside build job)
- id: version
  run: |
    VERSION=$(node -p "require('./package.json').version")
    SHA=$(git rev-parse --short=8 HEAD)
    BUILD_DATE=$(date -u +%Y%m%d)
    # '+' marks semver build metadata but is not allowed in Docker image tags;
    # replace it (for example with '-') before using this value with docker tag/push.
    TAG="${VERSION}+${BUILD_DATE}.${SHA}"
    echo "tag=${TAG}" >> "$GITHUB_OUTPUT"
    echo "semver=${VERSION}" >> "$GITHUB_OUTPUT"
# Parallel job dependency configuration
jobs:
  security-scan:
    needs: build                    # waits for build before scanning
    timeout-minutes: 15             # prevents hung scan jobs
  deploy-staging:
    needs: [build, security-scan]   # multi-job dependency
# Failure notification: Slack webhook (add as on-failure step)
- name: Notify Slack on failure
  if: failure()
  uses: slackapi/slack-github-action@v1.27.0
  with:
    payload: |
      {
        "text": "CI failed on ${{ github.ref }}",
        "blocks": [{
          "type": "section",
          "text": {
            "type": "mrkdwn",
            "text": "*Pipeline failed*: <${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|View run>"
          }
        }]
      }
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
    SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK
Secrets and credentials
For every secret the pipeline references, the SKILL should record how it is supplied and who rotates it, not just the variable name.
Using OIDC to obtain temporary AWS credentials without long-lived access keys:
# OIDC federation: deploy job with no long-lived AWS Access Key
deploy:
  runs-on: ubuntu-24.04
  needs: build              # required so needs.build.outputs.image-tag resolves
  permissions:
    id-token: write         # must explicitly grant OIDC token
    contents: read
  steps:
    - uses: actions/checkout@v4
    - name: Configure AWS credentials via OIDC
      uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsDeployRole
        role-session-name: github-actions-deploy
        aws-region: us-east-1
      # Trust policy must bind sub: repo:myorg/myrepo:ref:refs/heads/main
    - name: Push image to ECR
      run: |
        aws ecr get-login-password | \
          docker login --username AWS \
          --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
        docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:${{ needs.build.outputs.image-tag }}
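The OIDC job above widens token permissions at the job level. One way to pair that with least privilege is an empty workflow-level default, so each job has to opt in to the scopes it needs. The following is a sketch with illustrative job names and placeholder steps, not a complete pipeline.

```yaml
# Sketch only: job names and steps are illustrative.
name: CI Pipeline
on:
  push:
    branches: [main]
permissions: {}            # workflow default: the token gets no scopes
jobs:
  test:
    runs-on: ubuntu-24.04
    permissions:
      contents: read       # enough for checkout
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test
  deploy:
    runs-on: ubuntu-24.04
    needs: test
    permissions:
      id-token: write      # OIDC token for cloud federation
      contents: read
    steps:
      - uses: actions/checkout@v4
      - run: echo "deploy steps go here"
```

With this layout, a job added later without its own `permissions` block starts with no token scopes, so a missing grant shows up as a failed step instead of silently inheriting broad access.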
Stage wall-clock estimate
Roughly estimate full pipeline wall time from typical minutes per stage. When lint and test run in parallel, they form a single wave whose duration is the longer of the two, and build starts only after that wave finishes; otherwise the stage durations simply add up.
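As a worked example of that estimate, assume hypothetical stage durations of 3 minutes for lint, 8 for test, 6 for build, and 4 for deploy. These numbers are made up for illustration and are not measurements from any real pipeline.

```latex
% Hypothetical stage durations in minutes:
% t_lint = 3, t_test = 8, t_build = 6, t_deploy = 4
\begin{align*}
T_{\text{serial}}   &= t_{\text{lint}} + t_{\text{test}} + t_{\text{build}} + t_{\text{deploy}}
                     = 3 + 8 + 6 + 4 = 21 \text{ min} \\
T_{\text{parallel}} &= \max(t_{\text{lint}},\, t_{\text{test}}) + t_{\text{build}} + t_{\text{deploy}}
                     = 8 + 6 + 4 = 18 \text{ min}
\end{align*}
```

Running lint and test in the same wave saves only the shorter of the two durations, so the gain is bounded by the lint time in this example.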
SKILL snippet
---
name: ci-cd-pipeline-design
description: Design a staged, cacheable CI/CD pipeline with quality gates
tags: [ci-cd, github-actions, devops]
---
# Pipeline Stage Design
- Stage division: lint → test → build → deploy; PR pipelines may prune the deploy stage
- Parallel optimization: lint and test run in parallel; build waits for both (needs: [lint, test])
- Timeout protection: set timeout-minutes on each job to prevent hung runners
- Concurrency control: concurrency.group keyed on branch/PR; main branch must not cancel-in-progress
# Artifacts and Versioning
- Versioning strategy: combine semver (package.json/git tag) with a git SHA short hash
- Immutable artifacts: after push, reference only the digest or the combined tag; never overwrite latest
- Artifact transfer: pass build outputs between jobs via actions/upload-artifact / download-artifact
# Secrets and Credentials
- Prefer OIDC: AWS/GCP/Azure all support id-token: write + federation; no long-lived keys needed
- Least privilege: tighten top-level permissions: {}, expand only at the job level
- Environment isolation: PR pipelines use read-only credentials; production deploy binds environment: production
# Failure Handling and Notifications
- Fail fast: lint/test failures stop immediately; never reach build/deploy
- Failure notifications: if: failure() + Slack/email webhook including run URL and branch info
- Cache keys: lock-file hash + runner OS; use restore-keys for graceful fallback without cross-env mixing
# Gate Checklist
- Required gates: lint, unit tests, security scan (SAST minimum)
- Optional gates: integration tests, performance benchmarks, bundlesize regression, image scanning
- Approval gate: production deploys require required_reviewers set in the environment configuration