CI/CD pipeline design

Guide agents to split pipelines into lint, test, build, security scan, artifact promotion, and deploy stages, with fail-fast, cacheable, and retryable steps.

Staged pipeline

Default serial gate order: static checks → automated tests → reproducible build → release to target environments. PR and main pipelines may differ: PR runs may omit deploy or keep preview-only deploys, while main runs the full sequence.

  PR / push
      │
      ▼
   ┌──────┐     ┌──────┐     ┌───────┐     ┌────────┐
   │ lint │ ──► │ test │ ──► │ build │ ──► │ deploy │
   └──────┘     └──────┘     └───────┘     └────────┘
   lint:   fail fast
   test:   parallel jobs OK
   build:  immutable artifacts
   deploy: blue-green / canary / GitOps

Full four-stage GitHub Actions CI pipeline (lint and test run in parallel; build waits for both):

# .github/workflows/ci.yml
name: CI Pipeline

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}

jobs:
  lint:
    name: Lint
    runs-on: ubuntu-24.04
    timeout-minutes: 10
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci --prefer-offline
      - run: npm run lint

  test:
    name: Test
    runs-on: ubuntu-24.04
    timeout-minutes: 20
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'
      - run: npm ci --prefer-offline
      - run: npm test -- --coverage --ci
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: coverage
          path: coverage/

  build:
    name: Build
    runs-on: ubuntu-24.04
    timeout-minutes: 25
    needs: [lint, test]
    outputs:
      image-tag: ${{ steps.tag.outputs.tag }}
    steps:
      - uses: actions/checkout@v4
      - id: tag
        run: |
          VERSION=$(node -p "require('./package.json').version")
          SHA=$(git rev-parse --short HEAD)
          echo "tag=${VERSION}-${SHA}" >> "$GITHUB_OUTPUT"
      - name: Build Docker image
        run: |
          docker build \
            --label "git.sha=${{ github.sha }}" \
            -t myapp:${{ steps.tag.outputs.tag }} .
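      # Export the image so a later job on a different runner can load it;
      # pushing to a registry at this point is a common alternative.
      - name: Export image
        run: docker save -o image.tar myapp:${{ steps.tag.outputs.tag }}
      - uses: actions/upload-artifact@v4
        with:
          name: image
          path: image.tar
          retention-days: 1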

  deploy:
    name: Deploy to Staging
    runs-on: ubuntu-24.04
    timeout-minutes: 15
    needs: build
    if: github.ref == 'refs/heads/main'
    environment: staging
    steps:
      - name: Deploy ${{ needs.build.outputs.image-tag }}
        run: |
          echo "Deploying image tag: ${{ needs.build.outputs.image-tag }}"
          # kubectl set image deployment/myapp myapp=myregistry/myapp:${{ needs.build.outputs.image-tag }}

Environment promotion

In the SKILL, spell out who may promote to which environment and which gates are automatic vs manual, so the same pipeline never reaches production under unclear conditions.

Environment | Typical trigger               | Promotion gates                                 | Data / config
------------+-------------------------------+-------------------------------------------------+------------------------------------------------------
Dev         | Branch push, draft PR         | Lint, fast-path unit tests                      | Synthetic data, placeholder secrets
Staging     | Merge to main, RC tag         | Full test suite, security scan, contract tests  | Redacted snapshots, prod-shaped config
Prod        | Release approval, version tag | Manual gate, canary / blue-green, rollback plan | OIDC / short-lived tokens; no long-lived PATs in repo
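
The Prod row could be wired up roughly as follows. This is a minimal sketch, not a definitive setup: it assumes a `production` environment with required reviewers configured in the repository settings, a `v*` version-tag convention, and a hypothetical `./scripts/deploy.sh`. None of these names come from the pipeline above.

```yaml
# .github/workflows/release.yml (hypothetical file name)
name: Release

on:
  push:
    tags: ['v*']              # assumed version-tag convention

permissions: {}               # start from nothing; grant per job

concurrency:
  group: production-deploy
  cancel-in-progress: false   # never cancel a production rollout midway

jobs:
  deploy-prod:
    runs-on: ubuntu-24.04
    timeout-minutes: 30
    environment: production   # run pauses here until a required reviewer approves
    permissions:
      contents: read
      id-token: write         # allows OIDC / short-lived cloud credentials
    steps:
      - uses: actions/checkout@v4
      - name: Deploy tagged release
        run: ./scripts/deploy.sh "${GITHUB_REF_NAME}"   # hypothetical script
```

Keeping production in its own tag-triggered workflow means a merge to main can reach Staging automatically, while Prod always needs both a version tag and a human approval.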

Operational notes

The SKILL should list required gates vs optional recommendations; differences between main and PR (e.g. preview deploys); and concurrency plus cancel-in-progress to avoid wasted queue time.
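
One way to express the main-versus-PR difference is a preview job that only runs for pull requests. The fragment below is a sketch that would sit alongside the CI workflow's existing `build` job; the `preview` environment name and `./scripts/deploy-preview.sh` are hypothetical.

```yaml
jobs:
  deploy-preview:
    if: github.event_name == 'pull_request'   # PR runs only; main uses the staging deploy
    needs: build
    runs-on: ubuntu-24.04
    timeout-minutes: 10
    environment: preview                      # hypothetical environment name
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4
      - name: Deploy preview for this PR
        run: ./scripts/deploy-preview.sh "pr-${{ github.event.pull_request.number }}"   # hypothetical script
```

Pull requests from forks do not receive repository secrets by default, so the SKILL should say whether preview deploys are limited to same-repository branches.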

If you need immutable artifact tags plus SBOM/signing, document stage order and where artifacts are stored.
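
As an illustration of one possible stage order (build and push, then SBOM, then sign by digest), the sketch below uses GHCR and keyless cosign signing. The registry path `ghcr.io/myorg/myapp` is a placeholder, and the action versions and inputs shown are assumptions that should be checked against each action's documentation before use.

```yaml
jobs:
  build-and-sign:
    runs-on: ubuntu-24.04
    timeout-minutes: 25
    permissions:
      contents: read
      packages: write     # push the image to GHCR
      id-token: write     # keyless signing via OIDC
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - id: push
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: ghcr.io/myorg/myapp:${{ github.sha }}   # placeholder image name
      # Generate the SBOM from the pushed digest, not from a mutable tag
      - uses: anchore/sbom-action@v0
        with:
          image: ghcr.io/myorg/myapp@${{ steps.push.outputs.digest }}
          format: spdx-json
      - uses: sigstore/cosign-installer@v3
      - name: Sign image by digest
        run: cosign sign --yes "ghcr.io/myorg/myapp@${{ steps.push.outputs.digest }}"
```

Referring to the image by digest in the SBOM and signing steps keeps both attached to exactly the artifact that was pushed, even if a tag is later moved.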

Deployment strategy: blue-green, canary, or GitOps sync (e.g. Argo/Flux); document rollback triggers and human approval gates.
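
A rollback trigger can be as simple as "the rollout did not become healthy within a timeout". The steps below sketch that for a plain Kubernetes rolling update (not blue-green or canary), assuming cluster credentials were configured earlier in the job, that the job declares `needs: build`, and that the deployment and container are both named `myapp`.

```yaml
      - name: Roll out and wait for health
        run: |
          kubectl set image deployment/myapp \
            myapp=myregistry/myapp:${{ needs.build.outputs.image-tag }}
          # Fails the step if the rollout is not complete within 2 minutes
          kubectl rollout status deployment/myapp --timeout=120s

      - name: Roll back on failure
        if: failure()
        run: kubectl rollout undo deployment/myapp
```

The same pattern carries over to canary or GitOps setups: a health check with a deadline, followed by an `if: failure()` step that reverts to the last known-good state.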

Also document:

  • Cache keys for dependency dirs, Docker layers, Terraform providers.
  • Matrix caps on language versions and OS to avoid combinatorial explosion.
  • Observability: pipeline traces, step duration, failure notification channels.
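
The cache-key and matrix-cap points above can be sketched in a single test job. The Node.js versions listed and the `~/.npm` cache path are assumptions for an npm project; adjust them to the versions and package manager actually supported.

```yaml
jobs:
  test:
    runs-on: ${{ matrix.os }}
    timeout-minutes: 20
    strategy:
      fail-fast: true               # cancel remaining matrix jobs once one fails
      max-parallel: 4               # cap concurrent runners
      matrix:
        os: [ubuntu-24.04]          # one OS by default; add others only when needed
        node-version: ['20', '22']  # assumed supported versions
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
      - uses: actions/cache@v4
        with:
          path: ~/.npm
          # Lock-file hash + runner OS: a dependency change produces a new key
          key: npm-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
          restore-keys: |
            npm-${{ runner.os }}-
      - run: npm ci --prefer-offline
      - run: npm test -- --ci
```

The `restore-keys` prefix lets a run fall back to the most recent cache for the same OS when the lock file changes, without ever restoring a cache built on a different OS.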

Artifact versioning strategy (semver + Git SHA) with timeout configuration and failure notification example:

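# Versioning: semver + git SHA (used inside build job)
- id: version
  run: |
    VERSION=$(node -p "require('./package.json').version")
    SHA=$(git rev-parse --short=8 HEAD)
    BUILD_DATE=$(date -u +%Y%m%d)
    # SemVer build metadata uses "+", which is not a valid character in
    # Docker image tags, so also emit a "-"-separated variant for images.
    TAG="${VERSION}+${BUILD_DATE}.${SHA}"
    echo "tag=${TAG}" >> "$GITHUB_OUTPUT"
    echo "image-tag=${VERSION}-${BUILD_DATE}.${SHA}" >> "$GITHUB_OUTPUT"
    echo "semver=${VERSION}" >> "$GITHUB_OUTPUT"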

# Job dependency configuration (fragment: runs-on and steps omitted)
jobs:
  security-scan:
    needs: build          # waits for build before scanning
    timeout-minutes: 15   # prevents hung scan jobs

  deploy-staging:
    needs: [build, security-scan]  # multi-job dependency

# Failure notification: Slack webhook (add as on-failure step)
  - name: Notify Slack on failure
    if: failure()
    uses: slackapi/slack-github-action@v1.27.0
    with:
      payload: |
        {
          "text": "CI failed on ${{ github.ref }}",
          "blocks": [{
            "type": "section",
            "text": {
              "type": "mrkdwn",
              "text": "*Pipeline failed*: <${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|View run>"
            }
          }]
        }
    env:
      SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
      SLACK_WEBHOOK_TYPE: INCOMING_WEBHOOK

Secrets and credentials

Every secret referenced in the pipeline should map in the SKILL to how it is supplied and who rotates it—not only the variable name.

Secrets: Prefer OIDC federation, workload identity, or short-lived tokens; do not commit long-lived PATs, kubeconfigs, or permanent cloud keys to repos or logs. Inject via the platform’s secret store or sealed secrets; PR pipelines use read-only, least-privilege credentials isolated from production.
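
For the PR side of this rule, a least-privilege job might look like the sketch below. The job name `pr-checks` is illustrative; the point is that the workflow grants no token scopes by default and the PR job adds back only read access.

```yaml
permissions: {}          # workflow default: no token scopes at all

jobs:
  pr-checks:
    if: github.event_name == 'pull_request'
    runs-on: ubuntu-24.04
    timeout-minutes: 20
    permissions:
      contents: read     # read-only checkout; no id-token, no packages
    steps:
      - uses: actions/checkout@v4
        with:
          persist-credentials: false   # do not leave the token in the local git config
      - run: npm ci --prefer-offline
      - run: npm test -- --ci
```

Because this job never requests `id-token: write`, it cannot obtain cloud credentials through OIDC even if a step is compromised.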

Using OIDC to obtain temporary AWS credentials without long-lived access keys:

# OIDC federation: deploy job with no long-lived AWS Access Key
deploy:
  needs: build        # required so needs.build.outputs.image-tag resolves below
  runs-on: ubuntu-24.04
  permissions:
    id-token: write   # must explicitly grant OIDC token
    contents: read
  steps:
    - uses: actions/checkout@v4

    - name: Configure AWS credentials via OIDC
      uses: aws-actions/configure-aws-credentials@v4
      with:
        role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsDeployRole
        role-session-name: github-actions-deploy
        aws-region: us-east-1
        # Trust policy must bind sub: repo:myorg/myrepo:ref:refs/heads/main

    # Assumes the image was already built or loaded on this runner and
    # tagged with the full ECR repository name used below.
    - name: Push image to ECR
      run: |
        aws ecr get-login-password | \
          docker login --username AWS \
            --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com
        docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:${{ needs.build.outputs.image-tag }}

Stage wall-clock estimate

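Roughly estimate full pipeline wall time from typical minutes per stage. When lint and test run in parallel, they form one wave that costs the longer of the two before build starts; run serially, they cost their sum. For example, with illustrative values of lint 3, test 8, build 10, and deploy 5 minutes, the parallel estimate is max(3, 8) + 10 + 5 = 23 minutes, versus 3 + 8 + 10 + 5 = 26 minutes serially.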


SKILL snippet

---
name: ci-cd-pipeline-design
description: Design a staged, cacheable CI/CD pipeline with quality gates
tags: [ci-cd, github-actions, devops]
---
# Pipeline Stage Design
- Stage division: lint → test → build → deploy; PR pipelines may prune the deploy stage
- Parallel optimization: lint and test run in parallel; build waits for both (needs: [lint, test])
- Timeout protection: set timeout-minutes on each job to prevent hung runners
- Concurrency control: concurrency.group keyed on branch/PR; main branch must not cancel-in-progress

# Artifacts and Versioning
- Versioning strategy: combine semver (package.json/git tag) with a git SHA short hash
- Immutable artifacts: after push, reference only digest or combined tag—never overwrite latest
- Artifact transfer: pass build outputs between jobs via actions/upload-artifact / download-artifact

# Secrets and Credentials
- Prefer OIDC: AWS/GCP/Azure all support id-token: write + federation; no long-lived keys needed
- Least privilege: tighten top-level permissions: {}, expand only at the job level
- Environment isolation: PR pipelines use read-only credentials; production deploy binds environment: production

# Failure Handling and Notifications
- Fail fast: lint/test failures stop immediately; never reach build/deploy
- Failure notifications: if: failure() + Slack/email webhook including run URL and branch info
- Cache keys: lock-file hash + runner OS; use restore-keys for graceful fallback without cross-env mixing

# Gate Checklist
- Required gates: lint, unit tests, security scan (SAST minimum)
- Optional gates: integration tests, performance benchmarks, bundle-size regression, image scanning
- Approval gate: production deploys need required reviewers configured on the production environment
