Legacy modernization

Replace monoliths or stale stacks without big-bang downtime by using anti-corruption layers, BFFs, and the strangler fig pattern; agent output should name the boundaries, the data sync strategy, and the business prioritization.

The SKILL starts with inventory: revenue- or compliance-critical capabilities vs deferrable work; rank tech debt by risk and frequency. Isolate external contracts with adapters so new services do not reach through legacy schemas.
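
One way to make the inventory step concrete is a small scoring pass. The sketch below is illustrative only: the `Capability` fields, the 1 to 5 scales, and the risk × frequency score are assumptions, not part of the SKILL.

```typescript
// Hypothetical inventory entry; field names and the 1-5 scales are assumptions.
interface Capability {
  name: string;
  critical: boolean; // revenue- or compliance-critical
  risk: number;      // 1 (minor impact) .. 5 (severe impact)
  frequency: number; // 1 (rarely hit or changed) .. 5 (constantly)
}

// Simple product score: debt that is both risky and frequently hit ranks highest.
function debtScore(c: Capability): number {
  return c.risk * c.frequency;
}

// Critical capabilities first, then by descending debt score.
function prioritize(items: Capability[]): Capability[] {
  return [...items].sort((a, b) => {
    if (a.critical !== b.critical) return a.critical ? -1 : 1;
    return debtScore(b) - debtScore(a);
  });
}

const ranked = prioritize([
  { name: "report-export", critical: false, risk: 4, frequency: 5 },
  { name: "invoicing", critical: true, risk: 3, frequency: 2 },
  { name: "checkout", critical: true, risk: 5, frequency: 4 },
]);
console.log(ranked.map((c) => c.name).join(" > "));
// prints "checkout > invoicing > report-export"
```

The ordering is only a starting point for discussion; deferrable work can still be pulled forward when the business decides so.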

Dual-write, backfill, and reconciliation are common pitfalls: define consistency windows, correction jobs, and rollback switches. For event-driven moves, version schemas and plan dead-letter handling.
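
For the event-driven case, the following is a minimal sketch of schema versioning with a dead-letter path. The `OrderPlaced` shapes, the USD default for version 1, and the in-memory dead-letter list are all hypothetical; a real consumer would publish rejected payloads to whatever dead-letter queue the broker provides.

```typescript
// Hypothetical event shapes, for illustration only.
interface OrderPlacedV2 {
  version: 2;
  orderId: string;
  amountCents: number;
  currency: string;
}

interface DeadLetter { payload: unknown; reason: string }

type Upcast =
  | { status: "ok"; event: OrderPlacedV2 }
  | { status: "rejected"; reason: string };

// Upcast any known version to the latest shape; reject everything else.
function upcast(payload: unknown): Upcast {
  if (typeof payload !== "object" || payload === null) {
    return { status: "rejected", reason: "payload is not an object" };
  }
  const p = payload as Record<string, unknown>;
  const version = p["version"];
  const orderId = p["orderId"];
  if (typeof orderId !== "string") {
    return { status: "rejected", reason: "missing orderId" };
  }
  if (version === 1) {
    // Assumed v1 shape: { version: 1, orderId, amount } with amount in major units.
    const amount = p["amount"];
    if (typeof amount !== "number") {
      return { status: "rejected", reason: "v1 event without numeric amount" };
    }
    // Assumption: every v1 event was denominated in USD.
    return {
      status: "ok",
      event: { version: 2, orderId, amountCents: Math.round(amount * 100), currency: "USD" },
    };
  }
  if (version === 2) {
    const amountCents = p["amountCents"];
    const currency = p["currency"];
    if (typeof amountCents !== "number" || typeof currency !== "string") {
      return { status: "rejected", reason: "malformed v2 event" };
    }
    return { status: "ok", event: { version: 2, orderId, amountCents, currency } };
  }
  return { status: "rejected", reason: `unsupported version: ${String(version)}` };
}

// Handle what can be upcast; return the rest as dead letters for later replay.
function consume(payloads: unknown[], handle: (e: OrderPlacedV2) => void): DeadLetter[] {
  const deadLetters: DeadLetter[] = [];
  for (const payload of payloads) {
    const result = upcast(payload);
    if (result.status === "ok") handle(result.event);
    else deadLetters.push({ payload, reason: result.reason });
  }
  return deadLetters;
}
```

Keeping the original payload next to the rejection reason lets a replay job retry the event once a newer consumer understands that version.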

Organizationally, each strangler slice needs an owner, milestones, and conditions to fall back to the old path; agents can draft ADRs and comms, but cannot decide business scope alone.

Strangler flow

  [ Entry: Facade / API gateway / BFF routes old vs new ]
                    │
                    ▼
         [ New slice: own deploy, contract tests, ACL reads legacy ]
                    │
                    ▼
    [ Progressive traffic: tenant / feature / percentage to new impl ]
                    │
                    ▼
         [ Converge: remove duplication; ban new code hitting old DB directly ]
                    │
                    ▼
  [ End state: legacy read-only or retired; data migrated or synced to new boundary ]

Nginx upstream switch configuration (progressive traffic migration by weight):

# nginx.conf — Strangler fig traffic switching
upstream legacy_orders {
    server legacy-app:8080;
}

upstream new_orders {
    server new-orders-service:3000;
}

# Split by percentage: weighted round-robin sends roughly 90% to legacy, 10% to the new service
upstream orders_split {
    server legacy-app:8080 weight=9;
    server new-orders-service:3000 weight=1;
}

server {
    listen 80;

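    # Phase 1: header-based canary; clients or an upstream gateway opt
    # specific tenants in by sending "X-Canary: true"
    location /api/orders {
        # Default to legacy; switch only when the X-Canary header is "true"
        set $upstream_orders "legacy_orders";
        if ($http_x_canary = "true") {
            set $upstream_orders "new_orders";
        }
        # The variable resolves to one of the upstream groups defined above
        proxy_pass http://$upstream_orders;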
        proxy_set_header Host $host;
    }

    # Phase 2: split by percentage (replace with orders_split)
    # location /api/orders {
    #     proxy_pass http://orders_split;
    # }

    # Phase 3: full cutover to new service (replace with new_orders)
    # location /api/orders {
    #     proxy_pass http://new_orders;
    # }
}
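
The weighted upstream above balances per request, so the same tenant can land on either implementation from one call to the next. Where a tenant should stay on one side, a facade or BFF could bucket by tenant ID instead. The sketch below assumes a hypothetical `RoutingConfig` and uses an FNV-1a style hash over the ID; neither comes from the SKILL itself.

```typescript
type Target = "legacy" | "new";

// Hypothetical routing config; in practice this would come from a flag service.
interface RoutingConfig {
  allowTenants: Set<string>; // Phase 1: explicit canary tenants
  percentToNew: number;      // Phase 2: 0..100
  killSwitch: boolean;       // rollback: force everything to legacy
}

// FNV-1a style hash over UTF-16 code units, reduced to a bucket in 0..99.
function bucketOf(tenantId: string): number {
  let h = 0x811c9dc5;
  for (let i = 0; i < tenantId.length; i++) {
    h ^= tenantId.charCodeAt(i);
    h = Math.imul(h, 0x01000193);
  }
  return (h >>> 0) % 100;
}

// The same tenant always gets the same answer for a given config.
function chooseTarget(tenantId: string, cfg: RoutingConfig): Target {
  if (cfg.killSwitch) return "legacy";
  if (cfg.allowTenants.has(tenantId)) return "new";
  return bucketOf(tenantId) < cfg.percentToNew ? "new" : "legacy";
}

const cfg: RoutingConfig = {
  allowTenants: new Set(["tenant-canary"]),
  percentToNew: 10,
  killSwitch: false,
};
console.log(chooseTarget("tenant-canary", cfg)); // prints "new" (allow-listed)
```

Raising `percentToNew` only ever moves additional tenants to the new implementation, and the kill switch gives the routing owner a single toggle for cutting back.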

Each rollout ring in the SKILL should name who owns the routing toggles, when to cut back to legacy, and which acceptance metrics apply; avoid slices that stall because nobody owns them.

Boundaries and anti-corruption

  • Slice by business capability, not by legacy package or table names.
  • Expose stable contracts outward; translate legacy models through adapters—no leaking old column semantics into new domains.
  • Contract tests cover happy paths and error-code mapping to avoid silent degradation.
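
As a sketch of the error-code mapping that the contract tests would pin down, the adapter can translate legacy codes into a closed set of domain errors and keep unknown codes visible instead of swallowing them. The legacy codes shown here (`ERR_NO_REC` and so on) are invented for illustration.

```typescript
// Domain-side error model; the new service never sees raw legacy codes.
type DomainError =
  | { kind: "NotFound" }
  | { kind: "Conflict" }
  | { kind: "Unavailable" }
  | { kind: "Unknown"; legacyCode: string };

// Hypothetical legacy error codes; real ones come from the legacy contract.
const LEGACY_ERROR_MAP = new Map<string, DomainError>([
  ["ERR_NO_REC", { kind: "NotFound" }],
  ["ERR_DUP_KEY", { kind: "Conflict" }],
  ["ERR_DB_BUSY", { kind: "Unavailable" }],
]);

// Unmapped codes surface as Unknown (with the original code) rather than
// degrading silently into a generic success or an empty result.
function mapLegacyError(code: string): DomainError {
  return LEGACY_ERROR_MAP.get(code) ?? { kind: "Unknown", legacyCode: code };
}

// Contract-test style checks: one per mapped code plus the fallback path.
function assertKind(code: string, expected: DomainError["kind"]): void {
  const actual = mapLegacyError(code).kind;
  if (actual !== expected) {
    throw new Error(`legacy code ${code}: expected ${expected}, got ${actual}`);
  }
}

assertKind("ERR_NO_REC", "NotFound");
assertKind("ERR_DUP_KEY", "Conflict");
assertKind("ERR_DB_BUSY", "Unavailable");
assertKind("ERR_SOMETHING_NEW", "Unknown");
```

When the legacy system introduces a new code, the fallback check keeps passing while the `Unknown` result can be counted and alerted on, which makes the gap visible.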

Anti-Corruption Layer (ACL) interface code example:

// Legacy system user model (legacy format)
// { usr_id: "123", usr_nm: "John", usr_email: "a@b.com", create_dt: "20240101" }

// New system domain model
interface User {
  id: string;
  name: string;
  email: string;
  createdAt: Date;
}

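// Shape of the legacy record shown above
interface LegacyUserRecord {
  usr_id: string;
  usr_nm: string;
  usr_email: string;
  create_dt: string; // "YYYYMMDD"
}

// ACL Adapter: translates legacy API response to new domain model
// (LegacyApiClient is assumed to return parsed JSON from get())
class LegacyUserAdapter {
  // Inject the client so the adapter can be tested against a stub
  constructor(private readonly legacyClient: LegacyApiClient) {}

  async getUserById(id: string): Promise<User> {
    // Encode the value so special characters cannot alter the query string
    const raw: LegacyUserRecord = await this.legacyClient.get(
      `/usr?usr_id=${encodeURIComponent(id)}`
    );
    return this.translate(raw);
  }

  async searchUsers(email: string): Promise<User[]> {
    const raws: LegacyUserRecord[] = await this.legacyClient.get(
      `/usr/search?email=${encodeURIComponent(email)}`
    );
    // Arrow function keeps `this` bound if translate ever uses instance state
    return raws.map((raw) => this.translate(raw));
  }

  private translate(raw: LegacyUserRecord): User {
    const d = raw.create_dt;
    return {
      id:        raw.usr_id,
      name:      raw.usr_nm,
      email:     raw.usr_email,
      // Legacy "20240101" → ISO "2024-01-01", which Date parses as UTC midnight
      createdAt: new Date(`${d.slice(0, 4)}-${d.slice(4, 6)}-${d.slice(6, 8)}`),
    };
  }
}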

// New service only depends on LegacyUserAdapter, never calls legacy API directly
// When legacy API changes, only update Adapter; new service unaffected

Data sync and reconciliation

  • Dual-writes need consistency windows, idempotency, and conflict policy; backfill jobs should be rate-limited, retryable, and pausable.
  • Reconciliation and correction runbooks belong in the plan; define rollback switches and “legacy read-only” breakers.
  • Version event schemas; dead-letter queues and replay policy get their own SKILL section.
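
The backfill requirements in the first bullet can be sketched as a single loop. This is a minimal illustration under stated assumptions: `runBackfill`, `BackfillOptions`, and the in-memory record list are hypothetical, the `upsert` callback is assumed to be idempotent (keyed by business ID), and a real job would stream from the legacy store and persist its cursor.

```typescript
interface BackfillOptions {
  maxPerSecond: number;                 // rate limit
  maxAttempts: number;                  // retries per record, including the first try
  isPaused: () => boolean;              // checked before every record
  sleep: (ms: number) => Promise<void>; // injectable so tests can skip real waits
}

interface BackfillResult {
  written: number;
  failed: string[];   // IDs that exhausted their attempts
  resumeFrom: number; // index to pass as startAt on the next run
}

async function runBackfill<T extends { id: string }>(
  records: T[],
  startAt: number,
  upsert: (record: T) => Promise<void>, // assumed idempotent
  opts: BackfillOptions,
): Promise<BackfillResult> {
  const delayMs = 1000 / opts.maxPerSecond;
  const failed: string[] = [];
  let written = 0;
  let cursor = startAt;

  for (const record of records.slice(startAt)) {
    if (opts.isPaused()) break; // stop cleanly; caller resumes from cursor

    let ok = false;
    for (let attempt = 1; attempt <= opts.maxAttempts && !ok; attempt++) {
      try {
        await upsert(record);
        ok = true;
      } catch {
        await opts.sleep(delayMs * attempt); // linear backoff before retrying
      }
    }
    if (ok) written++;
    else failed.push(record.id);

    cursor++;
    await opts.sleep(delayMs); // simple pacing: at most maxPerSecond records
  }
  return { written, failed, resumeFrom: cursor };
}
```

Because the write is idempotent, resuming from an earlier `resumeFrom` or re-running the failed IDs is safe: a duplicate write leaves the new system in the same state.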

Dual-write middleware (writes to legacy first, then to the new system behind a feature flag) and a data reconciliation script framework:

// Dual-write middleware: write to both legacy and new system
class DualWriteOrderRepository {
  constructor(
    private legacyRepo: LegacyOrderRepository,
    private newRepo: NewOrderRepository,
    private featureFlag: FeatureFlags,
  ) {}

  async save(order: Order): Promise<void> {
    // Always write to legacy (primary)
    await this.legacyRepo.save(order);

    // Write to new system based on feature flag
    if (await this.featureFlag.isEnabled("dual-write-orders")) {
      try {
        await this.newRepo.save(order);
      } catch (err) {
        // New system failure doesn't block main flow, but must alert
        metrics.increment("dual_write.new_system.error");
        logger.error("New system write failed", { orderId: order.id, err });
      }
    }
  }
}

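// Data reconciliation script framework (compare new vs legacy for consistency)
// legacyRepo and newRepo are assumed to be repository instances in module scope.
// The required parameter comes first so callers can omit batchSize.
async function reconcile(offsetDate: Date, batchSize = 100) {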
  let mismatches = 0, checked = 0;

  for await (const legacyBatch of legacyRepo.streamSince(offsetDate, batchSize)) {
    const ids = legacyBatch.map(o => o.id);
    const newBatch = await newRepo.findByIds(ids);
    const newMap = new Map(newBatch.map(o => [o.id, o]));

    for (const legacyOrder of legacyBatch) {
      checked++;
      const newOrder = newMap.get(legacyOrder.id);
      if (!newOrder) {
        logger.warn("Record missing in new system", { id: legacyOrder.id });
        mismatches++;
        continue;
      }
      if (legacyOrder.amount !== newOrder.amount ||
          legacyOrder.status !== newOrder.status) {
        logger.warn("Data inconsistency", { id: legacyOrder.id, legacy: legacyOrder, new: newOrder });
        mismatches++;
      }
    }
  }
  logger.info("Reconciliation complete", { checked, mismatches });
  return { checked, mismatches };
}

Organization and slices

  • Per slice: owner, milestones, dependent teams, review/demo cadence.
  • Explicit triggers and steps to revert to the legacy path.
  • ADRs for major boundary and routing decisions; agents draft, business approves scope.
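
Revert triggers are easier to review when they are written down as data rather than prose. The sketch below is one possible shape: the metric names and the error-rate and latency limits are placeholders, while the 0.1% mismatch limit mirrors the threshold in the SKILL snippet's correction runbook.

```typescript
// All rates are fractions (0.001 = 0.1%). Field names are hypothetical.
interface SliceMetrics {
  errorRate: number;
  p99LatencyMs: number;
  mismatchRate: number; // from the reconciliation job
}

interface RollbackPolicy {
  maxErrorRate: number;
  maxP99LatencyMs: number;
  maxMismatchRate: number;
}

// Returns every tripped trigger; an empty list means the slice stays on the new path.
function rollbackReasons(m: SliceMetrics, p: RollbackPolicy): string[] {
  const reasons: string[] = [];
  if (m.errorRate > p.maxErrorRate) {
    reasons.push(`error rate ${m.errorRate} exceeds ${p.maxErrorRate}`);
  }
  if (m.p99LatencyMs > p.maxP99LatencyMs) {
    reasons.push(`p99 latency ${m.p99LatencyMs}ms exceeds ${p.maxP99LatencyMs}ms`);
  }
  if (m.mismatchRate > p.maxMismatchRate) {
    reasons.push(`mismatch rate ${m.mismatchRate} exceeds ${p.maxMismatchRate}`);
  }
  return reasons;
}

// Placeholder limits; each slice owner agrees the real ones with the business.
const ordersPolicy: RollbackPolicy = {
  maxErrorRate: 0.01,
  maxP99LatencyMs: 500,
  maxMismatchRate: 0.001,
};

const reasons = rollbackReasons(
  { errorRate: 0.002, p99LatencyMs: 320, mismatchRate: 0.004 },
  ordersPolicy,
);
console.log(reasons.length > 0 ? `revert: ${reasons.join("; ")}` : "stay on new path");
```

The function only reports; whether a tripped trigger flips the route automatically or pages the slice owner is a decision for the ADR.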

Principles and anti-patterns

  • No big-bang rewrite without approved downtime, backups, and rollback rehearsal.
  • Observability first: metrics, logs, and traces on both old and new paths.
  • Keep slices in small, safe commits—avoid one giant PR that “changes everything.”

SKILL snippet

---
name: legacy-modernization
description: Strangler fig pattern, anti-corruption layer, and dual-write reconciliation
---
# Modernization steps
1. Slice by business capability (not table / package names) and prioritize
2. Design Nginx upstream or API Gateway routing switch
3. Implement ACL Adapter to translate legacy API responses to new domain model

# Progressive traffic strategy
- Phase 1: specific tenants / Header canary (X-Canary: true)
- Phase 2: percentage split (Nginx weight or feature flag)
- Phase 3: full cutover, keep legacy route 30 days as rollback
- Each phase: document rollback conditions and steps

# Data synchronization
- Dual-write middleware: legacy is primary; new system failure doesn't block main flow but must alert
- Backfill job: rate-limited (500 records/sec), pausable, retryable
- Idempotent writes: business ID as primary key, no side effects on duplicate execution

# Data reconciliation
- Daily reconcile: compare record ID / amount / status across new and legacy
- Alert on inconsistency (Slack / PagerDuty)
- Correction runbook: pause dual-write and involve human when inconsistency > 0.1%

# Principles
- Ban big-bang rewrite (without approved maintenance window)
- Observability first: both old and new paths have metrics and traces
- Each slice has owner, milestones, and acceptance criteria
