
ADR-0039: Multi-Environment Artifact Promotion

Accepted

Platform artifacts (manifest.yaml + policies) need to progress through environments with validation gates and approval workflows:

dev → staging → production

Requirements:

  • REQ-326: Multi-environment promotion workflow
  • REQ-327: Environment-specific tagging strategy
  • REQ-328: Validation gates during promotion
  • REQ-329: Approval workflow (team-based, configurable)
  • REQ-330: Promotion audit trail (immutable log)
  • REQ-331: Rollback mechanism
  • REQ-332: Rollback impact analysis
  • REQ-333: Version pinning and constraints
  • REQ-335: Promotion status dashboard
  • REQ-336: Policy-based promotion (auto-promote when criteria met)
  • REQ-337: Cross-registry promotion (multi-region, DR)
  • REQ-338: Promotion dry-run mode
  • REQ-339: Promotion notifications

Current problems:

  • No standardized promotion workflow
  • Manual artifact copying between environments
  • No validation gates before production
  • No audit trail of what was promoted when
  • Difficult to rollback problematic versions
  • No impact analysis before rollback

GitOps has become a standard delivery model, with stronger integrations for policy, security, and cost governance. Industry best practices from 2025 research:

  1. Artifacts built once, promoted everywhere - Same digest across all environments
  2. GitOps-based promotion - PR-based workflows for audit trails
  3. Validation gates - Policy compliance, security scans, tests before promotion
  4. Hybrid approval - Automated for dev/staging, manual for production
  5. Policy-as-code - All promotion rules codified and versioned

Adopt GitOps-based promotion workflow with configurable validation gates and flexible approval models.

  1. Artifacts are immutable - Same OCI artifact digest promoted across environments
  2. Promotion is unidirectional - dev → staging → prod (no backward promotion)
  3. Validation gates are configurable - Platform teams define requirements per environment
  4. Approval workflows are external - floe validates and tags, CI/CD handles approvals
  5. Audit trail is complete - Every promotion logged with who, what, when, why
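
The unidirectional rule in principle 2 can be sketched as a small check. This is an illustrative Python helper, not floe's implementation; the name `is_forward_promotion` and the `PROMOTION_ORDER` tuple are assumptions made for the example. The ADR does not say whether an environment may be skipped (dev straight to prod), so the sketch only rejects backward and same-environment moves.

```python
# Illustrative only: environment names follow the tag suffixes used in this ADR.
PROMOTION_ORDER = ("dev", "staging", "prod")


def is_forward_promotion(source: str, target: str) -> bool:
    """Return True when target comes after source in the promotion order."""
    try:
        src = PROMOTION_ORDER.index(source)
        dst = PROMOTION_ORDER.index(target)
    except ValueError:
        # Either name is not a known environment.
        raise ValueError(f"unknown environment in {source!r} -> {target!r}") from None
    # Same-environment and backward moves are both rejected.
    return dst > src
```

A caller would run this before any gate is evaluated, so that a request such as prod to staging fails fast.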

Tag Format: <version>-<environment>

Immutable tags (never change):
- v1.2.3-dev # Promoted to dev
- v1.2.3-staging # Promoted to staging
- v1.2.3-prod # Promoted to production
Mutable tags (updated on promotion):
- latest-dev # Current dev version
- latest-staging # Current staging version
- latest-prod # Current production version
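
As a rough illustration of the tag format above, the helper below derives both tags for one promotion. It is a sketch under assumptions: `promotion_tags` and `ENVIRONMENTS` are hypothetical names, not part of floe.

```python
# Hypothetical helper: derives the two tags described above for one promotion.
ENVIRONMENTS = ("dev", "staging", "prod")  # suffixes as they appear in the tags


def promotion_tags(version: str, environment: str) -> tuple[str, str]:
    """Return (immutable_tag, mutable_tag) for a version promoted to an environment."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    immutable = f"{version}-{environment}"  # e.g. v1.2.3-staging, never re-pointed
    mutable = f"latest-{environment}"       # e.g. latest-staging, moved on each promotion
    return immutable, mutable
```

Both tags would point at the same artifact digest, which is what keeps the artifact identical across environments.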

Promotion Flow:

# Step 1: Build and push to dev
floe platform publish --version=v1.2.3
# Creates: v1.2.3-dev, updates latest-dev
# Step 2: Promote to staging (after validation)
floe platform promote v1.2.3 --from=dev --to=staging
# Creates: v1.2.3-staging, updates latest-staging
# Step 3: Promote to production (after approval)
floe platform promote v1.2.3 --from=staging --to=prod
# Creates: v1.2.3-prod, updates latest-prod

Configurable per environment in manifest.yaml:

manifest.yaml
artifacts:
  promotion:
    gates:
      dev:
        policy_compliance: true # MANDATORY
        tests: false # Optional
        security_scan: false # Optional
      staging:
        policy_compliance: true # MANDATORY
        tests: true # Run integration tests
        security_scan: true # CVE scan + cosign verify
        cost_analysis: false # Optional
      production:
        policy_compliance: true # MANDATORY
        tests: true # Full test suite
        security_scan: true # CVE scan + cosign verify
        cost_analysis: true # Estimate infrastructure cost
        performance_baseline: true # Compare against SLO
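
Because `policy_compliance` is marked MANDATORY for every environment, a configuration check could flag environments that leave it off. The sketch below assumes the `gates` mapping has already been parsed from manifest.yaml into a dictionary; the function name `missing_mandatory_gate` is hypothetical.

```python
def missing_mandatory_gate(gates: dict[str, dict[str, bool]]) -> list[str]:
    """Return the environments whose gate config does not enable policy_compliance."""
    return sorted(
        env
        for env, config in gates.items()
        # A missing key and an explicit false are treated the same way.
        if config.get("policy_compliance") is not True
    )
```

An empty list means every environment enables the mandatory gate.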

Validation Gate Implementations:

Gate                  Implementation                   Exit Code on Fail
--------------------  -------------------------------  -----------------
policy_compliance     floe platform validate --strict  1
tests                 User-provided test command       1
security_scan         cosign verify + trivy scan       1
cost_analysis         Infracost estimate               Warn only (0)
performance_baseline  User-provided benchmark          Warn only (0)
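
The split between blocking and warn-only gates in the table can be expressed as a small aggregation step. This is a minimal sketch, assuming each gate has already run and reported pass or fail; `evaluate_gates` and the two gate sets are hypothetical names, and how floe combines results internally is not specified here.

```python
# Gate names and their failure behaviour are taken from the table above.
BLOCKING_GATES = frozenset({"policy_compliance", "tests", "security_scan"})
WARN_ONLY_GATES = frozenset({"cost_analysis", "performance_baseline"})


def evaluate_gates(results: dict[str, bool]) -> tuple[int, list[str]]:
    """Map gate results (name -> passed) to (exit_code, warn-only gates that failed)."""
    unknown = set(results) - BLOCKING_GATES - WARN_ONLY_GATES
    if unknown:
        raise ValueError(f"unknown gates: {sorted(unknown)}")
    failed = {name for name, passed in results.items() if not passed}
    warnings = sorted(failed & WARN_ONLY_GATES)
    # Only a failed blocking gate produces a non-zero exit code.
    exit_code = 1 if failed & BLOCKING_GATES else 0
    return exit_code, warnings
```

With this shape, a failed cost estimate surfaces as a warning while the promotion still exits 0.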

floe does NOT manage approvals; it delegates them to CI/CD:

.github/workflows/promote-to-production.yml
name: Promote to Production
on:
  workflow_dispatch:
    inputs:
      version:
        description: 'Version to promote'
        required: true
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - name: Run validation gates
        run: floe platform promote ${{ inputs.version }} --from=staging --to=prod --dry-run
  promote:
    needs: validate
    runs-on: ubuntu-latest
    # GitHub environment protection rules (configured in repo settings) hold
    # this job until a required reviewer approves it
    environment: production
    steps:
      - name: Execute promotion
        run: floe platform promote ${{ inputs.version }} --from=staging --to=prod

Recommended Approval Patterns:

  • GitOps PR-based: Create PR to update production environment config, require approvals
  • CI/CD environments: GitHub Environments, GitLab Environments with protection rules
  • External tools: Integrate with PagerDuty, OpsGenie for on-call approval

REQ-331: Rollback to previous version

# Check current production version
floe platform status --env=production
# Output: v1.2.3-prod (deployed 2024-01-15 14:30:00)
# Analyze rollback impact
floe platform rollback v1.2.2 --env=production --dry-run
# Output:
# Affected data products: 12
# Breaking changes: 2 (quality gate threshold increased)
# Estimated downtime: 2 minutes (re-deploy jobs)
# Recommendation: SAFE to rollback
# Execute rollback
floe platform rollback v1.2.2 --env=production
# Creates: v1.2.2-prod-rollback, updates latest-prod

Rollback Impact Analysis (REQ-332):

  • Compare schema versions (detect breaking changes)
  • List affected data products
  • Estimate deployment time
  • Check policy compatibility
  • Provide recommendation (SAFE, RISKY, BLOCKED)

REQ-333: Data products pin to platform versions

# floe.yaml (data product)
platform:
  version: ">=1.2.0, <2.0.0" # Semver range (recommended)
  # OR
  version: "1.2.3" # Exact pin (strict)
  # OR
  version: "^1.2.0" # Caret range (minor and patch updates)
  # OR
  version: "latest-prod" # Mutable tag (not recommended)

Version Resolution:

  • Exact pins checked first
  • Semver ranges validated against available versions
  • Mutable tags resolved at runtime (with warning)
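
To make the resolution rules concrete, here is a small standard-library sketch of matching a version against the constraint forms shown above. It is not floe's resolver: `parse_version`, `satisfies`, and `resolve` are hypothetical names, caret semantics follow the common npm/Cargo convention, and mutable tags such as `latest-prod` are deliberately out of scope.

```python
import operator
import re
from typing import Optional

_COMPARATORS = {
    ">=": operator.ge, "<=": operator.le, ">": operator.gt,
    "<": operator.lt, "==": operator.eq, "=": operator.eq, "": operator.eq,
}


def parse_version(text: str) -> tuple[int, int, int]:
    """Parse 'v1.2.3' or '1.2.3' into a comparable (major, minor, patch) tuple."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def satisfies(version: str, constraint: str) -> bool:
    """Check a version against an exact pin, a caret range, or comma-separated bounds."""
    candidate = parse_version(version)
    constraint = constraint.strip()
    if constraint.startswith("^"):
        base = parse_version(constraint[1:])
        # Caret: stay below the next bump of the left-most non-zero component.
        if base[0] > 0:
            upper = (base[0] + 1, 0, 0)
        elif base[1] > 0:
            upper = (0, base[1] + 1, 0)
        else:
            upper = (0, 0, base[2] + 1)
        return base <= candidate < upper
    for clause in constraint.split(","):
        match = re.fullmatch(r"\s*(>=|<=|==|>|<|=|)\s*(\S+)\s*", clause)
        if match is None:
            raise ValueError(f"cannot parse constraint clause: {clause!r}")
        op, bound = match.groups()
        if not _COMPARATORS[op](candidate, parse_version(bound)):
            return False
    return True


def resolve(constraint: str, available: list[str]) -> Optional[str]:
    """Return the highest available version that satisfies the constraint, if any."""
    matching = [v for v in available if satisfies(v, constraint)]
    return max(matching, key=parse_version, default=None)
```

A bare version such as `"1.2.3"` is treated as an exact pin, which lines up with exact pins being the simplest case to check first.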

REQ-337: Multi-region and DR scenarios

manifest.yaml
artifacts:
  registry:
    primary:
      type: harbor
      url: harbor-us-east.acme.com
    secondary:
      type: ecr
      url: 123456789.dkr.ecr.us-west-2.amazonaws.com
    tertiary:
      type: acr
      url: acmedata.azurecr.io
  promotion:
    cross_registry_sync: true # Sync artifacts to all registries
    verification: digest_match # Ensure same digest across registries

Promotion with Multi-Registry:

floe platform promote v1.2.3 --to=prod --sync-registries
# Promotes to production in ALL configured registries
# Verifies digest matches across all registries
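
The `digest_match` verification amounts to comparing the digest each registry reports against the digest that was promoted. The sketch below assumes those digests have already been fetched (for example by a registry client); `mismatched_registries` is a hypothetical name and the registry keys mirror the manifest above.

```python
def mismatched_registries(expected_digest: str, observed: dict[str, str]) -> list[str]:
    """Return the registries whose reported digest differs from the promoted digest."""
    return sorted(
        name for name, digest in observed.items() if digest != expected_digest
    )
```

An empty result means every configured registry holds the same artifact; any name in the list would be a reason to fail the promotion.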

REQ-330: Immutable promotion log

Every promotion creates an audit record:

{
  "promotion_id": "prom-123456",
  "artifact": {
    "version": "v1.2.3",
    "digest": "sha256:abc123..."
  },
  "source_env": "staging",
  "target_env": "production",
  "timestamp": "2024-01-15T14:30:00Z",
  "promoted_by": "user@acme.com",
  "approval": {
    "required": true,
    "approvers": ["manager@acme.com"],
    "approval_timestamp": "2024-01-15T14:25:00Z"
  },
  "validation_gates": {
    "policy_compliance": "PASSED",
    "tests": "PASSED",
    "security_scan": "PASSED",
    "cost_analysis": "PASSED"
  },
  "result": "SUCCESS"
}

Stored in:

  • OCI registry as annotation
  • OpenTelemetry traces (searchable)
  • Audit log file (append-only)
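
For the append-only log file, one plausible shape is a JSON Lines file opened in append mode, one record per line. This is a sketch under assumptions: `append_audit_record` is a hypothetical name, and opening with `O_APPEND` only ensures writes go to the end of the file; it does not by itself stop other processes from rewriting it, which would need filesystem or storage-level controls.

```python
import json
import os


def append_audit_record(path: str, record: dict) -> None:
    """Append one promotion record to a JSON Lines audit file."""
    line = json.dumps(record, sort_keys=True, separators=(",", ":")) + "\n"
    # O_APPEND positions every write at the current end of the file.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o640)
    try:
        os.write(fd, line.encode("utf-8"))
    finally:
        os.close(fd)
```

Each line can then be parsed independently with `json.loads`, which keeps the log easy to search and to ship elsewhere.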

REQ-335: CLI + web UI visibility

floe platform promote status

Environment  Version         Promoted At          Promoted By      Gates
-----------  --------------  -------------------  ---------------  -----
dev          v1.2.4-dev      2024-01-15 15:00:00  ci-bot@acme.com  ✓✓✗✗
staging      v1.2.3-staging  2024-01-15 14:45:00  alice@acme.com   ✓✓✓✓
production   v1.2.3-prod     2024-01-15 14:30:00  bob@acme.com     ✓✓✓✓

Pending Promotions:
- v1.2.4-dev → staging (blocked: security scan failed)

Positive consequences:

  • Predictable promotions - Same artifact (digest) across all environments
  • Audit compliance - Complete trail of who promoted what when
  • Safety gates - Catch issues before production (policy, security, tests)
  • Flexible approval - Platform teams choose approval model (GitOps PR, CI/CD, manual)
  • Multi-registry support - DR and multi-region via cross-registry sync
  • Rollback safety - Impact analysis before rollback
  • GitOps aligned - Industry standard workflow (2025)

Negative consequences:

  • Complexity - More moving parts than manual copying
  • Validation time - Gates add latency to promotion
  • External dependency - Requires CI/CD integration for approvals
  • Cross-registry cost - Bandwidth for multi-registry sync

Neutral consequences:

  • Promotion is unidirectional (prevents accidental downgrades)
  • Validation gates configurable (teams choose safety vs speed)
  • Approval workflow external (floe validates, CI/CD approves)

CLI commands:

# Promote artifact
floe platform promote <version> --from=<env> --to=<env> [--dry-run]
# Rollback to previous version
floe platform rollback <version> --env=<env> [--dry-run]
# Check promotion status
floe platform promote status [--env=<env>]
# List available versions
floe platform list --env=<env>

manifest.yaml
artifacts:
  promotion:
    gates: # Per-environment validation gates
      <environment>:
        policy_compliance: bool
        tests: bool
        security_scan: bool
        cost_analysis: bool
        performance_baseline: bool
    notifications:
      slack:
        webhook: ${SLACK_WEBHOOK}
        channels: ["#platform-ops"]
      email:
        recipients: ["ops@acme.com"]
    cross_registry_sync: bool
    verification: digest_match | signature_match