Artifact Management & Release Engineering

Project: A Release Process Design

Everything in this tutorial — semantic versioning, artifact repositories, promotion gates, changelogs, hotfix procedures, release trains — converges here. This capstone lesson walks you through designing a complete, production-grade release process for a realistic product: a multi-service SaaS platform with a public REST API, a web frontend, a mobile SDK, and a CLI tool. You will define versioning, artifacts, and promotion for the entire product line end-to-end, then codify it in runnable configuration.

Why this matters at big-tech scale: A poorly designed release process does not just slow teams down. It creates version skew between services, makes rollbacks impossible, produces untraceable deployments, and breaks customers on SDK upgrades. Companies such as Google, Stripe, and Netflix invest in dedicated release-engineering practices for exactly this reason. This lesson teaches you to design such a process yourself.

Step 1: Define the Product Line and Versioning Strategy

Before writing a single pipeline, map every releasable artifact and assign it a versioning scheme. For our example platform — Orion SaaS — the product line is:

  • orion-api — REST API (internally deployed, semver with independent cadence)
  • orion-web — React frontend (deployed to CDN, CalVer YYYY.MM.PATCH for marketing clarity)
  • orion-sdk-go — Public Go module (strict semver, v2+ major versions in module path)
  • orion-cli — CLI binary distributed via Homebrew and GitHub Releases (semver, signed binaries)
  • orion-helm — Helm chart for self-hosted deployments (semver, chart version decoupled from appVersion)

Each artifact has its own version, its own release cadence, and its own artifact repository. Never force all components onto a single unified version number. A breaking change in the CLI must not force a v3.0.0 bump on a stable API that changed nothing.

# Version matrix: codify this in docs/RELEASE.md and enforce in CI
#
# Component      Scheme          Repository            Cadence
# orion-api      semver          ECR (Docker)          CD from main
# orion-web      YYYY.MM.PATCH   ECR + S3 (CDN)        CD from main
# orion-sdk-go   semver          pkg.go.dev/GitHub     Monthly train
# orion-cli      semver          GitHub Releases       Monthly train (same as SDK)
# orion-helm     semver          OCI ECR repo          Follows API minor version
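One way to enforce the matrix in CI is a small check that rejects a version string that does not match its component's scheme. The sketch below is illustrative only: the `SCHEMES` table and the `is_valid_version` helper are hypothetical names, and the semver pattern is a simplified form of the full Semantic Versioning grammar.

```python
import re

# Simplified semver: MAJOR.MINOR.PATCH without leading zeros, plus optional
# pre-release and build metadata. Not the full official grammar.
SEMVER = re.compile(
    r"(0|[1-9]\d*)\.(0|[1-9]\d*)\.(0|[1-9]\d*)"
    r"(?:-[0-9A-Za-z.-]+)?(?:\+[0-9A-Za-z.-]+)?"
)

# CalVer as used by orion-web: YYYY.MM.PATCH with a zero-padded month.
CALVER = re.compile(r"\d{4}\.(0[1-9]|1[0-2])\.(0|[1-9]\d*)")

# Hypothetical mapping mirroring the version matrix above.
SCHEMES = {
    "orion-api": SEMVER,
    "orion-web": CALVER,
    "orion-sdk-go": SEMVER,
    "orion-cli": SEMVER,
    "orion-helm": SEMVER,
}


def is_valid_version(component: str, version: str) -> bool:
    """Return True if the version matches the scheme declared for the component."""
    pattern = SCHEMES.get(component)
    if pattern is None:
        raise KeyError(f"unknown component: {component}")
    # fullmatch rejects trailing garbage such as a newline or a suffix.
    return pattern.fullmatch(version) is not None


if __name__ == "__main__":
    print(is_valid_version("orion-api", "1.4.2"))
    print(is_valid_version("orion-web", "2026.06.3"))
    print(is_valid_version("orion-cli", "v1.9.0"))
```

Note that the check takes the bare version without the leading `v`. The `v` prefix belongs to the git tag, not to the version itself.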

Step 2: Artifact Definitions

Each component produces a specific artifact type. Define these explicitly so every team member and every pipeline step knows what to build and where to push it.

# orion-api: multi-platform Docker image pushed to ECR
# Build-time: inject version as OCI labels and, via a build arg, Go ldflags
# (assumes the Dockerfile declares ARG VERSION and passes it to -ldflags).
# Multi-platform builds go through buildx; --push publishes the multi-arch
# manifest straight to the registry.
docker buildx build \
  --build-arg "VERSION=${VERSION}" \
  --label "org.opencontainers.image.version=${VERSION}" \
  --label "org.opencontainers.image.revision=${GIT_SHA}" \
  --label "org.opencontainers.image.created=$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
  --tag "123456789.dkr.ecr.us-east-1.amazonaws.com/orion-api:${VERSION}" \
  --tag "123456789.dkr.ecr.us-east-1.amazonaws.com/orion-api:latest" \
  --platform linux/amd64,linux/arm64 \
  --push \
  -f docker/api/Dockerfile .

# orion-sdk-go: Go module tag; the artifact IS the git tag on a clean commit
git tag "v${VERSION}"
git push origin "v${VERSION}"
# Ask the module proxy for the new version. GOPROXY needs a full URL, and a
# v2+ module carries its major version in the module path (/v2 here).
GOPROXY=https://proxy.golang.org go list -m "github.com/orion/orion-sdk-go/v2@v${VERSION}"

# orion-cli: cross-compiled binaries with SHA256 checksums, signed with cosign
goreleaser release --clean   # .goreleaser.yaml defines targets + signing
# Produces: dist/orion_linux_amd64, orion_darwin_arm64, etc.
#           GitHub Release asset list + checksums.txt signed by cosign

# orion-helm: push chart as OCI artifact to ECR
helm package charts/orion --version "${CHART_VERSION}" --app-version "${APP_VERSION}"
helm push "orion-${CHART_VERSION}.tgz" oci://123456789.dkr.ecr.us-east-1.amazonaws.com/charts
[Figure: Orion Product Line, Artifact Map. Each Git repository feeds its own artifact store: orion-api → ECR Docker image (:1.4.2 / :latest); orion-web → ECR + S3 (CDN), 2026.06.3; orion-sdk-go → pkg.go.dev / GitHub, git tag v2.1.0; orion-cli → GitHub Releases, v1.9.0 + cosign sig; orion-helm → ECR OCI Helm chart 1.4.0. All five feed the promotion pipeline dev → staging → prod, with gates for smoke tests, canary + SLOs, security scan, and manual approval (required for major bumps).]
Orion product line: five components with distinct artifact types and repositories feeding a shared promotion pipeline.
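The SDK's "v2+ major versions in module path" rule is easy to get wrong in release scripts. Go requires the `/vN` suffix on the module path from major version 2 onward, while v0 and v1 take no suffix. The sketch below shows that rule, assuming a plain repository-hosted module. The helper names `go_module_path` and `go_list_target` are hypothetical.

```python
def _strip_v(version: str) -> str:
    """Drop a single leading 'v' from a tag-style version string."""
    return version[1:] if version.startswith("v") else version


def go_module_path(base_path: str, version: str) -> str:
    """Apply Go's major version suffix rule to a module path.

    v0.x and v1.x use the base path unchanged; v2 and above append /vN.
    """
    major = int(_strip_v(version).split(".")[0])
    if major < 2:
        return base_path
    return f"{base_path}/v{major}"


def go_list_target(base_path: str, version: str) -> str:
    """Build the 'module@version' argument for `go list -m`."""
    tag = f"v{_strip_v(version)}"
    return f"{go_module_path(base_path, version)}@{tag}"


if __name__ == "__main__":
    print(go_list_target("github.com/orion/orion-sdk-go", "1.9.0"))
    print(go_list_target("github.com/orion/orion-sdk-go", "2.1.0"))
```

A release script can use a helper like this to derive the proxy query from the version instead of hard-coding the path, so a future v3 does not silently query the wrong module.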

Step 3: Design the Promotion Pipeline

A promotion pipeline moves an immutable artifact through environments. The artifact itself never changes; only the reference to it does, such as the image tag an environment deploys or the Helm value that selects it. Define your gates explicitly: what must be true for a build to advance?

  • dev — auto-promoted on every merged PR. Unit tests + lint pass. Smoke test suite passes (under 5 min).
  • staging — promoted after dev smoke tests pass. Full integration + contract tests run. Security scan (Trivy, Grype) must return zero high/critical CVEs.
  • canary — for orion-api: 5% of production traffic routed to the new version for 30 minutes. Error rate and p99 latency must not regress vs. baseline (SLO gates evaluated by your observability stack).
  • production — full rollout. Requires manual approval from release engineer for any semver minor or major bump. Patch releases can be auto-promoted after canary passes.
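The production gate's approval rule (patches auto-promote, minor and major bumps need a human) can be expressed as a small policy function. This is a minimal sketch, assuming plain MAJOR.MINOR.PATCH versions without pre-release suffixes. The names `bump_type` and `needs_manual_approval` are hypothetical.

```python
def parse_version(version: str) -> tuple:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of three integers."""
    parts = version.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected MAJOR.MINOR.PATCH, got {version!r}")
    return tuple(int(p) for p in parts)


def bump_type(previous: str, candidate: str) -> str:
    """Classify the change from previous to candidate as major, minor, or patch."""
    prev, cand = parse_version(previous), parse_version(candidate)
    if cand <= prev:
        raise ValueError(f"{candidate} is not newer than {previous}")
    if cand[0] != prev[0]:
        return "major"
    if cand[1] != prev[1]:
        return "minor"
    return "patch"


def needs_manual_approval(previous: str, candidate: str) -> bool:
    """Minor and major bumps require a release engineer; patches do not."""
    return bump_type(previous, candidate) in {"minor", "major"}


if __name__ == "__main__":
    print(needs_manual_approval("1.4.2", "1.4.3"))
    print(needs_manual_approval("1.4.2", "1.5.0"))
```

Keeping the rule in code rather than in a reviewer's memory means the pipeline can decide for itself whether to pause for approval.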
# GitHub Actions promotion job: staging to canary gate
# .github/workflows/promote.yaml (excerpt)
promote-to-canary:
  needs: staging-tests
  environment: canary   # GitHub environment with required reviewers
  runs-on: ubuntu-latest
  steps:
    - name: Verify no high or critical CVEs
      run: |
        grype "123456789.dkr.ecr.us-east-1.amazonaws.com/orion-api:${{ env.VERSION }}" \
          --fail-on high
    - name: Update Helm values (canary weight 5%)
      run: |
        helm upgrade orion-canary oci://123456789.dkr.ecr.us-east-1.amazonaws.com/charts/orion \
          --version "${{ env.CHART_VERSION }}" \
          --set image.tag="${{ env.VERSION }}" \
          --set canary.enabled=true \
          --set canary.weight=5 \
          --namespace production \
          --wait --timeout 5m
    - name: Run SLO gate (30 min bake)
      run: |
        # Query Datadog/Prometheus for error rate delta
        python scripts/slo_gate.py \
          --service orion-api \
          --version "${{ env.VERSION }}" \
          --window 30m \
          --max-error-rate-delta 0.1 \
          --max-p99-latency-delta 50
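The workflow calls scripts/slo_gate.py without showing it. The comparison at its core might look like the sketch below, which is not the actual script. It assumes the two thresholds are expressed in percentage points of error rate and milliseconds of p99 latency, and it takes the metrics as plain values instead of querying an observability backend. The `Metrics` type and `slo_gate` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Metrics:
    error_rate_pct: float   # errors as a percentage of requests (assumed unit)
    p99_latency_ms: float   # 99th percentile latency in milliseconds (assumed unit)


def slo_gate(
    baseline: Metrics,
    canary: Metrics,
    max_error_rate_delta: float = 0.1,
    max_p99_latency_delta: float = 50.0,
) -> List[str]:
    """Return a list of regressions; an empty list means the gate passes."""
    failures = []
    error_delta = canary.error_rate_pct - baseline.error_rate_pct
    if error_delta > max_error_rate_delta:
        failures.append(
            f"error rate regressed by {error_delta:.2f} points "
            f"(limit {max_error_rate_delta})"
        )
    latency_delta = canary.p99_latency_ms - baseline.p99_latency_ms
    if latency_delta > max_p99_latency_delta:
        failures.append(
            f"p99 latency regressed by {latency_delta:.0f} ms "
            f"(limit {max_p99_latency_delta})"
        )
    return failures


if __name__ == "__main__":
    baseline = Metrics(error_rate_pct=0.20, p99_latency_ms=180.0)
    canary = Metrics(error_rate_pct=0.50, p99_latency_ms=260.0)
    problems = slo_gate(baseline, canary)
    for problem in problems:
        print(problem)
    # A non-zero exit code is what fails the CI step and blocks promotion.
    raise SystemExit(1 if problems else 0)
```

Only regressions count: a canary that is faster or more reliable than the baseline produces negative deltas and passes.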

Step 4: Codify the Release Runbook

Every step a human must perform should live in a versioned runbook, not in someone's head. Write it as a RELEASING.md at the repo root. The key sections for each component type:

  • API (CD): Merge PR to main → CI builds and pushes image → ArgoCD/Flux promotes through dev/staging → canary gate → full prod (automatic for patches, manual approval for minor/major).
  • SDK and CLI (monthly train): On the 1st of each month, create release/vX.Y branch from main. Stabilize for one week (fixes only). Tag vX.Y.0 and push. GoReleaser runs automatically on the tag push event.
  • Helm chart: Bump appVersion in Chart.yaml to match the API version being packaged. Bump version for any chart-level changes. Package and push to ECR OCI repo.
Automate version bumps with Conventional Commits + release-please: Install the release-please GitHub Action on each component repo. On every merge to main it reads your conventional commits and opens a Release PR with the correct semver bump, updated CHANGELOG.md, and version file changes. Merging that PR triggers the release. This removes manual version decisions from most releases, provided commit messages follow the Conventional Commits format.
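# .github/workflows/release-please.yaml: attach to each component repo
name: release-please
on:
  push:
    branches: [main]
# release-please creates tags, GitHub releases, and the Release PR
permissions:
  contents: write
  pull-requests: write
jobs:
  release-please:
    runs-on: ubuntu-latest
    steps:
      # v4 of the action is published under the googleapis organization
      - uses: googleapis/release-please-action@v4
        id: release
        with:
          release-type: go   # or: node, python, ruby, simple
          token: ${{ secrets.GITHUB_TOKEN }}
      # Only run on an actual release (not just a Release PR update)
      - uses: actions/checkout@v4
        if: ${{ steps.release.outputs.release_created }}
      - name: Build and push artifact
        if: ${{ steps.release.outputs.release_created }}
        env:
          # tag_name typically carries the leading "v" (for example v1.4.2)
          VERSION: ${{ steps.release.outputs.tag_name }}
        run: |
          make build-and-push VERSION="${VERSION}"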
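The idea behind deriving a bump from commit messages can be illustrated in a few lines. This is a deliberately simplified sketch of the Conventional Commits convention (breaking change → major, `feat` → minor, `fix` → patch), not release-please's actual implementation. The function names `bump_for` and `next_version` are hypothetical.

```python
import re
from typing import Iterable, Optional

# Header form: type, optional (scope), optional "!" for breaking, then ": text".
HEADER = re.compile(r"(?P<type>[a-z]+)(?:\([^)]*\))?(?P<breaking>!)?: .+")
RANK = {"patch": 1, "minor": 2, "major": 3}


def bump_for(messages: Iterable[str]) -> Optional[str]:
    """Return the highest bump implied by the commit messages, or None."""
    best = None
    for message in messages:
        lines = message.splitlines()
        match = HEADER.fullmatch(lines[0]) if lines else None
        if match is None:
            continue  # not a conventional commit; ignored in this sketch
        breaking_footer = any(
            line.startswith(("BREAKING CHANGE:", "BREAKING-CHANGE:"))
            for line in lines[1:]
        )
        if match.group("breaking") or breaking_footer:
            level = "major"
        elif match.group("type") == "feat":
            level = "minor"
        elif match.group("type") == "fix":
            level = "patch"
        else:
            continue  # docs, chore, refactor, etc. imply no release here
        if best is None or RANK[level] > RANK[best]:
            best = level
    return best


def next_version(current: str, messages: Iterable[str]) -> str:
    """Apply the implied bump to a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in current.split("."))
    level = bump_for(messages)
    if level == "major":
        return f"{major + 1}.0.0"
    if level == "minor":
        return f"{major}.{minor + 1}.0"
    if level == "patch":
        return f"{major}.{minor}.{patch + 1}"
    return current


if __name__ == "__main__":
    print(next_version("1.4.2", ["feat(api): add export endpoint", "fix: typo"]))
```

The highest-ranked commit wins: one breaking change among a hundred fixes still forces a major bump, which is why the production gate treats major releases differently.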

Step 5: Document the Compatibility Matrix

When you have multiple independently-versioned components, you must publish a compatibility matrix so consumers know which versions work together. This is especially critical for the Helm chart, which ships the API image version as its appVersion.

# docs/compatibility.md (machine-readable YAML block for tooling)
# orion-compatibility-matrix:
#   - cli: "1.9.x"
#     sdk: "2.1.x"
#     api: ">=1.4.0, <2.0.0"
#     helm: "1.4.x"
#   - cli: "1.8.x"
#     sdk: "2.0.x"
#     api: ">=1.3.0, <1.5.0"
#     helm: "1.3.x"
#
# CI job: validate-compat runs on every release to check matrix coherence
# Scripts parse this YAML and run contract tests between declared-compatible pairs
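A tooling script can answer "is this combination supported?" directly from the matrix. The sketch below hard-codes the two rows above rather than parsing the YAML, and supports only the two constraint forms that appear in them: `X.Y.x` wildcards and comma-separated comparison ranges. The names `satisfies` and `is_supported` are hypothetical.

```python
import operator
from typing import Dict

# Two-character operators are listed first so ">=" is not read as ">".
OPERATORS = [
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
]

# The two rows of the compatibility matrix, copied by hand for this sketch.
MATRIX = [
    {"cli": "1.9.x", "sdk": "2.1.x", "api": ">=1.4.0, <2.0.0", "helm": "1.4.x"},
    {"cli": "1.8.x", "sdk": "2.0.x", "api": ">=1.3.0, <1.5.0", "helm": "1.3.x"},
]


def parse_version(version: str) -> tuple:
    return tuple(int(part) for part in version.split("."))


def satisfies(version: str, spec: str) -> bool:
    """Check a concrete version against a wildcard or range constraint."""
    actual = parse_version(version)
    for clause in (c.strip() for c in spec.split(",")):
        if clause.endswith(".x"):
            prefix = parse_version(clause[:-2])
            if actual[: len(prefix)] != prefix:
                return False
            continue
        for symbol, compare in OPERATORS:
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):].strip())
                if not compare(actual, bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported constraint: {clause!r}")
    return True


def is_supported(versions: Dict[str, str]) -> bool:
    """True if one matrix row accepts every component version supplied."""
    return any(
        all(satisfies(versions[name], spec)
            for name, spec in row.items() if name in versions)
        for row in MATRIX
    )


if __name__ == "__main__":
    print(is_supported({"cli": "1.8.3", "api": "1.4.9"}))
    print(is_supported({"cli": "1.7.0", "api": "1.4.2"}))
```

Components omitted from the query are not checked, so the same function can answer a CLI-versus-API question without needing the SDK or chart version.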
The most common release design failure: teams build a great release pipeline for their main service and leave the SDK, CLI, and Helm chart on ad-hoc manual processes. Six months later a customer files a P1 because CLI v1.7 is incompatible with the API it ships against, and no one knows which version combination is supported. Define the compatibility matrix on day one, before you need it.

What a Complete Release Design Delivers

When you finish this design exercise for your own product, you should be able to answer every one of these questions without looking at a dashboard:

  1. What is the current version of every component in production?
  2. Which git commit is running in production right now?
  3. What changed between the last two production releases?
  4. How do I roll back orion-api to the previous patch version without touching the SDK or CLI?
  5. What is the supported version matrix between the CLI and the API?
  6. Who approved the last production deploy and when?
  7. How do I ship a hotfix to production in the next 30 minutes?

If your release process cannot answer all seven questions, it is incomplete. Build until it can.