Backstage & Service Catalogs
Spotify open-sourced Backstage in 2020 after building it internally to tame an estate of 2,000+ microservices and hundreds of infrastructure components, maintained by 1,600+ engineers, that had no single pane of glass. The platform became the de facto standard for Internal Developer Platforms (IDPs) and was promoted to a CNCF Incubating project in 2022. At its core, Backstage rests on three loosely coupled pillars: the Software Catalog, Software Templates, and TechDocs. This lesson covers exactly those three pillars at the depth required to operate them in production.
The Software Catalog
The catalog is a living registry of every entity your organization owns: services, libraries, websites, pipelines, APIs, resources (S3 buckets, RDS clusters), systems, and domains. Each entity is described by a YAML file — called a catalog descriptor — that lives alongside the code it describes.
Every descriptor follows a common schema with apiVersion, kind, metadata, and spec. The metadata.annotations block is where Backstage plugins read their configuration — PagerDuty service IDs, Datadog dashboard links, GitHub Actions workflow paths, ArgoCD app names, and so on.
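As a rough illustration of that schema, a minimal catalog-info.yaml might look like the sketch below. The service name, team name, and PagerDuty ID are hypothetical placeholders, and the exact annotation keys depend on which plugins are installed, so check each plugin's documentation before copying them.

```yaml
# catalog-info.yaml (sketch; names and IDs are placeholders)
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payments-api                  # hypothetical service name
  description: Handles card payments
  annotations:
    # Each key is read by a specific plugin; values here are illustrative.
    github.com/project-slug: acme-corp/payments-api
    pagerduty.com/service-id: PXXXXXX
    argocd/app-name: payments-api
    backstage.io/techdocs-ref: dir:.
spec:
  type: service
  lifecycle: production
  owner: group:default/payments-team  # must resolve to a Group or User entity
```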
Backstage discovers descriptors through catalog providers. The GitHub provider can ingest every catalog-info.yaml across all repositories in an org in minutes. The URL provider handles one-off registrations. At scale, most teams configure auto-discovery so that creating a new repo with a catalog-info.yaml automatically registers the entity within the next sync cycle (default: 5 minutes).
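A sketch of what org-wide auto-discovery could look like in app-config.yaml, assuming the GitHub catalog provider module is installed in the backend. The provider ID (acmeOrg), organization name, and token variable are assumptions for illustration; the schedule simply mirrors the 5-minute cycle mentioned above.

```yaml
# app-config.yaml (sketch; provider ID and org name are hypothetical)
integrations:
  github:
    - host: github.com
      token: ${GITHUB_TOKEN}        # injected from the environment

catalog:
  providers:
    github:
      acmeOrg:                      # arbitrary provider ID
        organization: 'acme-corp'
        catalogPath: '/catalog-info.yaml'
        filters:
          branch: 'main'
          repository: '.*'          # regex: scan every repository in the org
        schedule:
          frequency: { minutes: 5 } # how often discovery runs
          timeout: { minutes: 3 }   # abort a run that takes longer than this
```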
The spec.owner field maps to a Group or User entity. If the referenced owner does not exist in the catalog, Backstage shows a warning on the entity page that the relation cannot be resolved. Keep your Group descriptors in a dedicated org/ repository, or sync them from your IdP (Okta, Azure AD, Google Workspace) via the relevant catalog provider, so that there is a single canonical source of truth for org structure.
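For teams that maintain org data by hand rather than through an IdP provider, the owner entities could be declared roughly as follows. The team and user names are hypothetical.

```yaml
# org/payments-team.yaml (sketch; all names are placeholders)
apiVersion: backstage.io/v1alpha1
kind: Group
metadata:
  name: payments-team
spec:
  type: team
  parent: engineering   # optional link to a parent Group
  children: []          # required field, even when empty
---
apiVersion: backstage.io/v1alpha1
kind: User
metadata:
  name: jdoe
spec:
  memberOf: [payments-team]
```

A component would then reference the group as `owner: group:default/payments-team`.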
The catalog's power multiplies through relations. When a component declares dependsOn, Backstage builds a bidirectional graph. On the component's page, engineers instantly see upstream/downstream dependencies, the owning team's on-call schedule, the last 72 hours of incidents, recent deployments, and open pull requests — all aggregated from plugins reading those annotations. This is the full-context view that eliminates the "where is this thing documented?" question.
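A minimal sketch of how those relations are declared, using hypothetical entity names. Backstage derives the reverse edge (dependencyOf) on the target entities, so only one side needs to state the dependency.

```yaml
# Relations section of a Component descriptor (sketch; names are placeholders)
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: payments-api
spec:
  type: service
  lifecycle: production
  owner: group:default/payments-team
  system: payments
  dependsOn:
    - component:default/ledger-service   # another service
    - resource:default/payments-db       # e.g. an RDS cluster
  providesApis:
    - payments-rest-api                  # an API entity defined elsewhere
```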
Software Templates (Scaffolder)
Templates are the mechanism behind golden paths. An engineer selects a template, fills in a short form, and Backstage creates a repository pre-configured with your company's CI pipeline, Dockerfile, Helm chart, Datadog monitors, PagerDuty service, GitHub branch protection rules, and catalog-info.yaml — all wired up and ready for the first commit. This is sometimes called Day-0 automation.
Templates are themselves YAML descriptors with kind: Template. They declare an input schema (the form fields), a set of steps that the Backstage scaffolder executes server-side, and an output block linking to the created resources.
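The three parts (input schema, steps, output) can be sketched as below. This is an illustrative template, not a complete golden path: the template name, owner, and form fields are assumptions, and it relies on the built-in fetch:template, publish:github, and catalog:register actions being available in your scaffolder backend.

```yaml
# template.yaml (sketch; names and fields are hypothetical)
apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: go-microservice
  title: Go Microservice
spec:
  owner: group:default/platform-team
  type: service
  parameters:                    # input schema, rendered as the form
    - title: Service details
      required: [name, owner, repoUrl]
      properties:
        name:
          title: Name
          type: string
        owner:
          title: Owner
          type: string
          ui:field: OwnerPicker
        repoUrl:
          title: Repository location
          type: string
          ui:field: RepoUrlPicker
  steps:                         # executed server-side, in order
    - id: fetch
      name: Fetch skeleton
      action: fetch:template
      input:
        url: ./skeleton
        values:
          name: ${{ parameters.name }}
          owner: ${{ parameters.owner }}
    - id: publish
      name: Publish to GitHub
      action: publish:github
      input:
        repoUrl: ${{ parameters.repoUrl }}
        defaultBranch: main
    - id: register
      name: Register in catalog
      action: catalog:register
      input:
        repoContentsUrl: ${{ steps.publish.output.repoContentsUrl }}
        catalogInfoPath: /catalog-info.yaml
  output:                        # links shown when the run finishes
    links:
      - title: Repository
        url: ${{ steps.publish.output.remoteUrl }}
      - title: Open in catalog
        entityRef: ${{ steps.register.output.entityRef }}
```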
Keep templates in a dedicated scaffolder-templates repository, not inside the Backstage app repo. Reference them with absolute URLs (url: https://github.com/acme-corp/scaffolder-templates/tree/main/go-microservice/skeleton). This lets platform teams iterate on templates without redeploying Backstage, and consumers can pin to a specific tag for stability.
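One way to wire an external template repository into Backstage is a static catalog location, sketched below. The tag v1.4.0 and the template.yaml file name are hypothetical; pointing the URL at a tag instead of main is what provides the pinning.

```yaml
# app-config.yaml (sketch; tag and file name are assumptions)
catalog:
  locations:
    - type: url
      # Pinned to a tag so template changes roll out deliberately.
      target: https://github.com/acme-corp/scaffolder-templates/blob/v1.4.0/go-microservice/template.yaml
      rules:
        - allow: [Template]   # only Template entities may be loaded from here
```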
TechDocs
TechDocs solves the "docs rot in Confluence" problem by treating documentation as code. Engineers write Markdown in the repo (using MkDocs as the generator), CI publishes the rendered HTML to object storage (S3 or GCS), and Backstage serves it inline on the component's Docs tab. The annotation backstage.io/techdocs-ref: dir:. in the catalog descriptor tells Backstage where to find the mkdocs.yml.
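A minimal mkdocs.yml at the repository root might look like this sketch. The site name and page list are placeholders; the techdocs-core plugin is the MkDocs plugin that TechDocs builds expect.

```yaml
# mkdocs.yml (sketch; site name and nav entries are placeholders)
site_name: payments-api
docs_dir: docs            # Markdown sources live under ./docs
nav:
  - Home: index.md
  - Runbook: runbook.md
plugins:
  - techdocs-core
```

With `backstage.io/techdocs-ref: dir:.` set in the descriptor, Backstage looks for this file in the same directory as catalog-info.yaml.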
The TechDocs builder can run in two modes: local (Backstage builds docs on-demand) or external (CI builds and publishes to storage). Production deployments must use external mode — local mode blocks the Backstage Node process and cannot scale.
Set techdocs.builder: 'external' in app-config.yaml. Pair this with a CDN in front of the storage bucket to cut doc load times from ~800 ms to under 100 ms.
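External mode could be configured roughly as follows, assuming an S3 publisher. The bucket name and region are hypothetical. In this setup a CI job is expected to build and upload the site, typically with the techdocs-cli generate and publish commands, so that Backstage only reads from the bucket.

```yaml
# app-config.yaml (sketch; bucket name and region are placeholders)
techdocs:
  builder: 'external'        # Backstage never builds docs itself
  publisher:
    type: 'awsS3'
    awsS3:
      bucketName: 'acme-techdocs'
      region: 'us-east-1'
      # Credentials are usually taken from the environment or an IAM role.
```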
Production Deployment Considerations
At scale, Backstage is a Node.js application that can be resource-hungry. Key production settings to tune:
- Catalog refresh interval: default 5 minutes; increase to 15–30 minutes for orgs with 5,000+ entities to reduce GitHub API rate-limit pressure.
- Database: replace the default in-memory SQLite store with PostgreSQL (required for any deployment beyond a single pod).
- Authentication: integrate with your IdP (Okta, GitHub OAuth, Azure AD) using Backstage's auth backend. Keep guest access disabled in production.
- Plugin isolation: each Backstage plugin runs in the same Node process. A misbehaving plugin (e.g., one with an unhandled promise rejection) can crash the entire app. Pin plugin versions and use liveness/readiness probes to catch and restart quickly.
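The database and authentication settings from the list above could be expressed in app-config.yaml along these lines. The environment variable names are assumptions, and GitHub OAuth stands in for whichever IdP you use.

```yaml
# app-config.production.yaml (sketch; variable names are placeholders)
backend:
  database:
    client: pg                       # PostgreSQL instead of in-memory SQLite
    connection:
      host: ${POSTGRES_HOST}
      port: ${POSTGRES_PORT}
      user: ${POSTGRES_USER}
      password: ${POSTGRES_PASSWORD}

auth:
  environment: production
  providers:
    github:
      production:
        clientId: ${AUTH_GITHUB_CLIENT_ID}
        clientSecret: ${AUTH_GITHUB_CLIENT_SECRET}
```

For the probes, a Kubernetes container spec might resemble the fragment below. The image name is hypothetical, and the health paths assume a recent release on the new backend system; verify them against the version you run.

```yaml
# Fragment of a Deployment pod spec (sketch; image and paths are assumptions)
containers:
  - name: backstage
    image: acme-corp/backstage:1.0.0
    ports:
      - containerPort: 7007
    readinessProbe:
      httpGet:
        path: /.backstage/health/v1/readiness
        port: 7007
      periodSeconds: 10
    livenessProbe:
      httpGet:
        path: /.backstage/health/v1/liveness
        port: 7007
      initialDelaySeconds: 30        # give the backend time to start plugins
      periodSeconds: 10
```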
Track catalog coverage (the percentage of repositories with a catalog-info.yaml), template adoption (the percentage of new repos created via a golden-path template), and TechDocs coverage (the percentage of services with published docs). These three metrics are the most actionable indicators of IDP health and developer experience ROI.
The Software Catalog, Software Templates, and TechDocs are the foundation of everything else in Backstage. Once entities are registered and docs are live, every other plugin — cost visibility, security posture, deployment frequency, on-call load — simply reads entity annotations and enriches the same single pane of glass. The compounding effect is what justifies the infrastructure investment.