Jenkins & Enterprise CI/CD

Agents & Distributed Builds

The Jenkins controller (formerly "master") is the brain of your CI platform: it schedules jobs, manages state, and serves the UI. It should never execute build workloads. Every real build runs on an agent (formerly "slave"), a separate Java process that talks to the controller over the Remoting protocol, carried through SSH, a direct TCP port, or WebSocket. This split gives you isolation between builds and the controller, horizontal scalability, and the ability to match build environments to job requirements.

Static Agents

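A static agent is a persistent machine (VM, bare-metal server, or long-running container) registered in Jenkins under Manage Jenkins → Nodes. Either the controller connects to it over SSH and starts the agent process, or the agent dials back to the controller as an inbound agent (historically called JNLP) by running agent.jar. In both cases the Remoting channel stays open for as long as the node is online.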

Each static node is configured with:

  • Labels — space-separated tags (linux, docker, gpu-builder, windows). A pipeline's agent { label 'linux && docker' } expression selects matching nodes.
  • Executors — how many concurrent builds the node accepts (usually 1–2× CPU count).
  • Root directory — workspace root; use a fast local SSD, not NFS.
  • Availability — keep-online vs. on-demand (wake on job, disconnect after idle).
Static agents are simple but operationally expensive. You own patching, capacity planning, and cleanup. Reserve them for builds that genuinely need persistent state — e.g., a macOS node for iOS code-signing, or a Windows node for .NET Framework projects.

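Launch via SSH (the recommended method): Jenkins opens an SSH connection to the agent host, copies over the agent JAR, and starts it with java -jar. Store the private key that the controller uses to log in to the agent in Jenkins Credentials, install a compatible Java runtime on the agent host, and make sure the host is reachable from the controller on the SSH port (22 by default).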
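The node settings listed above can also be applied from the Jenkins script console instead of the UI. The following is a minimal sketch, assuming the SSH Build Agents plugin is installed; the host name, node name, root directory, and credentials ID are hypothetical placeholders, and constructor signatures may differ between plugin versions.

```groovy
// Script-console sketch: register a static SSH agent.
// All names, hosts, paths, and the credentials ID below are hypothetical.
import jenkins.model.Jenkins
import hudson.model.Node
import hudson.slaves.DumbSlave
import hudson.slaves.RetentionStrategy
import hudson.plugins.sshslaves.SSHLauncher
import hudson.plugins.sshslaves.verifiers.KnownHostsFileKeyVerificationStrategy

// The controller logs in with the key stored under this credentials ID.
def launcher = new SSHLauncher('build-mac-01.example.com', 22, 'agent-ssh-key')
// Verify the host key against the controller's known_hosts file.
launcher.setSshHostKeyVerificationStrategy(new KnownHostsFileKeyVerificationStrategy())

def node = new DumbSlave('macos-ios-01', '/Users/jenkins/agent', launcher)
node.setLabelString('macos ios-build')                     // Labels (space-separated)
node.setNumExecutors(1)                                    // Executors
node.setMode(Node.Mode.EXCLUSIVE)                          // only run jobs that ask for these labels
node.setRetentionStrategy(new RetentionStrategy.Always())  // Availability: keep online

Jenkins.get().addNode(node)
```

Running a script like this requires administrator rights; for a fleet of nodes, a configuration-as-code approach is usually easier to review than ad hoc scripts.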

Dynamic Agents

Dynamic agents are provisioned on demand and destroyed after the build completes. The common back-ends are:

  • Kubernetes Plugin — spins up a Pod per build, runs the build inside a container, tears it down. This is the standard model for cloud-native Jenkins.
  • EC2 Plugin / Azure VM Agents / Google Compute Plugin — provisions a cloud VM, runs the build, terminates the instance. Useful when you need full OS-level isolation or heavyweight tools.
  • Docker Plugin — starts a container on a Docker daemon host per build. Simpler than Kubernetes but ties you to a single Docker host.

Container Agents: The Kubernetes Plugin

Many large engineering organizations run Jenkins agents as Kubernetes Pods. With the Kubernetes plugin you define a PodTemplate (a Pod spec fragment) for each type of build environment. When a pipeline requests a matching label, the plugin calls the Kubernetes API to create a Pod from that template, the jnlp container dials back to the controller, and build steps execute inside the containers the pipeline names.

[Figure: Jenkins Distributed Build Fleet: Controller, Static Agents, and Kubernetes Dynamic Agents. The Jenkins controller (scheduler, state, UI) reaches static agents over SSH / JNLP: a macOS node (labels: macos ios-build) and a Windows node (labels: windows dotnet). Through the k8s API it provisions build Pods in a Kubernetes cluster: jnlp + maven:3.9 (label: java-build) and jnlp + node:20 (label: node-build). New Pods are provisioned on demand, dial back to the controller, and are destroyed after the build completes.]
Jenkins agent fleet: static nodes for platform-specific builds; Kubernetes Pods provisioned per build for all standard workloads.

A minimal Kubernetes PodTemplate in a declarative pipeline looks like this:

    // Declarative Pipeline with a Kubernetes agent
    pipeline {
      agent {
        kubernetes {
          label 'java-build'
          defaultContainer 'maven'
          yaml """
            apiVersion: v1
            kind: Pod
            spec:
              serviceAccountName: jenkins-agent
              containers:
              - name: jnlp
                image: jenkins/inbound-agent:3256.v88a_f6e922152-1
                resources:
                  requests: { cpu: "100m", memory: "256Mi" }
              - name: maven
                image: maven:3.9.6-eclipse-temurin-21
                command: ["sleep", "infinity"]   # keep the container alive for build steps
                resources:
                  requests: { cpu: "500m", memory: "1Gi" }
                  limits: { cpu: "2", memory: "2Gi" }
              - name: kaniko
                # the -debug tag includes a busybox shell, which the sh step requires
                image: gcr.io/kaniko-project/executor:v1.23.0-debug
                command: ["/busybox/cat"]
                tty: true
          """
        }
      }
      stages {
        stage('Build') {
          steps {
            container('maven') {
              sh 'mvn -B -ntp package -DskipTests'
            }
          }
        }
        stage('Docker Build & Push') {
          steps {
            container('kaniko') {
              // WORKSPACE and BUILD_NUMBER are expanded by the shell, not by Groovy
              sh '''
                /kaniko/executor \
                  --context=dir://${WORKSPACE} \
                  --destination=registry.example.com/myapp:${BUILD_NUMBER} \
                  --cache=true
              '''
            }
          }
        }
      }
    }

Use Kaniko, not Docker-in-Docker (DinD), for container image builds inside Kubernetes agents. DinD requires a privileged Pod, which is a serious security risk in multi-tenant clusters. Kaniko builds OCI images entirely in userspace without a Docker daemon. Use the -debug variant of the Kaniko image for pipeline steps: the default image contains no shell, so a sh step cannot run in it.

Labeling Strategy at Scale

Labels are how the controller matches jobs to capacity. A coherent labeling taxonomy prevents the "works on my build node" class of failures:

  • OS / platform: linux, windows, macos-arm64
  • Runtime: java17, node20, python311
  • Capability: docker, gpu, large-mem (for linking or ML workloads)
  • Environment: prod-deploy (restricted nodes with cloud credentials)

In the pipeline you combine labels with boolean operators:

    // Run on a Linux node that has both Docker and the python311 runtime
    agent { label 'linux && docker && python311' }

    // Run on macOS OR Windows: the build lands on ONE matching node, whichever is free
    agent { label 'macos-arm64 || windows' }
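Note that an || expression selects a single node that satisfies either label; it does not fan the build out to both platforms. To run the same stages on every platform, declarative pipelines offer the matrix directive. A minimal sketch, assuming nodes exist with the labels from the taxonomy above (the axis name PLATFORM and the stage names are illustrative choices):

```groovy
// Sketch: run the same test stage once per platform label.
pipeline {
    agent none
    stages {
        stage('Test') {
            matrix {
                axes {
                    axis {
                        name 'PLATFORM'   // illustrative axis name
                        values 'linux', 'windows', 'macos-arm64'
                    }
                }
                // Each matrix cell gets its own agent, chosen by the axis value.
                agent { label "${PLATFORM}" }
                stages {
                    stage('Run tests') {
                        steps {
                            echo "Testing on ${PLATFORM}"
                        }
                    }
                }
            }
        }
    }
}
```

Each cell runs in parallel on its own node, so the fleet needs a free executor for every platform in the axis.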

PodTemplate Reuse via the Kubernetes Plugin UI or Shared Libraries

Defining yaml: """...""" inline in every Jenkinsfile leads to configuration drift. The production pattern is to define canonical PodTemplate objects in one place: either in the Kubernetes plugin's global configuration (Manage Jenkins → Clouds → Kubernetes → Pod Templates) or, better, in a Shared Library as a helper function that is versioned and reviewed like any other code. With globally configured templates, individual pipelines request agent { label 'java-build' } and pick up the centrally managed spec; with a Shared Library, they call the helper instead.
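One possible shape for the Shared Library variant is a custom step that wraps the Kubernetes plugin's podTemplate step. This is a sketch under assumptions: the step name javaBuildPod, the resource path podtemplates/java-build.yaml, and the library name ci-shared-library are all hypothetical, and the YAML file is assumed to contain a Pod spec with a maven container.

```groovy
// vars/javaBuildPod.groovy in the shared library (hypothetical step name)
def call(Closure body) {
    // Load the canonical Pod spec from the library's resources/ directory.
    def podYaml = libraryResource('podtemplates/java-build.yaml')
    podTemplate(yaml: podYaml) {
        // POD_LABEL is generated by the Kubernetes plugin for this template.
        node(POD_LABEL) {
            body()
        }
    }
}
```

A scripted Jenkinsfile could then use the step without repeating any Pod YAML:

```groovy
// Jenkinsfile (library name is a placeholder)
@Library('ci-shared-library') _

javaBuildPod {
    checkout scm
    container('maven') {
        sh 'mvn -B -ntp package'
    }
}
```

Because the Pod spec lives in the library, changing an image tag or resource limit is a single reviewed commit rather than an edit to every repository.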

Controller Isolation: Always Use agent none

Never let the top-level pipeline default to running on the controller. Declare agent none at the top level and assign a specific agent to each stage, so that no stage silently inherits a default node. If a build step crashes or leaks files onto the controller, it can corrupt build metadata, exhaust disk space, and take down the entire CI platform. agent none only governs the pipeline that declares it; to enforce the rule for every job, also set the number of executors on the controller's built-in node to 0.

    pipeline {
      agent none   // no global agent: every stage must choose its own
      stages {
        stage('Build') {
          agent { label 'linux && docker' }
          steps { sh 'make build' }
        }
        stage('Deploy') {
          agent { label 'prod-deploy' }
          steps { sh './deploy.sh' }
        }
      }
    }
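Since agent none protects only the pipelines that remember to declare it, a controller-side guard is a useful complement. The sketch below, meant for the script console and requiring administrator rights, removes all executors from the controller's built-in node so that no job can be scheduled there:

```groovy
// Script-console sketch: stop the controller from running builds.
import jenkins.model.Jenkins

def jenkins = Jenkins.get()
jenkins.setNumExecutors(0)   // built-in node accepts no builds
jenkins.save()               // persist the change to disk
```

The same setting is available in the UI on the built-in node's configuration page.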

Production Failure Modes

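  • Agent offline during build: the job hangs waiting for a reconnect. Put a timeout on the pipeline or stage so a lost agent cannot block the queue indefinitely, and configure connection timeouts and retry limits in the cloud plugin.
  • Workspace accumulation on static agents: old workspaces fill the disk. Call cleanWs() (Workspace Cleanup plugin) in a post block, and schedule a nightly job on each node to delete stale workspaces.
  • Pod eviction mid-build: Kubernetes may evict a Pod under node resource pressure. Set requests equal to limits on critical build Pods so they get the Guaranteed QoS class and are among the last candidates for eviction, and give them a dedicated PriorityClass (system-cluster-critical is meant for cluster components, not build workloads). PodDisruptionBudgets help only with voluntary disruptions such as node drains.
  • Image pull latency: large agent images (maven:3.9 is 500 MB+) cause cold-start delays; pre-pull images onto nodes with a DaemonSet or use a pull-through registry cache.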
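Two of these mitigations live in the Jenkinsfile itself. The sketch below shows a pipeline-level timeout, which bounds how long a build can hang on a lost agent, and a cleanWs() call that runs on the stage's agent after every build. The 45-minute value is an arbitrary placeholder, and cleanWs() assumes the Workspace Cleanup plugin is installed.

```groovy
// Sketch: bound hangs with a timeout and clean the workspace after each build.
pipeline {
    agent none
    options {
        // Abort the build if it runs (or hangs) longer than this; value is a placeholder.
        timeout(time: 45, unit: 'MINUTES')
    }
    stages {
        stage('Build') {
            agent { label 'linux && docker' }
            steps {
                sh 'make build'
            }
            post {
                always {
                    // Runs on the stage's agent whether the build passed or failed.
                    cleanWs()
                }
            }
        }
    }
}
```

Cleaning in post keeps disk usage flat on static agents; on per-build Pods the workspace disappears with the Pod, so the step matters less there.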

Sizing Agents Correctly

Request only what your build actually needs, but set limits to prevent noisy-neighbor issues. Profile your builds by sampling kubectl top pod (which requires metrics-server) repeatedly during representative runs, then set requests near the p50 of the measured usage and limits near the p99. This reduces the risk of both under-provisioning (OOMKilled builds, CPU throttling) and over-provisioning (wasted cluster capacity).