Jenkins & Enterprise CI/CD

Agents & Distributed Builds

The Jenkins controller (formerly "master") is the brain of your CI platform: it schedules jobs, manages state, and serves the UI. It should never execute build workloads. Every real build runs on an agent (formerly "slave"), a separate process that talks to the controller over the Remoting protocol. Either side can open the connection: the controller launches the agent over SSH, or the agent dials in over TCP or WebSocket. This gives you isolation from the controller, horizontal scalability, and the ability to match build environments to job requirements.

Static Agents

A static agent is a persistent machine (VM, bare-metal server, or long-running container) registered in Jenkins under Manage Jenkins → Nodes. Either the controller SSHes into it and starts agent.jar, or the agent dials back to the controller as an inbound (formerly "JNLP") agent over TCP or WebSocket. In both cases a long-lived Remoting channel stays open while the node is online.

Each static node is configured with:

Labels — space-separated tags (linux, docker, gpu-builder, windows). A pipeline's agent { label 'linux && docker' } expression selects matching nodes.
Executors — how many concurrent builds the node accepts (usually 1–2× CPU count).
Root directory — workspace root; use a fast local SSD, not NFS.
Availability — keep-online vs. on-demand (wake on job, disconnect after idle).

Static agents are simple but operationally expensive. You own patching, capacity planning, and cleanup. Reserve them for builds that genuinely need persistent state — e.g., a macOS node for iOS code-signing, or a Windows node for .NET Framework projects.

Launch via SSH (the recommended method): Jenkins opens an SSH connection to the agent host and runs java -jar agent.jar there, so the host needs a compatible Java runtime installed. Store the SSH private key in Jenkins Credentials and make sure the agent host is reachable from the controller on its SSH port (22 by default).

Dynamic Agents

Dynamic agents are provisioned on demand and destroyed after the build completes. The three common back-ends are:

Kubernetes Plugin: spins up a Pod per build, runs the build inside a container, then tears the Pod down. This is the standard model for cloud-native Jenkins.
EC2 Plugin / Azure VM Agents / Google Compute Plugin: provisions a cloud VM, runs the build, then terminates the instance. Useful when you need full OS-level isolation or heavyweight tools.
Docker Plugin: starts a container per build on a Docker daemon host. Simpler than Kubernetes, but capacity is limited to the Docker hosts you configure.

Container Agents: The Kubernetes Plugin

At scale, many large engineering orgs run Jenkins agents as Kubernetes Pods. You define a PodTemplate, a Pod spec fragment, for each type of build environment. When a pipeline requests a matching label, the Kubernetes plugin calls the Kubernetes API to create a Pod from that template, the jnlp container dials back to the controller, and build steps execute inside the specified containers.

Jenkins agent fleet: static nodes for platform-specific builds; Kubernetes Pods provisioned per build for all standard workloads.

A minimal Kubernetes PodTemplate in a declarative pipeline looks like this:

// Declarative Pipeline with Kubernetes agent
pipeline {
    agent {
        kubernetes {
            label 'java-build'
            defaultContainer 'maven'
            yaml """
apiVersion: v1
kind: Pod
spec:
  serviceAccountName: jenkins-agent
  containers:
    - name: jnlp
      image: jenkins/inbound-agent:3256.v88a_f6e922152-1
      resources:
        requests: { cpu: "100m", memory: "256Mi" }
    - name: maven
      image: maven:3.9.6-eclipse-temurin-21
      command: ["sleep", "infinity"]
      resources:
        requests: { cpu: "500m", memory: "1Gi" }
        limits:   { cpu: "2",    memory: "2Gi" }
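    # Kaniko's plain executor image has no shell; the -debug tag adds BusyBox
    # under /busybox, which both the sleep command and the sh step rely on.
    - name: kaniko
      image: gcr.io/kaniko-project/executor:v1.23.0-debug
      command: ["sleep", "9999999"]
"""
        }
    }
    stages {
        stage('Build') {
            steps {
                container('maven') {
                    sh 'mvn -B -ntp package -DskipTests'
                }
            }
        }
        stage('Docker Build & Push') {
            steps {
                container('kaniko') {
                    // The image has no /bin/sh, so point the script at BusyBox's shell.
                    sh '''#!/busybox/sh
                    /kaniko/executor \
                      --context=dir://${WORKSPACE} \
                      --destination=registry.example.com/myapp:${BUILD_NUMBER} \
                      --cache=true
                    '''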
                }
            }
        }
    }
}

Use Kaniko, not Docker-in-Docker (DinD), for container image builds inside Kubernetes agents. DinD requires a privileged Pod — a serious security footgun in multi-tenant clusters. Kaniko builds OCI images entirely in userspace without a Docker daemon.

Labeling Strategy at Scale

Labels are how the controller matches jobs to capacity. A coherent labeling taxonomy prevents the "works on my build node" class of failures:

OS / platform — linux, windows, macos-arm64
Runtime — java17, node20, python311
Capability — docker, gpu, large-mem (for linking or ML workloads)
Environment — prod-deploy (restricted nodes with cloud credentials)

In the pipeline you combine labels with boolean operators:

// Run on a Linux node that has both Docker and the python311 runtime
agent { label 'linux && docker && python311' }

// Run on ONE node that is either macOS or Windows, whichever is available (this does not run on both)
agent { label 'macos-arm64 || windows' }
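
A label expression with || selects a single matching node. To run the same stage on every platform, Declarative Pipeline's matrix directive can fan out over an axis instead. The following is a minimal sketch, assuming nodes carry the labels from the taxonomy above; the axis name PLATFORM and the stage names are chosen here purely for illustration.

```groovy
// Sketch: run one test stage per platform label.
// PLATFORM is a hypothetical axis name; the label values assume the taxonomy above.
pipeline {
    agent none
    stages {
        stage('Cross-platform tests') {
            matrix {
                axes {
                    axis {
                        name 'PLATFORM'
                        values 'linux', 'macos-arm64', 'windows'
                    }
                }
                // Each matrix cell gets its own agent, selected by the axis value.
                agent { label "${PLATFORM}" }
                stages {
                    stage('Test') {
                        steps {
                            // echo works on every OS; sh would fail on a Windows node.
                            echo "Running tests on ${PLATFORM}"
                        }
                    }
                }
            }
        }
    }
}
```

Each cell runs in parallel by default, so the fleet needs at least one free executor per platform for the matrix to make progress.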

PodTemplate Reuse via the Kubernetes Plugin UI or Shared Libraries

Defining yaml: """...""" inline in every Jenkinsfile leads to configuration drift. The production pattern is to define canonical PodTemplate objects either in the Kubernetes plugin's global configuration (Manage Jenkins → Clouds → Kubernetes → Pod Templates) or — better — in a Shared Library as a helper function. Individual pipelines then call agent { label 'java-build' } and pick up the centrally-managed spec.
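
A Shared Library helper for this pattern might look like the sketch below. It uses the Kubernetes plugin's scripted podTemplate step and its POD_LABEL variable, plus the libraryResource step. The library name ci-shared, the helper name javaBuildPod, and the resource path are hypothetical, and the YAML file is assumed to hold the same Pod spec shown earlier.

```groovy
// File: vars/javaBuildPod.groovy in the Shared Library (hypothetical name and path)
def call(Closure body) {
    // Load the canonical Pod spec from resources/podtemplates/java-build.yaml (assumed to exist).
    String podYaml = libraryResource('podtemplates/java-build.yaml')
    podTemplate(yaml: podYaml) {
        // POD_LABEL is generated by the Kubernetes plugin for this template.
        node(POD_LABEL) {
            body()
        }
    }
}

// File: Jenkinsfile in an application repository
// @Library('ci-shared') _          // 'ci-shared' is an assumed library name
// javaBuildPod {
//     checkout scm
//     container('maven') {
//         sh 'mvn -B -ntp package'
//     }
// }
```

Keeping the Pod spec in one versioned file means an image bump or resource change is a single reviewed commit rather than an edit to every Jenkinsfile.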

Controller Isolation: Always Use `agent none`

Never let the top-level pipeline default to running on the controller. Declare agent none at the top level and assign a specific agent to each stage, and set the built-in node's executor count to 0 so that nothing can be scheduled on the controller at all. If a build step crashes or leaks files onto the controller, it can corrupt build metadata, exhaust disk space, and take down the entire CI platform.

pipeline {
    agent none          // controller executes NO build steps
    stages {
        stage('Build') {
            agent { label 'linux && docker' }
            steps { sh 'make build' }
        }
        stage('Deploy') {
            agent { label 'prod-deploy' }
            steps { sh './deploy.sh' }
        }
    }
}
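
Note that agent none only governs the pipeline that declares it; other jobs can still land on the controller if it has executors. One way to close that gap is to set the built-in node's executor count to zero. The snippet below is a sketch for the Script Console (Manage Jenkins → Script Console) or an init script, and assumes you have administrator access.

```groovy
import jenkins.model.Jenkins

// Give the built-in node (the controller) zero executors so no build can be scheduled on it.
def jenkins = Jenkins.get()
jenkins.setNumExecutors(0)
jenkins.save()

println "Built-in node executors: ${jenkins.numExecutors}"
```

The same setting is available in the UI under the built-in node's configuration as "Number of executors".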

Production Failure Modes

Agent offline during build: the job hangs waiting for a reconnect. Bound the wait with a pipeline or stage timeout and configure retry limits in the cloud plugin.
Workspace accumulation on static agents: old workspaces fill the disk. Install the Workspace Cleanup plugin, call cleanWs() in a post block, and schedule a periodic job on each node to purge stale workspaces.
Pod eviction mid-build: Kubernetes may evict a Pod under node resource pressure. Set requests equal to limits on critical build Pods so they get the Guaranteed QoS class, and give them a dedicated PriorityClass rather than system-cluster-critical, which is intended for cluster components. PodDisruptionBudgets help only with voluntary disruptions such as node drains, not with pressure evictions.
Image pull latency: large agent images (maven:3.9 is 500 MB+) cause cold-start delays. Pre-pull images onto nodes with a DaemonSet or use a pull-through registry cache.
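
Two of these mitigations can be expressed directly in the Jenkinsfile. The sketch below assumes the Workspace Cleanup plugin is installed; the 45-minute bound is an arbitrary illustrative value, not a recommendation.

```groovy
// Sketch: bound total run time and always clean the workspace.
pipeline {
    agent none
    options {
        // Abort the run if it exceeds this bound, for example while stuck on a lost agent.
        timeout(time: 45, unit: 'MINUTES')
    }
    stages {
        stage('Build') {
            agent { label 'linux && docker' }
            steps {
                sh 'make build'
            }
            post {
                // Runs on the stage's agent whether the build passed or failed.
                always {
                    cleanWs()
                }
            }
        }
    }
}
```

Cleaning in post keeps static agents tidy between builds, but it does not cover workspaces left behind by aborted or orphaned runs, so a periodic purge is still worth having.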

Sizing Agents Correctly

Request only what your build actually needs, but set limits to prevent noisy-neighbor issues. Profile your builds during a representative run: kubectl top pod gives point-in-time readings (it requires metrics-server), so sample it repeatedly or pull the series from your monitoring system. Then set requests near the p50 measured value and limits near the p99. This reduces the risk of both under-provisioning (OOMKilled) and over-provisioning (wasted cluster capacity).
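
The p50/p99 rule can be worked through with a few lines of plain Groovy. This is only an illustration: the memory samples are made-up numbers, and the nearest-rank percentile used here is one of several common definitions.

```groovy
// Nearest-rank percentile: the smallest sample such that at least p% of samples are <= it.
def percentile(List<Integer> samples, double p) {
    List<Integer> sorted = samples.sort(false)   // sort(false) returns a sorted copy
    int rank = Math.ceil(p / 100.0 * sorted.size()) as int
    return sorted[Math.max(rank, 1) - 1]
}

// Hypothetical memory readings in MiB, sampled during a representative build.
def memoryMiB = [612, 640, 655, 701, 720, 735, 790, 810, 905, 1480]

println "memory request (p50): ${percentile(memoryMiB, 50)}Mi"   // prints 720Mi
println "memory limit   (p99): ${percentile(memoryMiB, 99)}Mi"   // prints 1480Mi
```

With only ten samples the p99 is simply the maximum, which is why a longer measurement window gives a more trustworthy limit.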
