Every real script makes decisions. A backup script must abort when the target disk is full. A deploy script must refuse to run on the wrong branch. A health-check script must page on-call when a service returns a non-200 status. All of these hinge on one construct: the conditional. In this lesson you will master Bash conditionals deeply enough to write production-grade decision logic.
How Bash Evaluates Conditions
Bash conditionals are built on exit codes, not boolean values. Every command returns an integer between 0 and 255. Exit code 0 means success (true); anything else means failure (false). The if keyword simply runs a command and branches on its exit code.
# The simplest possible if: branch on any command's exit code
if ping -c1 -W1 8.8.8.8 >/dev/null 2>&1; then
    echo "Network is up"
else
    echo "Network is down"
fi
# Check the last command's exit code manually
grep -q "ERROR" /var/log/app.log
if [ $? -ne 0 ]; then
    echo "No errors found"
fi
# Idiomatic form: test inline, no $? needed
if grep -q "ERROR" /var/log/app.log; then
    echo "Errors detected, alerting on-call"
fi
The golden rule: always test exit codes, not output. Parsing command output is fragile; testing whether a command succeeded is robust. When you write if command; then, you are testing the command's exit code directly: no subshell, no variable, no quoting hazard.
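To make the rule concrete, here is a minimal sketch that contrasts the two styles against a throwaway log file. The helper name has_errors and the sample log lines are hypothetical, chosen only for illustration.

```bash
#!/usr/bin/env bash
# Sketch: branch on exit codes rather than on captured output.
# has_errors and the sample log contents are hypothetical.

# Succeeds (exit 0) when the file contains an ERROR line, fails otherwise.
has_errors() {
    grep -q "ERROR" "$1"
}

log=$(mktemp)
printf '%s\n' "INFO boot" "ERROR disk full" > "$log"

# Output-based: captures text in a subshell, then string-tests it.
if [[ -n "$(grep "ERROR" "$log")" ]]; then
    echo "output-based check: errors found"
fi

# Exit-code-based: if reads the command's status directly.
if has_errors "$log"; then
    echo "exit-code check: errors found"
fi

# When the numeric code itself matters, capture it with || so the
# failure is handled explicitly instead of being lost.
rc=0
grep -q "FATAL" "$log" || rc=$?
echo "grep exit code for FATAL: $rc"

rm -f "$log"
```

The cmd || rc=$? form is also safe in scripts that run under set -e, because a failure on the left side of || does not abort the script.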
[ ] versus [[ ]] — Know the Difference
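[ ] is a POSIX command: Bash implements it as a builtin, and it also exists as an external binary (typically /usr/bin/[). [[ ]] is a Bash keyword, so the shell parses its contents with special rules. In a Bash script, prefer [[ ]]. It is not available in a strictly POSIX /bin/sh, but for scripts with a #!/usr/bin/env bash shebang it is the safer choice for string and file tests.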
The practical differences matter in production:
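Word splitting: [ $var = "yes" ] breaks if $var is empty or contains spaces, because the unquoted expansion is word-split before [ sees it. [[ $var == "yes" ]] never word-splits the left-hand side, even without quotes.
Pattern matching: [[ $branch == release/* ]] matches glob patterns natively. The right-hand side must stay unquoted; quoting it forces a literal comparison.
Regex matching: [[ $line =~ ^[0-9]+$ ]] supports POSIX extended regular expressions (ERE). The regex must NOT be quoted, because quoted parts are matched as literal strings.
Logical operators: [[ $a == "x" && $b == "y" ]] uses && and || inside the brackets. With [ ] the equivalents are -a and -o, which POSIX marks as obsolescent because they parse ambiguously; the portable alternative is to chain separate tests, as in [ "$a" = x ] && [ "$b" = y ].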
#!/usr/bin/env bash
set -euo pipefail
branch="release/2.4.1"
version="42"
filename="report 2024.csv"
# --- String tests with [[ ]] ---
# Equality and inequality
[[ "$branch" == "main" ]] && echo "on main"
[[ "$branch" != "main" ]] && echo "not on main"
# Glob matching — no quotes around the pattern
[[ "$branch" == release/* ]] && echo "release branch detected"
# Regex matching — regex MUST NOT be quoted
[[ "$version" =~ ^[0-9]+$ ]] && echo "version is numeric"
# Empty / non-empty string
[[ -z "$branch" ]] && echo "branch is empty"
[[ -n "$branch" ]] && echo "branch is set"
# Safe even with spaces — [[ ]] never word-splits
[[ "$filename" == *.csv ]] && echo "CSV file detected"
# --- Logical combinators ---
[[ "$branch" == release/* && -n "$version" ]] && echo "release with version"
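Two details from the list above are easy to get wrong in practice: quoting the right-hand side turns a glob into a literal, and a successful =~ match exposes its capture groups in the BASH_REMATCH array. The sketch below illustrates both; parse_release is a hypothetical helper, not part of the lesson's script.

```bash
#!/usr/bin/env bash
# Sketch: how quoting changes [[ ]] matching, plus regex capture groups.
# parse_release is a hypothetical helper for illustration.

branch="release/2.4.1"

# Unquoted right-hand side: treated as a glob pattern.
[[ "$branch" == release/* ]] && echo "glob match"

# Quoted right-hand side: compared literally, so this does not match.
[[ "$branch" == "release/*" ]] || echo "no literal match"

# Keeping the regex in a variable avoids quoting surprises, and
# BASH_REMATCH holds the capture groups after a successful match.
parse_release() {
    local re='^release/([0-9]+)\.([0-9]+)\.([0-9]+)$'
    if [[ "$1" =~ $re ]]; then
        echo "major=${BASH_REMATCH[1]} minor=${BASH_REMATCH[2]} patch=${BASH_REMATCH[3]}"
    else
        return 1
    fi
}

parse_release "$branch"
```

Storing the pattern in a variable and expanding it unquoted ($re) is a common way to keep backslashes and parentheses readable.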
File Tests
DevOps scripts constantly interrogate the filesystem — checking whether a config file exists before reading it, whether a directory is writable before deploying into it, whether a socket is present before sending traffic. Bash provides a full suite of file-test operators:
CONFIG="/etc/app/config.yaml"
LOG_DIR="/var/log/app"
SOCKET="/run/gunicorn.sock"
BINARY="/usr/local/bin/deploy-tool"
# Existence tests
[[ -e "$CONFIG" ]] && echo "exists (file or dir)"
[[ -f "$CONFIG" ]] && echo "is a regular file"
[[ -d "$LOG_DIR" ]] && echo "is a directory"
[[ -L "$CONFIG" ]] && echo "is a symlink"
[[ -S "$SOCKET" ]] && echo "is a socket"
# Permission tests
[[ -r "$CONFIG" ]] && echo "readable"
[[ -w "$LOG_DIR" ]] && echo "writable"
[[ -x "$BINARY" ]] && echo "executable"
# Size test
[[ -s "$CONFIG" ]] && echo "non-empty file"
# Practical guard: abort if config is missing
if [[ ! -f "$CONFIG" ]]; then
    echo "ERROR: config not found at $CONFIG" >&2
    exit 1
fi
# Practical guard: create log dir if absent
if [[ ! -d "$LOG_DIR" ]]; then
    mkdir -p "$LOG_DIR"
    chmod 750 "$LOG_DIR"
fi
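The individual file tests above combine naturally into a single pre-flight guard. The following is a minimal sketch under assumed requirements (a readable, non-empty config and a writable log directory); the function name preflight and its arguments are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: a reusable pre-flight guard built from file tests.
# preflight and its two arguments are hypothetical names.

# Returns 0 only if the config is a readable, non-empty regular file
# and the log directory exists and is writable.
preflight() {
    local config=$1 log_dir=$2
    if [[ ! -f "$config" ]]; then
        echo "ERROR: $config is not a regular file" >&2
        return 1
    fi
    if [[ ! -r "$config" || ! -s "$config" ]]; then
        echo "ERROR: $config is unreadable or empty" >&2
        return 1
    fi
    if [[ ! -d "$log_dir" || ! -w "$log_dir" ]]; then
        echo "ERROR: $log_dir is missing or not writable" >&2
        return 1
    fi
}

# Demo against a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
echo "port: 8080" > "$workdir/config.yaml"

if preflight "$workdir/config.yaml" "$workdir"; then
    echo "preflight passed"
fi

rm -rf "$workdir"
```

Because the function returns instead of exiting, the caller stays free to decide whether a failed pre-flight should abort the script or trigger a fallback.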
Numeric Comparisons
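String operators compare text, not numbers: == and != test string (or pattern) equality, and < and > inside [[ ]] sort lexicographically, so [[ 9 < 10 ]] is false. For integers, use the arithmetic operators -eq, -ne, -lt, -le, -gt, -ge. Alternatively, use an arithmetic context (( )), which evaluates its contents as integer arithmetic and returns exit code 0 when the result is non-zero (true) and 1 when it is zero (false).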
disk_usage=87
max_allowed=80
http_status=503
retry_count=3
# With [[ ]] and arithmetic flags
if [[ "$disk_usage" -gt "$max_allowed" ]]; then
    echo "ALERT: disk usage ${disk_usage}% exceeds threshold" >&2
fi
# With (( )): arithmetic context, cleaner for math
if (( http_status >= 500 )); then
    echo "Server-side error: $http_status"
fi
if (( retry_count == 0 )); then
    echo "No retries left, failing deployment"
    exit 1
fi
# Compound: disk and inodes both OK
used_inodes=$(df -i /var | awk 'NR==2{print $5}' | tr -d '%')
if (( disk_usage < 90 && used_inodes < 90 )); then
    echo "Disk health OK"
fi
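Use (( )) for arithmetic conditions. It is cleaner, supports C-style operators (++, %, &&, ||) as well as ** for exponentiation, and avoids the visual noise of -gt/-lt. It handles integers only. Be aware that (( 0 )) returns exit code 1 (falsy). That is useful when counting retries down to zero, but it also means a bare statement such as (( count = 0 )) reports failure and will abort a script running under set -e.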
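The exit-status behaviour of (( )) is easiest to see in a retry countdown. This is a small sketch, assuming a stand-in operation that succeeds on its third call; try_once, MAX_RETRIES, and the minute example are all hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: using the exit status of (( )) to drive a retry countdown.
# try_once and MAX_RETRIES are hypothetical stand-ins.

MAX_RETRIES=3
attempts=0

# Stand-in for a flaky operation: succeeds on the third call.
try_once() {
    attempts=$(( attempts + 1 ))
    (( attempts >= 3 ))
}

retries_left=$MAX_RETRIES
# (( retries_left )) is true while the counter is non-zero.
while (( retries_left )); do
    if try_once; then
        echo "succeeded after $attempts attempt(s)"
        break
    fi
    retries_left=$(( retries_left - 1 ))
done

(( retries_left == 0 )) && echo "no retries left"

# Numbers with a leading zero are read as octal, so "08" is an error
# unless base 10 is forced with the 10# prefix.
minute="08"
if (( 10#$minute < 30 )); then
    echo "first half of the hour"
fi
```

The counters are updated with $(( )) assignments rather than a bare (( attempts++ )), because the post-increment form evaluates to the old value and therefore returns status 1 when that value is 0.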
The case Statement
When a single variable drives multiple branches, a chain of elif blocks is harder to read and maintain than a case statement. case supports glob patterns, alternation with |, and a catch-all *). It is the idiomatic way to dispatch on environment names, log levels, HTTP methods, or CLI subcommands.
#!/usr/bin/env bash
set -euo pipefail
ENV="${1:-}" # first argument, empty if not provided
HTTP_METHOD="POST"
# --- Dispatch on environment name ---
case "$ENV" in
    production|prod)
        REPLICAS=10
        LOG_LEVEL="warn"
        ;;
    staging|stage)
        REPLICAS=2
        LOG_LEVEL="info"
        ;;
    development|dev|"")
        REPLICAS=1
        LOG_LEVEL="debug"
        ;;
    *)
        echo "ERROR: unknown environment '$ENV'" >&2
        echo "Usage: $0 {production|staging|development}" >&2
        exit 1
        ;;
esac
echo "Deploying with $REPLICAS replicas, log level=$LOG_LEVEL"
# --- Dispatch on HTTP method (useful in webhook handlers) ---
case "$HTTP_METHOD" in
    GET|HEAD)       echo "Read-only request" ;;
    POST|PUT|PATCH) echo "Mutating request: validate body" ;;
    DELETE)         echo "Destructive request: require confirmation" ;;
    *)              echo "Unknown method: $HTTP_METHOD" >&2; exit 1 ;;
esac
Always include a *) catch-all. A case statement with no catch-all silently does nothing when the value does not match — a common source of bugs where a typo in the environment name causes a deploy to run with no configuration at all. Make the catch-all error loudly and exit non-zero.
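Because case patterns are globs, a single arm can cover a whole class of values, with the catch-all rejecting everything else. The sketch below classifies HTTP status codes this way; classify_status is a hypothetical helper for illustration.

```bash
#!/usr/bin/env bash
# Sketch: case patterns are globs, so one arm can cover a whole class.
# classify_status is a hypothetical helper.

classify_status() {
    case "$1" in
        2[0-9][0-9]) echo "success" ;;
        3[0-9][0-9]) echo "redirect" ;;
        4[0-9][0-9]) echo "client-error" ;;
        5[0-9][0-9]) echo "server-error" ;;
        *)
            # Loud catch-all: report on stderr and fail.
            echo "ERROR: not an HTTP status: '$1'" >&2
            return 1
            ;;
    esac
}

for code in 200 301 404 503; do
    echo "$code -> $(classify_status "$code")"
done
```

A pattern must match the entire value, so an input such as 2000 or abc falls through to the catch-all rather than matching a shorter prefix.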
Diagram: Conditional Decision Flow in a Deploy Script
A deploy script chains multiple conditional guards before reaching the environment dispatch.
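The guard-then-dispatch flow can be sketched in code as follows. The diagram itself is not reproduced here, so the specific guards (release branch, non-empty config) and the plan_deploy name are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Sketch of a guard-then-dispatch flow. The guards and the plan_deploy
# name are assumptions, not taken from the original diagram.

plan_deploy() {
    local branch=$1 config=$2 env=$3

    # Guard 1: only release branches may deploy.
    if [[ "$branch" != release/* ]]; then
        echo "ERROR: refusing to deploy from '$branch'" >&2
        return 1
    fi

    # Guard 2: config must exist and be non-empty.
    if [[ ! -s "$config" ]]; then
        echo "ERROR: config missing or empty: $config" >&2
        return 1
    fi

    # Dispatch: reached only when every guard has passed.
    case "$env" in
        production|prod) echo "replicas=10" ;;
        staging|stage)   echo "replicas=2" ;;
        development|dev) echo "replicas=1" ;;
        *)
            echo "ERROR: unknown environment '$env'" >&2
            return 1
            ;;
    esac
}

# Demo with a throwaway config file.
config=$(mktemp)
echo "port: 8080" > "$config"
plan_deploy "release/2.4.1" "$config" staging
rm -f "$config"
```

Each guard returns early, so the dispatch at the bottom never runs with an unvalidated branch or a missing config.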
Combining Conditions: Real Patterns
Production scripts rarely test a single condition. Here is a realistic health-check function that combines file tests, numeric comparisons, and string tests in the pattern you will see in large-scale infrastructure code:
#!/usr/bin/env bash
set -euo pipefail
APP_PID_FILE="/run/app/app.pid"
APP_URL="http://localhost:8080/healthz"
MAX_RESPONSE_MS=500
check_health() {
    local pid status elapsed_s elapsed
    # 1. PID file must exist and be non-empty
    if [[ ! -s "$APP_PID_FILE" ]]; then
        echo "CRITICAL: PID file missing or empty" >&2
        return 1
    fi
    pid=$(cat "$APP_PID_FILE")
    # 2. Process must actually be running
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "CRITICAL: process $pid is not running" >&2
        return 1
    fi
    # 3. HTTP health endpoint must return 200
    # curl reports time_total in seconds (e.g. 0.042817); it has no millisecond variable
    read -r status elapsed_s <<< "$(curl -o /dev/null -s \
        -w "%{http_code} %{time_total}" \
        --max-time 2 "$APP_URL")"
    # Convert to whole milliseconds so (( )) can compare integers
    elapsed=$(awk -v s="$elapsed_s" 'BEGIN { printf "%d", s * 1000 }')
    if [[ "$status" != "200" ]]; then
        echo "CRITICAL: healthz returned HTTP $status" >&2
        return 1
    fi
    # 4. Response must be fast enough
    if (( elapsed > MAX_RESPONSE_MS )); then
        echo "WARNING: healthz slow: ${elapsed}ms > ${MAX_RESPONSE_MS}ms" >&2
        return 1
    fi
    echo "OK: app healthy (HTTP $status, ${elapsed}ms)"
}
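# notify_oncall is assumed to be defined elsewhere (for example, a wrapper around your paging tool)
check_health || { notify_oncall "app health check failed"; exit 1; }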
Production pattern — guard functions, not scripts. Wrap each check in a function that returns a meaningful exit code. The caller decides whether to page on-call, retry, or hard-exit. This separation of concerns is how SRE teams write runbooks that can be tested independently.
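One way to give the caller that choice is to have the guard return distinct exit codes and let the caller map them to actions. The sketch below assumes a convention of 1 for warning and 2 for critical; check_disk, handle_disk, and the thresholds are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: the caller maps guard exit codes to actions.
# check_disk, handle_disk, the thresholds, and the code meanings
# (1 = warning, 2 = critical) are hypothetical conventions.

# Guard: reports state only through its exit code.
check_disk() {
    local used_pct=$1
    if (( used_pct >= 95 )); then
        return 2
    elif (( used_pct >= 80 )); then
        return 1
    fi
    return 0
}

# Caller: decides what each code means operationally.
handle_disk() {
    local rc=0
    check_disk "$1" || rc=$?
    case "$rc" in
        0) echo "ok" ;;
        1) echo "warn: log and retry later" ;;
        2) echo "critical: page on-call" ;;
        *) echo "unexpected exit code $rc" >&2; return 1 ;;
    esac
}

handle_disk 42
handle_disk 87
handle_disk 97
```

Since check_disk prints nothing and never exits, it can be exercised in isolation with fixed inputs, which is what makes this style easy to test.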