Shell Scripting & Automation

Variables, Quoting & Substitution

18 min Lesson 2 of 28

Quoting bugs are among the most common sources of broken shell scripts at every level of experience, from intern to principal. Even a senior SRE at a large tech company will grep their own scripts for unquoted variables before a production deploy. This lesson teaches you why quoting matters at a fundamental level, how the shell parses your input, and the exact patterns used in production-grade automation.

Variable Assignment and Scope

Assign a variable with NAME=value, with no spaces around the =. The shell is sensitive to whitespace here: NAME = value runs a command called NAME with the arguments = and value; it is not an assignment. Reference the value with $NAME or the safer ${NAME}. The brace form is required whenever the variable name is followed immediately by a letter, digit, or underscore, which would otherwise be parsed as part of the name.

# Assignment — no spaces around =
ENVIRONMENT="production"
REPLICAS=3
LOG_DIR="/var/log/myapp"

# Reference
echo "Deploying to: $ENVIRONMENT"
echo "Replicas: ${REPLICAS}"

# Brace form required here: without braces, "$LOG_DIR_archive" looks up a
# variable named LOG_DIR_archive, which is unset and expands to an empty string
ARCHIVE_DIR="${LOG_DIR}archive"   # resolves to /var/log/myapparchive (no separator, probably not intended)
ARCHIVE_DIR="${LOG_DIR}_archive"  # resolves to /var/log/myapp_archive (correct)

Scope rule: Variables are local to the current shell process. Subshells, such as ( ... ) and $( ... ), start as a copy of the current shell and see every variable, but separately executed child processes (scripts and programs you call) inherit only the variables you have exported. Use export VARNAME, or assign and export in one step: export VARNAME="value". Environment variables passed to Docker, CI systems, and process managers all follow this rule.
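The export rule can be checked with a short experiment. This is a minimal sketch; LOCAL_ONLY and SHARED are hypothetical names chosen for the demonstration, and it assumes neither is already present in your environment.

```shell
#!/usr/bin/env bash
# Hypothetical variable names, used only for this demonstration.
LOCAL_ONLY="not exported"
export SHARED="exported"

# A subshell is a copy of the current shell, so it sees both variables.
in_subshell=$(echo "${LOCAL_ONLY:-(unset)} / ${SHARED:-(unset)}")

# A separately executed program receives only the exported environment.
# Single quotes stop the parent shell from expanding the variables first.
in_child=$(bash -c 'echo "${LOCAL_ONLY:-(unset)} / ${SHARED:-(unset)}"')

echo "subshell: $in_subshell"   # subshell: not exported / exported
echo "child:    $in_child"      # child:    (unset) / exported
```

The same behavior explains why a variable set in your terminal is invisible to a script you launch from it until you export it.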

The Three Quoting Modes

The shell has three quoting modes, each with a different effect on how the content is interpreted. Misunderstanding this is where most production bugs originate.

[Figure: the three quoting modes and what each allows the shell to expand.]

The production rule is simple: always double-quote variable references unless you have a specific reason not to. File names on real servers can contain spaces, tabs, and even newlines. An unquoted $filename in an rm or cp command is word-split and glob-expanded, so a single name can turn into several arguments and match unintended targets, which is a disaster in production.

FILENAME="my report 2025.csv"

# BUG: word-splits into three arguments: my, report, 2025.csv
cp $FILENAME /backup/

# CORRECT: treats it as a single argument
cp "$FILENAME" /backup/

# Single quotes: $USER is not expanded; prints: Hello $USER
echo 'Hello $USER'

# Double quotes: $USER is expanded; prints, for example: Hello alice
echo "Hello $USER"

# Escaping a special character inside double quotes
echo "The cost is \$50"   # prints: The cost is $50
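One way to see what the shell actually passes to a command is to count the arguments. The sketch below uses a hypothetical helper, count_args, that prints how many arguments it received; it assumes the default IFS.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments it received.
count_args() { echo "$#"; }

FILENAME="my report 2025.csv"

unquoted=$(count_args $FILENAME)     # word splitting produces 3 arguments
quoted=$(count_args "$FILENAME")     # 1 argument, spaces preserved
single=$(count_args '$FILENAME')     # 1 argument: the literal text $FILENAME

echo "unquoted=$unquoted quoted=$quoted single=$single"
# unquoted=3 quoted=1 single=1
```

A helper like this is handy when debugging: wrap the suspicious expansion in it and compare the count with what you expected the command to receive.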

Command Substitution

Command substitution captures the standard output of a command and injects it into an expression. The modern form is $(command). The legacy form uses backticks; avoid it in new code because nesting requires escaping the inner backticks with backslashes, and the result is harder to read.

# Capture hostname and date for log file naming (a common pattern in log rotation scripts)
# HOST_SHORT is used because Bash already maintains its own HOSTNAME variable
HOST_SHORT=$(hostname -s)
DATESTAMP=$(date +%Y-%m-%d)
LOGFILE="/var/log/deploy-${HOST_SHORT}-${DATESTAMP}.log"

echo "Writing log to: $LOGFILE"

# Nested substitution — modern form handles this cleanly
KERNEL_MAJOR=$(echo "$(uname -r)" | cut -d. -f1)
echo "Kernel major version: $KERNEL_MAJOR"

# Count lines in a file
LINE_COUNT=$(wc -l < /etc/passwd)
echo "Users in passwd: $LINE_COUNT"

# Capture git branch in a deploy script
BRANCH=$(git rev-parse --abbrev-ref HEAD 2>/dev/null || echo "unknown")
echo "Deploying branch: $BRANCH"

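Always quote command substitution output. Use "$(command)", not $(command). The output of commands can contain spaces and newlines. A bare $(find . -name "*.log") is word-split, so every filename containing a space breaks into several arguments. Quoting prevents that split. The right-hand side of a plain assignment such as VAR=$(command) is the one place where no splitting happens, so the unquoted assignments above are safe, though quoting there is harmless.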
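The effect of quoting a command substitution can be measured the same way as for a variable. This sketch creates a scratch directory with one file whose name contains a space; count_args and the file name are hypothetical, and the expected counts assume the temporary directory path itself contains no whitespace.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of arguments it received.
count_args() { echo "$#"; }

# Scratch directory holding a single file with a space in its name.
workdir=$(mktemp -d)
touch "$workdir/app server.log"

# Bare substitution: the one path is split at the space into 2 words.
bare=$(count_args $(find "$workdir" -name '*.log'))

# Quoted substitution: the output stays a single argument.
quoted=$(count_args "$(find "$workdir" -name '*.log')")

# For loops over files, a quoted glob sidesteps the problem entirely.
glob_count=0
for f in "$workdir"/*.log; do
  [ -e "$f" ] || continue          # skip the unexpanded pattern if nothing matches
  glob_count=$(( glob_count + 1 ))
done

echo "bare=$bare quoted=$quoted glob=$glob_count"   # bare=2 quoted=1 glob=1
rm -rf -- "$workdir"
```

Note that quoting turns the whole output into one string. With several matching files, that string would contain all of the names, which is why the glob loop is the better fit for iterating.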

Arithmetic Expansion

The shell natively supports integer arithmetic via $(( expression )). Unlike expr (a common legacy pattern), it does not spawn an external process, and it follows standard operator precedence. Results are integers, so division truncates. For floating-point math, delegate to bc or python3.

# Integer arithmetic — no spaces required but they help readability
TOTAL_INSTANCES=6
INSTANCES_PER_AZ=2
AZ_COUNT=$(( TOTAL_INSTANCES / INSTANCES_PER_AZ ))
echo "Availability zones needed: $AZ_COUNT"   # 3

# Modulo: remainder of BATCH_SIZE divided by 3 (use % 2 to test whether a number is even)
BATCH_SIZE=10
REMAINDER=$(( BATCH_SIZE % 3 ))
echo "Remainder: $REMAINDER"   # 1

# Increment a counter (used in retry loops)
RETRY=0
RETRY=$(( RETRY + 1 ))
# Or using the shorthand form inside (( )); its exit status is non-zero when the expression evaluates to 0
(( RETRY++ ))
echo "Retry count: $RETRY"   # 2

# Floating-point via bc
FREE_PERCENT=$(echo "scale=2; 34 * 100 / 128" | bc)
echo "Free memory: ${FREE_PERCENT}%"   # 26.56%
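Because the arithmetic is integer-only, the order of operations matters when computing ratios, and the same syntax drives loop counters. The following is a small sketch under assumed values; USED_MB, TOTAL_MB, and MAX_RETRIES are hypothetical names.

```shell
#!/usr/bin/env bash
# Integer division truncates the fractional part.
HALF=$(( 7 / 2 ))
echo "7 / 2 = $HALF"                         # 7 / 2 = 3

# Multiply before dividing to keep precision in an integer percentage.
USED_MB=34
TOTAL_MB=128
PERCENT=$(( USED_MB * 100 / TOTAL_MB ))
echo "Used: ${PERCENT}%"                     # Used: 26%

# Retry counter with a doubling delay (exponential backoff schedule).
MAX_RETRIES=4
attempt=0
delay=1
schedule=""
while [ "$attempt" -lt "$MAX_RETRIES" ]; do
  schedule="$schedule $delay"
  delay=$(( delay * 2 ))
  attempt=$(( attempt + 1 ))
done
echo "Backoff schedule (seconds):$schedule"  # Backoff schedule (seconds): 1 2 4 8
```

Writing USED_MB / TOTAL_MB * 100 instead would yield 0, because the division truncates before the multiplication happens.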

Special Variables the Shell Sets for You

Several variables are pre-populated by the shell itself. Knowing them removes the need to pass redundant arguments into scripts and is expected knowledge at the SRE level.

$0 - the name of the script itself, useful in usage messages
$? - the exit code of the last command (0 = success)
$$ - the PID of the current shell, sometimes used to build temp file names (mktemp is the safer tool, because PIDs are predictable)
$! - the PID of the most recently started background process
$IFS - the Internal Field Separator used for word splitting (default: space, tab, newline)
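A few of these variables in action, as a minimal sketch. check_service is a hypothetical function that always fails with exit code 3, standing in for a real health check.

```shell
#!/usr/bin/env bash
# check_service is a hypothetical stand-in for a real check; it always fails with code 3.
check_service() { return 3; }

# Capture the exit code with $? immediately, before another command overwrites it.
rc=0
check_service || rc=$?
echo "check_service exited with: $rc"        # check_service exited with: 3

# Start a background job, remember its PID from $!, then wait for that job.
sleep 1 &
bg_pid=$!
wait "$bg_pid"
wait_status=$?
echo "background job $bg_pid finished with: $wait_status"

# $0 and $$ identify the running script and shell.
echo "script: $0, shell PID: $$"
```

The rc=0 followed by || rc=$? form keeps working under set -e, where a bare failing command would end the script before $? could be read.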

Production pitfall — modifying IFS. If your script temporarily changes IFS to parse CSV data or split on colons, always restore it afterward. A common pattern is to save the original: OLD_IFS="$IFS", change it, then restore with IFS="$OLD_IFS". Forgetting to restore IFS causes every subsequent word-splitting operation in the script to silently behave wrong — a very hard bug to trace.
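An alternative that avoids the save-and-restore step is to scope the IFS change to a single command by writing the assignment as a prefix to read. The sketch below parses a hypothetical passwd-style record; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical passwd-style record used as sample input.
line="deploy:x:1001:1001:Deploy user:/home/deploy:/bin/bash"

ifs_before=$IFS

# The IFS=: prefix applies only to this one read command.
# Fields beyond the third are collected in "rest".
IFS=: read -r user pw uid rest <<EOF
$line
EOF

echo "user=$user uid=$uid"                   # user=deploy uid=1001

# IFS in the surrounding script is untouched.
if [ "$IFS" = "$ifs_before" ]; then
  echo "IFS unchanged"
fi
```

When several commands need the modified separator, the OLD_IFS pattern described above still applies; the prefix form is simply the smallest possible scope.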

Variable Defaults and Defensive Patterns

Production scripts must tolerate missing or empty variables without causing data loss. Bash provides a set of parameter expansion operators for this. These patterns are standard in production automation because an unset variable in an rm -rf command, such as rm -rf "$TARGET_DIR"/* with TARGET_DIR empty, expands to a path you never intended to delete.

# ${VAR:-default} — use default if VAR is unset or empty
ENVIRONMENT="${ENVIRONMENT:-staging}"

# ${VAR:?error message} — abort with error if VAR is unset or empty
# This is the most important safety net in destructive scripts
TARGET_DIR="${TARGET_DIR:?TARGET_DIR must be set — aborting to prevent disaster}"

# ${VAR:+alternate} — use alternate only if VAR is set and non-empty
DEBUG_FLAG="${DEBUG:+--verbose}"   # empty string if DEBUG is unset
# Left unquoted on purpose: an empty DEBUG_FLAG must disappear, not become an empty argument
curl $DEBUG_FLAG https://api.example.com/health

# Uppercase transformation (Bash 4+)
SERVICE_NAME="my-service"
echo "${SERVICE_NAME^^}"    # MY-SERVICE

# Lowercase
REGION="US-EAST-1"
echo "${REGION,,}"          # us-east-1

# Length of a variable
CONFIG_PATH="/etc/myapp/config.yaml"
echo "Path length: ${#CONFIG_PATH}"   # 22
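To see the ${VAR:?message} guard stop a destructive command, the sketch below runs a deletion against a scratch directory. purge is a hypothetical helper, and its body is written as a subshell so that the abort ends only that call and the demonstration can continue.

```shell
#!/usr/bin/env bash
# Scratch area so the demonstration never touches real data.
workdir=$(mktemp -d)
mkdir "$workdir/old-release"

# purge is a hypothetical helper. Its body is a subshell, written with ( ),
# so a failed :? check ends only the subshell, not the calling script.
purge() (
  rm -rf -- "${TARGET_DIR:?TARGET_DIR must be set}"
)

# With TARGET_DIR unset, the guard aborts before rm runs.
unset TARGET_DIR
if purge 2>/dev/null; then first="ran"; else first="refused"; fi

# With TARGET_DIR set, the deletion proceeds.
TARGET_DIR="$workdir/old-release"
if purge; then second="ran"; else second="refused"; fi

echo "unset: $first, set: $second"           # unset: refused, set: ran
[ -e "$workdir/old-release" ] || echo "old-release removed"
rm -rf -- "$workdir"
```

In a real script the guard usually sits at the top level without a subshell, so that a missing variable stops the whole run.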

The combination of ${VAR:?message} at the top of a destructive script and set -euo pipefail (covered in Lesson 8) forms the safety net that distinguishes production-grade scripts from ad-hoc one-liners. Internalize both before you write any automation that touches data.