Cloud Fundamentals: AWS Core Services

S3 in Depth

Amazon S3 (Simple Storage Service) is the object-storage backbone of AWS. Every serious production system touches it: static assets, pipeline artifacts, database backups, ML training datasets, audit logs, and Terraform state all live in S3. The API surface looks deceptively simple — PUT, GET, DELETE — but operating S3 correctly at scale requires understanding its consistency model, its cost levers, and the operational hazards that have caused production outages at real companies. This lesson goes deep on the concepts that separate casual users from engineers who design reliable, cost-optimal storage tiers.

Buckets: The Naming and Namespace Rules That Bite Teams

A bucket is a flat-namespace container for objects. Every object has a key (essentially a file path, though S3 has no real directories). A few properties that catch engineers off guard:

Global namespace — bucket names are globally unique across all AWS accounts. my-company-prod-backups is taken by the first AWS account that creates it. Use account-ID or organization prefixes to avoid collisions: acme-123456789012-prod-backups.
Region-bound — despite the global namespace, a bucket physically lives in one region. Data does not leave that region unless you configure Cross-Region Replication (CRR). This matters for GDPR and data-residency compliance.
Strong consistency (since 2020) — S3 now delivers strong read-after-write consistency for PUTs and DELETEs, and for list operations. The old eventual-consistency edge cases (stale GETs after an overwrite) are gone. You no longer need retry loops around object reads to wait out stale data.
Max object size — 5 TB per object, and a single PUT is capped at 5 GB. Anything over 100 MB should use the Multipart Upload API, which uploads parts in parallel for higher throughput and lets you retry a failed part instead of the whole object.
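
As a rough illustration of how multipart uploads are sized, the sketch below picks a part size and part count for an object. The 5 MiB minimum part size and the 10,000-part limit are S3's documented multipart limits. The function name plan_parts and the 64 MiB default are hypothetical choices for this example.

```python
import math

MIB = 1024 ** 2
MIN_PART_SIZE = 5 * MIB   # S3 minimum for every part except the last
MAX_PARTS = 10_000        # S3 maximum number of parts per multipart upload


def plan_parts(object_size: int, preferred_part_size: int = 64 * MIB) -> tuple[int, int]:
    """Return (part_size, part_count) for a multipart upload of object_size bytes."""
    if object_size <= 0:
        raise ValueError("object_size must be positive")
    # Grow the part size if the preferred size would need more than MAX_PARTS parts.
    smallest_allowed = math.ceil(object_size / MAX_PARTS)
    part_size = max(preferred_part_size, MIN_PART_SIZE, smallest_allowed)
    part_count = math.ceil(object_size / part_size)
    return part_size, part_count


if __name__ == "__main__":
    for size in (200 * MIB, 5 * 1024 ** 4):  # a 200 MiB object and a 5 TiB object
        part_size, part_count = plan_parts(size)
        print(f"{size} bytes -> {part_count} parts of {part_size} bytes")
```

In practice the AWS SDKs and the aws s3 cp command make this decision for you. The sketch only shows why very large objects force larger parts.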

S3 is not a file system and should not be treated as one. Listing objects under a prefix with ListObjectsV2 costs one API call per 1,000 keys. At 10 million objects, that is 10,000 API calls just to enumerate the "directory". Design your key namespace to avoid brute-force listings in hot paths.
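
The listing arithmetic above can be sketched in a few lines. The helper names and the service/day key layout are hypothetical. The point is only that a narrower prefix means fewer ListObjectsV2 pages to walk.

```python
import math

MAX_KEYS_PER_CALL = 1_000  # ListObjectsV2 returns at most 1,000 keys per response


def list_calls_needed(object_count: int) -> int:
    """Sequential ListObjectsV2 calls needed to enumerate object_count keys."""
    # Even an empty listing costs one call.
    return max(1, math.ceil(object_count / MAX_KEYS_PER_CALL))


def dated_key(service: str, day: str, name: str) -> str:
    """Build a key whose prefix narrows a listing to one service and one day."""
    return f"{service}/{day}/{name}"


if __name__ == "__main__":
    total = 10_000_000
    print("whole bucket:", list_calls_needed(total), "calls")
    # Assumption for illustration: objects are spread evenly over 365 daily prefixes.
    per_day = math.ceil(total / 365)
    print("one day prefix:", list_calls_needed(per_day), "calls")
    print("example key:", dated_key("ci", "2024-10-01", "build-1234.tar.gz"))
```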

Storage Classes: Matching Cost to Access Pattern

S3 offers several storage classes. The seven covered here each make a different trade-off between storage cost, retrieval cost, and retrieval latency. Choosing the wrong class is a common cause of unnecessary S3 spend.

Figure: S3 storage classes grouped by cost tier. Lifecycle rules automate transitions as objects age and access frequency decreases.

Key classes to know cold:

S3 Standard — 11 nines of durability, stored across three or more AZs, millisecond retrieval. Use for any object accessed more than once a month.
S3 Standard-IA — same durability and AZ redundancy, roughly 46% cheaper storage, but a per-GB retrieval fee applies. Use for objects accessed less than once a month that still need millisecond access. Minimum storage duration charge: 30 days.
S3 One Zone-IA — stored in a single AZ only; 20% cheaper than Standard-IA. Acceptable only for reproducible data (thumbnails, transcoded video) where AZ loss means regeneration, not permanent data loss.
S3 Intelligent-Tiering — monitors access patterns per object and moves objects between access tiers automatically, with no retrieval fees. It charges a small per-object monitoring fee, and objects smaller than 128 KB are never auto-tiered, so it pays off mainly for larger objects with unknown or variable access patterns.
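S3 Glacier Instant Retrieval — for data accessed about once a quarter or less. Millisecond retrieval, with storage up to 68% cheaper than Standard-IA (and cheaper still compared with Standard), in exchange for higher retrieval fees and a 90-day minimum storage duration. Ideal for compliance archives that regulators rarely pull.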
S3 Glacier Flexible Retrieval — retrieval in minutes to hours. Use for disaster-recovery vaults where you can tolerate a wait.
S3 Glacier Deep Archive — the cheapest class at roughly $0.00099/GB/month, with standard retrieval within 12 hours. Use it for regulatory retention of immutable records, such as 7-year audit logs. Treat it as a write-once, read-almost-never tape replacement in cloud form.

For teams running large CI systems, one of the highest-impact S3 cost optimizations is moving CI/CD pipeline artifacts and build caches to Standard-IA after 30 days, then to a Glacier class after 90 days. Lifecycle rules do this automatically with no ongoing manual work, although each transition incurs a small per-request charge. A large monorepo CI system can generate hundreds of terabytes of artifacts per year, and proper tiering can save tens of thousands of dollars annually.
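
To make the savings claim concrete, here is a back-of-the-envelope sketch. The per-GB prices are assumed list prices, not figures from this lesson, so check current regional pricing. The 100 TB volume and the constant daily write rate are also assumptions. The model ignores retrieval fees, transition request fees, and minimum storage durations.

```python
# Assumed prices in USD per GB-month; real prices vary by region and change over time.
ASSUMED_PRICE = {"STANDARD": 0.023, "STANDARD_IA": 0.0125, "GLACIER_IR": 0.004}


def tier_fractions(ia_after: int = 30, archive_after: int = 90, expire_after: int = 365) -> dict:
    """Share of retained data in each class at steady state, assuming a constant daily write rate."""
    return {
        "STANDARD": ia_after / expire_after,
        "STANDARD_IA": (archive_after - ia_after) / expire_after,
        "GLACIER_IR": (expire_after - archive_after) / expire_after,
    }


def monthly_storage_cost(total_gb: float, fractions: dict) -> float:
    """Storage-only monthly cost for total_gb split across classes by the given fractions."""
    return sum(total_gb * share * ASSUMED_PRICE[cls] for cls, share in fractions.items())


if __name__ == "__main__":
    total_gb = 100 * 1024  # hypothetical 100 TB of retained artifacts
    flat = monthly_storage_cost(total_gb, {"STANDARD": 1.0})
    tiered = monthly_storage_cost(total_gb, tier_fractions())
    print(f"all Standard: ${flat:,.2f}/month")
    print(f"tiered:       ${tiered:,.2f}/month")
    print(f"difference:   ${(flat - tiered) * 12:,.2f}/year")
```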

Lifecycle Rules: Automated Cost Management

A lifecycle rule is part of a lifecycle configuration attached to a bucket (expressed as JSON in the AWS CLI). Each rule transitions objects between storage classes or expires (deletes) them based on age, and a filter on prefix, tags, or object size scopes the rule to a subset of the bucket. This is the primary mechanism for automated cost management at scale. Every bucket that accumulates objects over time should have lifecycle rules.

aws s3api put-bucket-lifecycle-configuration \
  --bucket acme-prod-ci-artifacts \
  --lifecycle-configuration '{
    "Rules": [
      {
        "ID": "artifact-tiering",
        "Status": "Enabled",
        "Filter": { "Prefix": "builds/" },
        "Transitions": [
          { "Days": 30, "StorageClass": "STANDARD_IA" },
          { "Days": 90, "StorageClass": "GLACIER_IR" }
        ],
        "Expiration": { "Days": 365 }
      },
      {
        "ID": "expire-incomplete-multipart",
        "Status": "Enabled",
        "Filter": { "Prefix": "" },
        "AbortIncompleteMultipartUpload": {
          "DaysAfterInitiation": 7
        }
      }
    ]
  }'

The second rule above — AbortIncompleteMultipartUpload — is the one most teams forget. Multipart uploads that fail partway through leave orphaned parts in S3 that you are billed for. This rule cleans them up automatically after 7 days. It belongs on every bucket.
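
If you manage many buckets, it can help to generate the configuration instead of hand-editing JSON. The sketch below rebuilds the same two rules from parameters and checks that the day thresholds are ordered sensibly. The function name build_lifecycle and its parameters are hypothetical. The output dictionary follows the same structure as the CLI example above.

```python
import json


def build_lifecycle(prefix: str, ia_days: int, archive_days: int, expire_days: int,
                    abort_multipart_days: int = 7) -> dict:
    """Build a lifecycle configuration with one tiering rule and one multipart cleanup rule."""
    if not 0 < ia_days < archive_days < expire_days:
        raise ValueError("expected 0 < ia_days < archive_days < expire_days")
    return {
        "Rules": [
            {
                "ID": "artifact-tiering",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": archive_days, "StorageClass": "GLACIER_IR"},
                ],
                "Expiration": {"Days": expire_days},
            },
            {
                "ID": "expire-incomplete-multipart",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix applies the rule to the whole bucket
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": abort_multipart_days},
            },
        ]
    }


if __name__ == "__main__":
    config = build_lifecycle("builds/", ia_days=30, archive_days=90, expire_days=365)
    # The JSON can be saved to a file and passed to the CLI
    # with --lifecycle-configuration file://lifecycle.json.
    print(json.dumps(config, indent=2))
```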

Versioning: Point-in-Time Recovery and the Delete Marker Trap

Enabling versioning on a bucket causes S3 to retain every version of every object rather than overwriting it. This provides accidental-deletion protection and point-in-time recovery. Once enabled, versioning can be suspended but never fully disabled — plan before enabling on a bucket.

Treat versioning as mandatory for any bucket used as a Terraform state backend: without it, a corrupted state push permanently overwrites the last good state. Treat it as mandatory too for buckets that store production configuration files or secrets backups.

The most common versioning pitfall is delete markers. When you delete an object in a versioned bucket without specifying a version ID, S3 inserts a delete marker rather than removing the object. The object appears deleted to normal GET requests, but all prior versions (and the marker itself) still exist and accumulate storage costs. To actually delete all versions, you must list versions explicitly and delete each one. Lifecycle rules can automate this: set NoncurrentVersionExpiration to expire old versions after N days, and ExpiredObjectDeleteMarker: true to clean up orphaned delete markers.

# List all versions and delete markers for a key
aws s3api list-object-versions \
  --bucket acme-prod-config \
  --prefix "app/settings.json"

# Permanently delete a specific version
aws s3api delete-object \
  --bucket acme-prod-config \
  --key "app/settings.json" \
  --version-id "3sL4kqtJlcpXroDTDmJ+rmSpXd3dIbrHY+MTRCxf3vjVBH40Nr8X8gdRQBpUMLUo"

# Lifecycle rule to expire noncurrent versions after 90 days
# Add to your lifecycle configuration:
# {
#   "ID": "noncurrent-expiry",
#   "Status": "Enabled",
#   "Filter": { "Prefix": "" },
#   "NoncurrentVersionExpiration": { "NoncurrentDays": 90 },
#   "Expiration": { "ExpiredObjectDeleteMarker": true }
# }
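
When you need a one-off cleanup rather than a lifecycle rule, the manual "list versions, then delete each one" step can be sketched as below. The function works on a dictionary shaped like a list-object-versions response and returns a payload in the shape that the DeleteObjects API expects. It makes no AWS calls itself. The function name and the sample version IDs are hypothetical. It uses LastModified as a stand-in for age, which is only an approximation of how lifecycle rules count noncurrent days.

```python
from datetime import datetime, timedelta, timezone


def noncurrent_delete_batch(listing: dict, older_than: timedelta, now: datetime) -> dict:
    """Select noncurrent versions last modified before (now - older_than)."""
    cutoff = now - older_than
    objects = [
        {"Key": version["Key"], "VersionId": version["VersionId"]}
        for version in listing.get("Versions", [])
        if not version["IsLatest"] and version["LastModified"] < cutoff
    ]
    # DeleteObjects accepts at most 1,000 keys per request, so callers
    # should split larger batches before sending them.
    return {"Objects": objects, "Quiet": True}


if __name__ == "__main__":
    now = datetime(2024, 12, 31, tzinfo=timezone.utc)
    sample = {
        "Versions": [
            {"Key": "app/settings.json", "VersionId": "v3", "IsLatest": True,
             "LastModified": datetime(2024, 12, 1, tzinfo=timezone.utc)},
            {"Key": "app/settings.json", "VersionId": "v2", "IsLatest": False,
             "LastModified": datetime(2024, 11, 1, tzinfo=timezone.utc)},
            {"Key": "app/settings.json", "VersionId": "v1", "IsLatest": False,
             "LastModified": datetime(2024, 6, 1, tzinfo=timezone.utc)},
        ]
    }
    print(noncurrent_delete_batch(sample, timedelta(days=90), now))
```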

Presigned URLs: Secure Temporary Access Without Credentials

A presigned URL is a time-limited, HMAC-signed URL that grants a bearer access to a private S3 object without requiring AWS credentials. The signature is computed by an IAM principal that has the necessary S3 permissions, and the URL encodes the expiry time. When a client uses the URL, S3 validates the signature and the expiry — if either check fails, S3 returns 403 Forbidden.

This is the canonical pattern for serving private content (user-uploaded documents, generated reports, download links) without proxying bytes through your application tier. Your app generates the URL on the backend and hands it to the client. The client downloads directly from S3. You save egress bandwidth on your application servers and reduce latency.

# Generate a presigned GET URL (valid for 15 minutes = 900 seconds)
aws s3 presign s3://acme-prod-user-uploads/reports/invoice-2024-q4.pdf \
  --expires-in 900

# Output: a long URL the client can use directly in a browser or curl
# https://acme-prod-user-uploads.s3.us-east-1.amazonaws.com/reports/invoice-2024-q4.pdf
#   ?X-Amz-Algorithm=AWS4-HMAC-SHA256
#   &X-Amz-Credential=...
#   &X-Amz-Date=...
#   &X-Amz-Expires=900
#   &X-Amz-Signature=...

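# Note: `aws s3 presign` only produces GET URLs. For a presigned PUT URL
# (direct client uploads that avoid routing file data through your API), use an AWS SDK.

# In production, generate presigned URLs with the AWS SDK in your application.
# Python (boto3) example:
import boto3

s3_client = boto3.client("s3")

# Presigned GET URL, valid for 15 minutes
get_url = s3_client.generate_presigned_url(
    ClientMethod="get_object",
    Params={"Bucket": "acme-prod-user-uploads", "Key": "reports/invoice.pdf"},
    ExpiresIn=900,
)

# Presigned PUT URL, valid for 5 minutes
put_url = s3_client.generate_presigned_url(
    ClientMethod="put_object",
    Params={"Bucket": "acme-prod-user-uploads", "Key": "avatars/user-42.jpg"},
    ExpiresIn=300,
)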

Presigned URLs can be generated by any IAM principal: a role, a user, or an IAM Identity Center session. There is no way to revoke a single presigned URL, but S3 evaluates the signer's credentials and permissions at request time, not at signing time. If you need immediate revocation (for compliance or incident response), deactivate the signing access key, remove or deny the signer's S3 permissions, or add an explicit Deny to the bucket policy. S3 Block Public Access does not help here, because a presigned request is authenticated as the signer rather than treated as public access. For STS-derived credentials (role assumptions), the URL stops working when the underlying session expires or at the URL's own expiry, whichever comes first.
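
The two checks S3 performs on a presigned URL (signature and expiry) can be illustrated with a deliberately simplified model. This is not AWS Signature Version 4 and it will not produce URLs that S3 accepts. The query parameter names, the message layout, and both function names are assumptions made for this toy example.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def _signature(secret: bytes, base_url: str, expires_at: int) -> str:
    # The signature covers both the URL and the expiry, so neither can be altered.
    message = f"GET\n{base_url}\n{expires_at}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(base_url: str, secret: bytes, expires_in: int, now: Optional[int] = None) -> str:
    """Append an expiry timestamp and an HMAC signature to base_url."""
    issued = int(time.time()) if now is None else now
    expires_at = issued + expires_in
    query = urlencode({"expires": expires_at, "signature": _signature(secret, base_url, expires_at)})
    return f"{base_url}?{query}"


def verify_url(url: str, secret: bytes, now: Optional[int] = None) -> bool:
    """Return True only if the signature matches and the URL has not expired."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires_at = int(query["expires"][0])
        signature = query["signature"][0]
    except (KeyError, ValueError):
        return False
    base_url = f"{parsed.scheme}://{parsed.netloc}{parsed.path}"
    current = int(time.time()) if now is None else now
    expected = _signature(secret, base_url, expires_at)
    return hmac.compare_digest(signature, expected) and current < expires_at


if __name__ == "__main__":
    url = sign_url("https://example.com/reports/invoice.pdf", b"demo-secret", expires_in=900)
    print(url)
    print("valid now:", verify_url(url, b"demo-secret"))
```

Because the expiry is part of the signed message, a client cannot extend the lifetime by editing the query string. The same holds for the real X-Amz-Expires parameter.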

Production Failure Modes to Know

S3 is highly reliable but not infallible. These failure modes have caused real outages:

Bucket policy locking out all principals — an overly broad Deny statement in a bucket policy can lock every IAM user and role out of the bucket, including the administrators who wrote it. Recovery requires the account root user, who can still delete the bucket policy. Always test bucket policies with the IAM Policy Simulator before applying them to production buckets, and use AWS Config rules to detect overly permissive or broken policies.
S3 as a single point of failure for application boot — if your EC2 user-data or container startup pulls a config file from S3 and S3 is experiencing elevated error rates (it happens rarely but it does happen), every instance restart in your ASG fails. Cache config at AMI bake time or use local fallbacks.
Hot key partitions — S3 scales request capacity per prefix (AWS documents at least 3,500 writes and 5,500 reads per second per partitioned prefix), but it takes time to partition a new prefix. If you suddenly hit a previously cold prefix with thousands of requests per second, you will see elevated 503 Slow Down responses until S3 catches up. Retry with exponential backoff, and for workloads with a high initial burst, spread keys across prefixes by prepending a short hash: sha256(id)[:4]/real-key.
Cross-region data transfer costs — EC2 in us-east-1 reading from S3 in eu-west-1 pays cross-region egress charges. Always verify that your compute and storage are in the same region for high-throughput workloads.
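
The sha256(id)[:4]/real-key pattern from the hot-key item can be sketched as a small helper. The function name spread_key and the sample identifiers are hypothetical. The only technique shown is prepending a short, deterministic hash so that a burst of writes lands on many prefixes instead of one.

```python
import hashlib


def spread_key(object_id: str, real_key: str, prefix_len: int = 4) -> str:
    """Prefix real_key with the first prefix_len hex characters of sha256(object_id)."""
    digest = hashlib.sha256(object_id.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_len]}/{real_key}"


if __name__ == "__main__":
    for user_id in ("user-41", "user-42", "user-43"):
        print(spread_key(user_id, f"avatars/{user_id}.jpg"))
```

The hash is deterministic, so a reader who knows the ID can recompute the full key without a lookup table. The trade-off is that listing by the original prefix (avatars/ here) no longer works, so keep the pattern for workloads that read by ID.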

Enable S3 Server Access Logging or CloudTrail data events on every production bucket from day one, and add S3 Event Notifications (to SQS or Lambda) where you need to react to changes as they happen. Debugging "who deleted that file last week?" without access logs is extremely difficult. Logs are cheap; missing audit trails are expensive.